Server Admin Log/Archive 86
2024-10-31
- 23:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T376905)', diff saved to https://phabricator.wikimedia.org/P70827 and previous config saved to /var/cache/conftool/dbconfig/20241031-234959-ladsgroup.json
- 23:41 urbanecm: Run extensions/Flow/maintenance/FlowMoveBoardsToSubpages.php for several wikis (T376749; wiki list is on task)
- 23:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2167 (T376905)', diff saved to https://phabricator.wikimedia.org/P70809 and previous config saved to /var/cache/conftool/dbconfig/20241031-234030-ladsgroup.json
- 23:40 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 23:40 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 23:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T376905)', diff saved to https://phabricator.wikimedia.org/P70808 and previous config saved to /var/cache/conftool/dbconfig/20241031-234003-ladsgroup.json
- 23:37 swfrench@deploy2002: Finished scap sync-world: Deployment to clear noop chart diff from 1085491 - T372604 T377040 (duration: 01m 49s)
- 23:35 swfrench@deploy2002: Started scap sync-world: Deployment to clear noop chart diff from 1085491 - T372604 T377040
- 23:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P70807 and previous config saved to /var/cache/conftool/dbconfig/20241031-232456-ladsgroup.json
- 23:15 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-presto1019.eqiad.wmnet']
- 23:13 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['an-presto1019.eqiad.wmnet']
- 23:12 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-presto1019.eqiad.wmnet']
- 23:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P70806 and previous config saved to /var/cache/conftool/dbconfig/20241031-230949-ladsgroup.json
- 22:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T376905)', diff saved to https://phabricator.wikimedia.org/P70805 and previous config saved to /var/cache/conftool/dbconfig/20241031-225442-ladsgroup.json
- 22:48 dancy@deploy2002: Finished scap sync-world: Backport for Dummy commit for testing (duration: 07m 28s)
- 22:46 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-presto1019.eqiad.wmnet with OS bullseye
- 22:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T376905)', diff saved to https://phabricator.wikimedia.org/P70804 and previous config saved to /var/cache/conftool/dbconfig/20241031-224513-ladsgroup.json
- 22:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
- 22:44 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
- 22:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T376905)', diff saved to https://phabricator.wikimedia.org/P70803 and previous config saved to /var/cache/conftool/dbconfig/20241031-224446-ladsgroup.json
- 22:43 dancy@deploy2002: dancy: Continuing with sync
- 22:43 dancy@deploy2002: dancy: Backport for Dummy commit for testing synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 22:40 dancy@deploy2002: Started scap sync-world: Backport for Dummy commit for testing
- 22:30 dancy@deploy2002: Installation of scap version "4.119.4" completed for 1 hosts
- 22:29 dancy@deploy2002: Installing scap version "4.119.4" for 1 hosts
- 22:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P70802 and previous config saved to /var/cache/conftool/dbconfig/20241031-222939-ladsgroup.json
- 22:21 bking@cumin2002: START - Cookbook sre.hosts.reimage for host an-presto1019.eqiad.wmnet with OS bullseye
- 22:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P70801 and previous config saved to /var/cache/conftool/dbconfig/20241031-221432-ladsgroup.json
- 21:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T376905)', diff saved to https://phabricator.wikimedia.org/P70800 and previous config saved to /var/cache/conftool/dbconfig/20241031-215925-ladsgroup.json
- 21:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T376905)', diff saved to https://phabricator.wikimedia.org/P70799 and previous config saved to /var/cache/conftool/dbconfig/20241031-215056-ladsgroup.json
- 21:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 21:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 21:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
- 21:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
- 21:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T376905)', diff saved to https://phabricator.wikimedia.org/P70798 and previous config saved to /var/cache/conftool/dbconfig/20241031-215025-ladsgroup.json
- 21:50 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host an-presto1019.eqiad.wmnet with OS bullseye
- 21:40 bking@cumin2002: START - Cookbook sre.hosts.reimage for host an-presto1019.eqiad.wmnet with OS bullseye
- 21:40 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['an-presto1019.eqiad.wmnet']
- 21:37 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-presto1019.eqiad.wmnet']
- 21:37 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['an-presto1019.eqiad.wmnet']
- 21:35 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-presto1019.eqiad.wmnet']
- 21:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P70797 and previous config saved to /var/cache/conftool/dbconfig/20241031-213518-ladsgroup.json
- 21:35 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['an-presto1019.eqiad.wmnet']
- 21:22 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-presto1019.eqiad.wmnet']
- 21:22 urandom: Bootstrapping Cassandra/aqs1022-b — T378725
- 21:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P70796 and previous config saved to /var/cache/conftool/dbconfig/20241031-212011-ladsgroup.json
- 21:19 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-presto1019.eqiad.wmnet with OS bullseye
- 21:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host an-presto1019.eqiad.wmnet with OS bullseye
- 21:18 dancy@deploy2002: Installing scap version "4.119.3" for 210 hosts
- 21:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T376905)', diff saved to https://phabricator.wikimedia.org/P70795 and previous config saved to /var/cache/conftool/dbconfig/20241031-210504-ladsgroup.json
- 20:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T376905)', diff saved to https://phabricator.wikimedia.org/P70794 and previous config saved to /var/cache/conftool/dbconfig/20241031-205631-ladsgroup.json
- 20:56 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
- 20:56 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
- 20:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T376905)', diff saved to https://phabricator.wikimedia.org/P70793 and previous config saved to /var/cache/conftool/dbconfig/20241031-205604-ladsgroup.json
- 20:55 jsn@deploy2002: Finished scap sync-world: Backport for Translations for configuration for same-user-same-page reverts in Automoderator (T370795), Add follow-up message (T372476) (duration: 27m 10s)
- 20:46 jsn@deploy2002: jsn: Continuing with sync
- 20:46 jsn@deploy2002: jsn: Backport for Translations for configuration for same-user-same-page reverts in Automoderator (T370795), Add follow-up message (T372476) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P70792 and previous config saved to /var/cache/conftool/dbconfig/20241031-204057-ladsgroup.json
- 20:28 jsn@deploy2002: Started scap sync-world: Backport for Translations for configuration for same-user-same-page reverts in Automoderator (T370795), Add follow-up message (T372476)
- 20:25 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
- 20:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P70791 and previous config saved to /var/cache/conftool/dbconfig/20241031-202549-ladsgroup.json
- 20:25 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
- 20:23 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 20:22 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 20:15 dancy@deploy2002: Finished scap sync-world: Backport for tcywikisource: fix typo of author namespace (T378555) (duration: 07m 46s)
- 20:10 dancy@deploy2002: dancy, anzx: Continuing with sync
- 20:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T376905)', diff saved to https://phabricator.wikimedia.org/P70790 and previous config saved to /var/cache/conftool/dbconfig/20241031-201042-ladsgroup.json
- 20:10 dancy@deploy2002: dancy, anzx: Backport for tcywikisource: fix typo of author namespace (T378555) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:07 dancy@deploy2002: Started scap sync-world: Backport for tcywikisource: fix typo of author namespace (T378555)
- 20:03 dancy@deploy2002: Installation of scap version "4.119.2" completed for 210 hosts
- 20:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T376905)', diff saved to https://phabricator.wikimedia.org/P70789 and previous config saved to /var/cache/conftool/dbconfig/20241031-200214-ladsgroup.json
- 20:02 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
- 20:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
- 20:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T376905)', diff saved to https://phabricator.wikimedia.org/P70788 and previous config saved to /var/cache/conftool/dbconfig/20241031-200148-ladsgroup.json
- 19:58 dancy@deploy2002: Installing scap version "4.119.2" for 210 hosts
- 19:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P70787 and previous config saved to /var/cache/conftool/dbconfig/20241031-194640-ladsgroup.json
- 19:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P70786 and previous config saved to /var/cache/conftool/dbconfig/20241031-193133-ladsgroup.json
- 19:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T376905)', diff saved to https://phabricator.wikimedia.org/P70785 and previous config saved to /var/cache/conftool/dbconfig/20241031-191626-ladsgroup.json
- 19:15 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.44.0-wmf.1 refs T375660
- 19:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T376905)', diff saved to https://phabricator.wikimedia.org/P70784 and previous config saved to /var/cache/conftool/dbconfig/20241031-190648-ladsgroup.json
- 19:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 19:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 19:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T376905)', diff saved to https://phabricator.wikimedia.org/P70783 and previous config saved to /var/cache/conftool/dbconfig/20241031-190622-ladsgroup.json
- 19:06 swfrench@deploy2002: Finished scap sync-world: Backport for TimedMediaHandler: revert commonswiki changes due to capacity issues (duration: 07m 38s)
- 19:01 swfrench@deploy2002: swfrench, hnowlan: Continuing with sync
- 19:01 swfrench@deploy2002: swfrench, hnowlan: Backport for TimedMediaHandler: revert commonswiki changes due to capacity issues synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 18:58 swfrench@deploy2002: Started scap sync-world: Backport for TimedMediaHandler: revert commonswiki changes due to capacity issues
- 18:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P70782 and previous config saved to /var/cache/conftool/dbconfig/20241031-185115-ladsgroup.json
- 18:47 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 18:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P70781 and previous config saved to /var/cache/conftool/dbconfig/20241031-183608-ladsgroup.json
- 18:26 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 18:26 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 18:24 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 18:23 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 18:23 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 18:23 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 18:22 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 18:22 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 18:22 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 18:21 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 18:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T376905)', diff saved to https://phabricator.wikimedia.org/P70780 and previous config saved to /var/cache/conftool/dbconfig/20241031-182101-ladsgroup.json
- 18:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T376905)', diff saved to https://phabricator.wikimedia.org/P70779 and previous config saved to /var/cache/conftool/dbconfig/20241031-181225-ladsgroup.json
- 18:13 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
- 18:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
- 18:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T376905)', diff saved to https://phabricator.wikimedia.org/P70778 and previous config saved to /var/cache/conftool/dbconfig/20241031-181158-ladsgroup.json
- 18:05 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 17:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P70777 and previous config saved to /var/cache/conftool/dbconfig/20241031-175651-ladsgroup.json
- 17:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P70776 and previous config saved to /var/cache/conftool/dbconfig/20241031-174144-ladsgroup.json
- 17:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T376905)', diff saved to https://phabricator.wikimedia.org/P70775 and previous config saved to /var/cache/conftool/dbconfig/20241031-172637-ladsgroup.json
- 17:26 volans: uploaded spicerack_8.15.2 to apt.wikimedia.org bullseye-wikimedia
- 17:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T376905)', diff saved to https://phabricator.wikimedia.org/P70774 and previous config saved to /var/cache/conftool/dbconfig/20241031-171824-ladsgroup.json
- 17:18 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 17:18 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 17:16 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 17:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 17:13 swfrench@deploy2002: Finished scap sync-world: Deployment to pick up PHP version parameterization - T372604 T377040 (duration: 01m 52s)
- 17:11 swfrench@deploy2002: Started scap sync-world: Deployment to pick up PHP version parameterization - T372604 T377040
- 17:01 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-presto1020.eqiad.wmnet with OS bullseye
- 17:00 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1020.eqiad.wmnet with OS bullseye
- 16:57 Emperor: set mgr mgr/prometheus/scrape_interval 15.0 in both apus clusters
- 16:56 urandom: Bootstrapping Cassandra/aqs1022-a — T378725
- 16:52 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1022.eqiad.wmnet with reason: Bootstrapping — T378725
- 16:52 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on aqs1022.eqiad.wmnet with reason: Bootstrapping — T378725
- 16:45 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-presto1016.eqiad.wmnet with OS bullseye
- 16:37 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1016.eqiad.wmnet with OS bullseye
- 16:27 taavi@deploy2002: Finished scap sync-world: Backport for Drop 'nonglobal' dblist (duration: 08m 44s)
- 16:23 taavi@deploy2002: taavi: Continuing with sync
- 16:21 taavi@deploy2002: taavi: Backport for Drop 'nonglobal' dblist synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 16:19 taavi@deploy2002: Started scap sync-world: Backport for Drop 'nonglobal' dblist
- 16:16 taavi@deploy2002: Finished scap sync-world: Backport for Drop labtestwiki config (T378260) (duration: 09m 39s)
- 16:12 taavi@deploy2002: taavi: Continuing with sync
- 16:09 taavi@deploy2002: taavi: Backport for Drop labtestwiki config (T378260) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 16:07 eevans@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:07 eevans@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for Cassandra — aqs1022 - eevans@cumin1002"
- 16:07 taavi@deploy2002: Started scap sync-world: Backport for Drop labtestwiki config (T378260)
- 16:06 ryankemper: [archiva] Freed up space on `archiva1002.wikimedia.org` like so: `sudo rm -rfv /var/cache/archiva/temp* && sudo systemctl restart archiva`. We're down to 31% usage now
- 16:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 100%: post T378267 reclone', diff saved to https://phabricator.wikimedia.org/P70772 and previous config saved to /var/cache/conftool/dbconfig/20241031-160542-arnaudb.json
- 16:04 dancy@deploy2002: scap failed: <CalledProcessError> Command '['sudo', '-u', 'mwbuilder', '-n', '--', '/home/dancy/src/venvs/scap/bin/scap', 'mwshell', '--no-local-config', '--directory', '/srv/mediawiki-staging', '--user', 'www-data', '--', 'rm -f /srv/mediawiki-staging/php-1.43.0-wmf.28/cache/l10n/*.tmp.*']' returned non-zero exit status 1. (scap version: 4.118.0) (duration: 00m 01s)
- 16:04 dancy@deploy2002: Started scap sync-world: Backport for Drop labtestwiki config (T378260)
- 16:03 eevans@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for Cassandra — aqs1022 - eevans@cumin1002"
- 15:59 eevans@cumin1002: START - Cookbook sre.dns.netbox
- 15:55 samtar@deploy2002: Finished scap sync-world: Backport for [CommunityRequests] disable wgCommunityRequestsEnable by default (T366194) (duration: 07m 51s)
- 15:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 75%: post T378267 reclone', diff saved to https://phabricator.wikimedia.org/P70770 and previous config saved to /var/cache/conftool/dbconfig/20241031-155037-arnaudb.json
- 15:50 samtar@deploy2002: samtar, musikanimal: Continuing with sync
- 15:49 samtar@deploy2002: samtar, musikanimal: Backport for [CommunityRequests] disable wgCommunityRequestsEnable by default (T366194) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 15:47 samtar@deploy2002: Started scap sync-world: Backport for [CommunityRequests] disable wgCommunityRequestsEnable by default (T366194)
- 15:45 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2190']
- 15:44 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2190']
- 15:35 eevans@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 15:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 50%: post T378267 reclone', diff saved to https://phabricator.wikimedia.org/P70769 and previous config saved to /var/cache/conftool/dbconfig/20241031-153531-arnaudb.json
- 15:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: post maintenance', diff saved to https://phabricator.wikimedia.org/P70768 and previous config saved to /var/cache/conftool/dbconfig/20241031-152220-arnaudb.json
- 15:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 25%: post T378267 reclone', diff saved to https://phabricator.wikimedia.org/P70767 and previous config saved to /var/cache/conftool/dbconfig/20241031-152026-arnaudb.json
- 15:15 eevans@cumin1002: START - Cookbook sre.dns.netbox
- 15:08 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 15:08 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 15:07 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Add tooltips to expressions - oblivian@cumin1002"
- 15:07 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Add tooltips to expressions - oblivian@cumin1002
- 15:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: post maintenance', diff saved to https://phabricator.wikimedia.org/P70766 and previous config saved to /var/cache/conftool/dbconfig/20241031-150714-arnaudb.json
- 15:06 oblivian@cumin1002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Add tooltips to expressions - oblivian@cumin1002
- 15:06 oblivian@cumin1002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Add tooltips to expressions - oblivian@cumin1002"
- 15:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 10%: post T378267 reclone', diff saved to https://phabricator.wikimedia.org/P70765 and previous config saved to /var/cache/conftool/dbconfig/20241031-150521-arnaudb.json
- 15:00 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 14:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 14:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: post maintenance', diff saved to https://phabricator.wikimedia.org/P70764 and previous config saved to /var/cache/conftool/dbconfig/20241031-145209-arnaudb.json
- 14:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 5%: post T378267 reclone', diff saved to https://phabricator.wikimedia.org/P70763 and previous config saved to /var/cache/conftool/dbconfig/20241031-145015-arnaudb.json
- 14:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 100%: post db1234.eqiad.wmnet clone', diff saved to https://phabricator.wikimedia.org/P70762 and previous config saved to /var/cache/conftool/dbconfig/20241031-144902-arnaudb.json
- 14:37 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on db2190.codfw.wmnet with reason: host has hardware issues T378628
- 14:37 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on db2190.codfw.wmnet with reason: host has hardware issues T378628
- 14:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: post maintenance', diff saved to https://phabricator.wikimedia.org/P70761 and previous config saved to /var/cache/conftool/dbconfig/20241031-143704-arnaudb.json
- 14:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 4%: post T378267 reclone', diff saved to https://phabricator.wikimedia.org/P70760 and previous config saved to /var/cache/conftool/dbconfig/20241031-143510-arnaudb.json
- 14:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 75%: post db1234.eqiad.wmnet clone', diff saved to https://phabricator.wikimedia.org/P70759 and previous config saved to /var/cache/conftool/dbconfig/20241031-143356-arnaudb.json
- 14:24 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database tcywikisource (T378469)
- 14:23 fnegri@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database tcywikisource (T378469)
- 14:22 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database tcywiktionary (T378462)
- 14:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: post maintenance', diff saved to https://phabricator.wikimedia.org/P70758 and previous config saved to /var/cache/conftool/dbconfig/20241031-142158-arnaudb.json
- 14:21 fnegri@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database tcywiktionary (T378462)
- 14:21 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database ibawiki (T376571)
- 14:21 fnegri@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database ibawiki (T376571)
- 14:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 2%: post T378267 reclone', diff saved to https://phabricator.wikimedia.org/P70757 and previous config saved to /var/cache/conftool/dbconfig/20241031-142004-arnaudb.json
- 14:19 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database bclwikisource (T377087)
- 14:19 fnegri@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database bclwikisource (T377087)
- 14:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 50%: post db1234.eqiad.wmnet clone', diff saved to https://phabricator.wikimedia.org/P70756 and previous config saved to /var/cache/conftool/dbconfig/20241031-141851-arnaudb.json
- 14:14 sergi0: Running `foreachwiki userOptions.php --delete --old=sectionlevelimages growthexperiments-homepage-variant` (T375753)
- 14:11 sergi0: eswiki, arwiki, cswiki, frwiki running `mwscript userOptions.php --wiki=frwiki --delete-defaults growthexperiments-homepage-variant` (T374664)
- 14:06 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 5%: post maintenance', diff saved to https://phabricator.wikimedia.org/P70755 and previous config saved to /var/cache/conftool/dbconfig/20241031-140653-arnaudb.json
- 14:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 1%: post T378267 reclone', diff saved to https://phabricator.wikimedia.org/P70754 and previous config saved to /var/cache/conftool/dbconfig/20241031-140459-arnaudb.json
- 14:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 25%: post db1234.eqiad.wmnet clone', diff saved to https://phabricator.wikimedia.org/P70753 and previous config saved to /var/cache/conftool/dbconfig/20241031-140345-arnaudb.json
- 13:50 urbanecm@deploy2002: Finished scap sync-world: Backport for tcywikisource: add logo (T378555) (duration: 08m 56s)
- 13:46 urbanecm@deploy2002: urbanecm, anzx: Continuing with sync
- 13:44 urbanecm@deploy2002: urbanecm, anzx: Backport for tcywikisource: add logo (T378555) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:41 urbanecm@deploy2002: Started scap sync-world: Backport for tcywikisource: add logo (T378555)
- 13:38 urbanecm@deploy2002: Finished scap sync-world: Backport for Set username in user mock and reset state after test (T378573), Fix and re-enable selenium test (T378581), Fix selenium test loading the wrong talk page, HomepageHooks: do not store assigned variant on account creation (T377713), [[gerrit:1085347|SpecialHomepage: show community update
- 13:34 urbanecm@deploy2002: hnowlan, sgimeno, urbanecm: Continuing with sync
- 13:30 urbanecm@deploy2002: hnowlan, sgimeno, urbanecm: Backport for Set username in user mock and reset state after test (T378573), Fix and re-enable selenium test (T378581), Fix selenium test loading the wrong talk page, HomepageHooks: do not store assigned variant on account creation (T377713), [[gerrit:1085347|SpecialHomepage: show community upda
- 13:28 urbanecm@deploy2002: Started scap sync-world: Backport for Set username in user mock and reset state after test (T378573), Fix and re-enable selenium test (T378581), Fix selenium test loading the wrong talk page, HomepageHooks: do not store assigned variant on account creation (T377713), [[gerrit:1085347|SpecialHomepage: show community update
- 13:25 urbanecm@deploy2002: Finished scap sync-world: Backport for tcywikisource: Add namespaces, SITENAME and timezone (T378555), tcywiktionary: add SITENAME and timezone (T378556), tcywiktionary: add logo (T378556) (duration: 09m 39s)
- 13:20 urbanecm@deploy2002: anzx, urbanecm: Continuing with sync
- 13:19 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync
- 13:18 urbanecm@deploy2002: anzx, urbanecm: Backport for tcywikisource: Add namespaces, SITENAME and timezone (T378555), tcywiktionary: add SITENAME and timezone (T378556), tcywiktionary: add logo (T378556) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:18 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync
- 13:15 urbanecm@deploy2002: Started scap sync-world: Backport for tcywikisource: Add namespaces, SITENAME and timezone (T378555), tcywiktionary: add SITENAME and timezone (T378556), tcywiktionary: add logo (T378556)
- 13:14 urbanecm@deploy2002: Finished scap sync-world: Backport for TimedMediaHandler: use shellbox globally (T357309), Remove RunSingleJobStdin script (T369048) (duration: 09m 43s)
- 13:09 urbanecm@deploy2002: urbanecm, hnowlan: Continuing with sync
- 13:08 urbanecm@deploy2002: urbanecm, hnowlan: Backport for TimedMediaHandler: use shellbox globally (T357309), Remove RunSingleJobStdin script (T369048) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:04 urbanecm@deploy2002: Started scap sync-world: Backport for TimedMediaHandler: use shellbox globally (T357309), Remove RunSingleJobStdin script (T369048)
- 12:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1237 (T376905)', diff saved to https://phabricator.wikimedia.org/P70752 and previous config saved to /var/cache/conftool/dbconfig/20241031-122719-ladsgroup.json
- 12:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1237', diff saved to https://phabricator.wikimedia.org/P70751 and previous config saved to /var/cache/conftool/dbconfig/20241031-121212-ladsgroup.json
- 12:06 elukey@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1002.eqiad.wmnet
- 12:06 elukey@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1002.eqiad.wmnet
- 12:01 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database annwiki (T377118)
- 12:01 fnegri@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database annwiki (T377118)
- 12:01 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database tddwiki (T375016)
- 12:00 fnegri@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database tddwiki (T375016)
- 12:00 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database rskwiki (T375016)
- 11:59 fnegri@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database rskwiki (T375016)
- 11:59 fnegri@cumin1002: END (ERROR) - Cookbook sre.wikireplicas.add-wiki (exit_code=97) for database rskwiki (T375016)
- 11:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1237', diff saved to https://phabricator.wikimedia.org/P70750 and previous config saved to /var/cache/conftool/dbconfig/20241031-115705-ladsgroup.json
- 11:54 fnegri@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database rskwiki (T375016)
- 11:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1232.eqiad.wmnet onto db1234.eqiad.wmnet
- 11:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1237 (T376905)', diff saved to https://phabricator.wikimedia.org/P70747 and previous config saved to /var/cache/conftool/dbconfig/20241031-114158-ladsgroup.json
- 11:38 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1002.eqiad.wmnet with OS bookworm
- 11:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1237 (T376905)', diff saved to https://phabricator.wikimedia.org/P70746 and previous config saved to /var/cache/conftool/dbconfig/20241031-113456-ladsgroup.json
- 11:34 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1237.eqiad.wmnet with reason: Maintenance
- 11:34 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1237.eqiad.wmnet with reason: Maintenance
- 11:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 11:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 11:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T376905)', diff saved to https://phabricator.wikimedia.org/P70744 and previous config saved to /var/cache/conftool/dbconfig/20241031-112924-ladsgroup.json
- 11:26 fabfur: reverted previous action (T378578)
- 11:20 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1002.eqiad.wmnet with reason: host reimage
- 11:17 fabfur: install haproxykafka on cp4037 and cp3066 (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1085308) (T378578)
- 11:17 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1002.eqiad.wmnet with reason: host reimage
- 11:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P70743 and previous config saved to /var/cache/conftool/dbconfig/20241031-111417-ladsgroup.json
- 11:02 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1002.eqiad.wmnet with OS bookworm
- 11:01 elukey@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1002.eqiad.wmnet
- 11:00 elukey@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1002.eqiad.wmnet
- 10:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P70742 and previous config saved to /var/cache/conftool/dbconfig/20241031-105910-ladsgroup.json
- 10:58 elukey@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1002.eqiad.wmnet
- 10:58 elukey@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1002.eqiad.wmnet
- 10:56 elukey@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1003.eqiad.wmnet
- 10:56 elukey@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1003.eqiad.wmnet
- 10:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[1232,1234].eqiad.wmnet with reason: hosts in cloning, avoiding alerts
- 10:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db[1232,1234].eqiad.wmnet with reason: hosts in cloning, avoiding alerts
- 10:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T376905)', diff saved to https://phabricator.wikimedia.org/P70741 and previous config saved to /var/cache/conftool/dbconfig/20241031-104404-ladsgroup.json
- 10:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T376905)', diff saved to https://phabricator.wikimedia.org/P70740 and previous config saved to /var/cache/conftool/dbconfig/20241031-103406-ladsgroup.json
- 10:34 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
- 10:33 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
- 10:28 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 10:28 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 10:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T376905)', diff saved to https://phabricator.wikimedia.org/P70739 and previous config saved to /var/cache/conftool/dbconfig/20241031-102835-ladsgroup.json
- 10:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P70738 and previous config saved to /var/cache/conftool/dbconfig/20241031-101328-ladsgroup.json
- 10:06 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-ctrl1003.eqiad.wmnet with OS bookworm
- 10:04 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db1232.eqiad.wmnet onto db1234.eqiad.wmnet
- 10:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db1232 in db1234 for T378267', diff saved to https://phabricator.wikimedia.org/P70737 and previous config saved to /var/cache/conftool/dbconfig/20241031-100301-arnaudb.json
- 09:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P70736 and previous config saved to /var/cache/conftool/dbconfig/20241031-095821-ladsgroup.json
- 09:49 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-ctrl1003.eqiad.wmnet with reason: host reimage
- 09:47 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-ctrl1003.eqiad.wmnet with reason: host reimage
- 09:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T376905)', diff saved to https://phabricator.wikimedia.org/P70735 and previous config saved to /var/cache/conftool/dbconfig/20241031-094314-ladsgroup.json
- 09:35 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host aux-k8s-ctrl1003.eqiad.wmnet with OS bookworm
- 09:35 elukey@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1003.eqiad.wmnet
- 09:35 elukey@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1003.eqiad.wmnet
- 09:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1179 (T376905)', diff saved to https://phabricator.wikimedia.org/P70734 and previous config saved to /var/cache/conftool/dbconfig/20241031-093446-ladsgroup.json
- 09:34 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 09:34 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 09:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1003.eqiad.wmnet
- 09:32 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-ctrl1003.eqiad.wmnet
- 09:07 fabfur: importing haproxykafka 0.3 package into apt repository (T377613)
- 08:23 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-presto1016.eqiad.wmnet with OS bullseye
- 08:23 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1016.eqiad.wmnet with OS bullseye
- 08:21 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1019.eqiad.wmnet with OS bullseye
- 08:13 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 56258
- 08:12 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 56258
- 08:01 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-presto1020.eqiad.wmnet with OS bullseye
- 04:54 eileen: civicrm upgraded from 0eb881ca to 31f5cbdb
- 01:45 krinkle@deploy2002: Finished deploy [integration/docroot@0b03488]: (no justification provided) (duration: 00m 10s)
- 01:45 krinkle@deploy2002: Started deploy [integration/docroot@0b03488]: (no justification provided)
- 01:42 Krinkle: krinkle@mwmaint2001$ Purge https://doc.wikimedia.org/lib/wmui-page.css via `mwscript extensions/WikimediaMaintenance/purgeUrls.php`, T257188 T378542
- 01:38 krinkle@deploy2002: Finished deploy [integration/docroot@a2c044c]: T378542 (duration: 00m 23s)
- 01:38 krinkle@deploy2002: Started deploy [integration/docroot@a2c044c]: T378542
- 00:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2215 (T376905)', diff saved to https://phabricator.wikimedia.org/P70733 and previous config saved to /var/cache/conftool/dbconfig/20241031-003014-ladsgroup.json
- 00:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P70732 and previous config saved to /var/cache/conftool/dbconfig/20241031-001507-ladsgroup.json
- 00:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P70731 and previous config saved to /var/cache/conftool/dbconfig/20241031-000000-ladsgroup.json
2024-10-30
- 23:53 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2081.codfw.wmnet with OS bullseye
- 23:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2215 (T376905)', diff saved to https://phabricator.wikimedia.org/P70730 and previous config saved to /var/cache/conftool/dbconfig/20241030-234453-ladsgroup.json
- 23:44 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2083.codfw.wmnet with OS bullseye
- 22:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2215 (T376905)', diff saved to https://phabricator.wikimedia.org/P70729 and previous config saved to /var/cache/conftool/dbconfig/20241030-225520-ladsgroup.json
- 22:55 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
- 22:54 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
- 22:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2191 (T376905)', diff saved to https://phabricator.wikimedia.org/P70728 and previous config saved to /var/cache/conftool/dbconfig/20241030-225449-ladsgroup.json
- 22:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2191', diff saved to https://phabricator.wikimedia.org/P70727 and previous config saved to /var/cache/conftool/dbconfig/20241030-223942-ladsgroup.json
- 22:39 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2081.codfw.wmnet with OS bullseye
- 22:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2083.codfw.wmnet with OS bullseye
- 22:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2191', diff saved to https://phabricator.wikimedia.org/P70726 and previous config saved to /var/cache/conftool/dbconfig/20241030-222435-ladsgroup.json
- 22:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2191 (T376905)', diff saved to https://phabricator.wikimedia.org/P70725 and previous config saved to /var/cache/conftool/dbconfig/20241030-220928-ladsgroup.json
- 22:03 brett: Running ./redis-check-aof --fix on rdb1014 tcp_6379 instance - T376961
- 21:26 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Fix bug in BlockManager::getUniqueBlocks (T378563) (duration: 07m 22s)
- 21:21 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 21:21 dreamyjazz@deploy2002: dreamyjazz: Backport for Fix bug in BlockManager::getUniqueBlocks (T378563) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:18 dreamyjazz@deploy2002: Started scap sync-world: Backport for Fix bug in BlockManager::getUniqueBlocks (T378563)
- 21:17 tgr@deploy2002: Finished scap sync-world: Backport for GrowthExperiments: enable community updates module in pilot wikis (T374664) (duration: 10m 10s)
- 21:12 tgr@deploy2002: tgr, sgimeno: Continuing with sync
- 21:09 tgr@deploy2002: tgr, sgimeno: Backport for GrowthExperiments: enable community updates module in pilot wikis (T374664) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2191 (T376905)', diff saved to https://phabricator.wikimedia.org/P70724 and previous config saved to /var/cache/conftool/dbconfig/20241030-210902-ladsgroup.json
- 21:08 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2191.codfw.wmnet with reason: Maintenance
- 21:08 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2191.codfw.wmnet with reason: Maintenance
- 21:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2131 (T376905)', diff saved to https://phabricator.wikimedia.org/P70723 and previous config saved to /var/cache/conftool/dbconfig/20241030-210836-ladsgroup.json
- 21:07 tgr@deploy2002: Started scap sync-world: Backport for GrowthExperiments: enable community updates module in pilot wikis (T374664)
- 21:01 tgr@deploy2002: Finished scap sync-world: Backport for Set username in user mock and reset state after test (T378573), Fix and re-enable selenium test (T378581), Fix selenium test loading the wrong talk page, build: Suppress phan issue with null for Message::numParams, [[gerrit:1084181|HomepageHooks: do not store assigned variant on account cr
- 20:57 tgr@deploy2002: sgimeno, umherirrender, tgr: Continuing with sync
- 20:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2131', diff saved to https://phabricator.wikimedia.org/P70722 and previous config saved to /var/cache/conftool/dbconfig/20241030-205329-ladsgroup.json
- 20:51 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:51 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:45 tgr@deploy2002: sgimeno, umherirrender, tgr: Backport for Set username in user mock and reset state after test (T378573), Fix and re-enable selenium test (T378581), Fix selenium test loading the wrong talk page, build: Suppress phan issue with null for Message::numParams, [[gerrit:1084181|HomepageHooks: do not store assigned variant on account
- 20:43 tgr@deploy2002: Started scap sync-world: Backport for Set username in user mock and reset state after test (T378573), Fix and re-enable selenium test (T378581), Fix selenium test loading the wrong talk page, build: Suppress phan issue with null for Message::numParams, [[gerrit:1084181|HomepageHooks: do not store assigned variant on account cre
- 20:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2131', diff saved to https://phabricator.wikimedia.org/P70721 and previous config saved to /var/cache/conftool/dbconfig/20241030-203822-ladsgroup.json
- 20:24 tgr@deploy2002: Finished scap sync-world: Backport for Set Flow to read-only on nowiki (T377990) (duration: 13m 21s)
- 20:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2131 (T376905)', diff saved to https://phabricator.wikimedia.org/P70720 and previous config saved to /var/cache/conftool/dbconfig/20241030-202315-ladsgroup.json
- 20:20 tgr@deploy2002: esanders, tgr: Continuing with sync
- 20:16 tgr@deploy2002: esanders, tgr: Backport for Set Flow to read-only on nowiki (T377990) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2131 (T376905)', diff saved to https://phabricator.wikimedia.org/P70719 and previous config saved to /var/cache/conftool/dbconfig/20241030-201331-ladsgroup.json
- 20:13 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2131.codfw.wmnet with reason: Maintenance
- 20:13 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2131.codfw.wmnet with reason: Maintenance
- 20:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2115 (T376905)', diff saved to https://phabricator.wikimedia.org/P70718 and previous config saved to /var/cache/conftool/dbconfig/20241030-201305-ladsgroup.json
- 20:11 tgr@deploy2002: Started scap sync-world: Backport for Set Flow to read-only on nowiki (T377990)
- 19:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2115', diff saved to https://phabricator.wikimedia.org/P70717 and previous config saved to /var/cache/conftool/dbconfig/20241030-195758-ladsgroup.json
- 19:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2115', diff saved to https://phabricator.wikimedia.org/P70716 and previous config saved to /var/cache/conftool/dbconfig/20241030-194251-ladsgroup.json
- 19:40 swfrench-wmf: all shellbox instances updated to shellbox 2024-10-15-214239 - T375243
- 19:39 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 19:39 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 19:37 mutante: gitlab - deleting user "jfk" on main server and both replicas T376936
- 19:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 19:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 19:36 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
- 19:35 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
- 19:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
- 19:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
- 19:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 19:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 19:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2115 (T376905)', diff saved to https://phabricator.wikimedia.org/P70715 and previous config saved to /var/cache/conftool/dbconfig/20241030-192744-ladsgroup.json
- 19:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2115 (T376905)', diff saved to https://phabricator.wikimedia.org/P70714 and previous config saved to /var/cache/conftool/dbconfig/20241030-192011-ladsgroup.json
- 19:20 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2115.codfw.wmnet with reason: Maintenance
- 19:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2115.codfw.wmnet with reason: Maintenance
- 19:17 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.1 refs T375660
- 18:40 dduvall@deploy2002: Finished scap sync-world: Backport for Revert "Use array instead of string for class list" (T378531) (duration: 19m 04s)
- 18:39 inflatador: bking@stat1008,stat1009,stat1010.mgmt racadm jobqueue delete -i $job T376813
- 18:36 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database nrwiki (T375101)
- 18:35 dduvall@deploy2002: ammarpad, dduvall: Continuing with sync
- 18:35 dduvall: error is still occurring following backport deployment of https://gerrit.wikimedia.org/r/c/mediawiki/skins/MinervaNeue/+/1084759 (T378531)
- 18:27 dduvall: monitoring testwiki error rates for a few minutes to see if the error related to T378531 subsides (current rate is 23 errors in the last 15 minutes)
- 18:23 dduvall@deploy2002: ammarpad, dduvall: Backport for Revert "Use array instead of string for class list" (T378531) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 18:21 dduvall@deploy2002: Started scap sync-world: Backport for Revert "Use array instead of string for class list" (T378531)
- 18:10 fnegri@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database nrwiki (T375101)
- 17:35 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s3
- 17:35 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1
- 17:31 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
- 17:26 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s1
- 17:24 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s7
- 17:23 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s7
- 17:21 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet,service=s6
- 17:21 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet,service=s4
- 17:20 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet,service=s6
- 17:20 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet,service=s4
- 17:19 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet,service=s5
- 17:18 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet,service=s8
- 17:11 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=s8
- 17:11 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=s5
- 17:03 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
- 17:03 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
- 17:01 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
- 17:00 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
- 16:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
- 16:58 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
- 16:58 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
- 16:57 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
- 16:54 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
- 16:53 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
- 16:44 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-presto1017.eqiad.wmnet with OS bullseye
- 16:39 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 16:39 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 16:39 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 16:39 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1052.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:38 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 16:38 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:38 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:38 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 16:38 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 16:38 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 16:37 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 16:37 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 16:37 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
- 16:33 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1052.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:26 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1051.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:21 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1051.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:16 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1050.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:11 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-presto1019.eqiad.wmnet with OS bullseye
- 16:09 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Fix bug in BlockManager::getUniqueBlocks (T378563) (duration: 07m 06s)
- 16:08 pfischer@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 16:08 pfischer@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:07 pfischer@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 16:07 pfischer@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:07 pfischer@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 16:06 pfischer@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:06 pfischer@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 16:06 pfischer@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:06 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1050.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:04 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 16:04 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1049.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:04 dreamyjazz@deploy2002: dreamyjazz: Backport for Fix bug in BlockManager::getUniqueBlocks (T378563) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 16:02 pfischer@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 16:02 pfischer@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:01 dreamyjazz@deploy2002: Started scap sync-world: Backport for Fix bug in BlockManager::getUniqueBlocks (T378563)
- 16:01 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-presto1017.eqiad.wmnet with reason: host reimage
- 15:59 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1049.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:57 stevemunene@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-presto1017.eqiad.wmnet with reason: host reimage
- 15:57 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1048.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:56 pfischer@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:55 pfischer@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:55 pfischer@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:54 pfischer@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:52 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1048.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:50 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1047.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:47 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:47 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:45 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1047.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:44 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1046.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:43 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-presto1017.eqiad.wmnet with OS bullseye
- 15:39 moritzm: re-enable Puppet fleet-wide after puppetserver2001 maintenance
- 15:39 moritzm: re-enable Puppet fleet-wide for puppetserver2001 maintenance
- 15:39 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1046.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:38 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1045.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:36 ejegg: Standalone SmashPig upgraded from eaa176f7 to be47dddd
- 15:35 pfischer@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:35 pfischer@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:35 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:35 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:32 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1045.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:32 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:31 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on puppetserver2001.codfw.wmnet with reason: puppetserver2001 maintenance
- 15:29 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on puppetserver2001.codfw.wmnet with reason: puppetserver2001 maintenance
- 15:27 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:27 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:26 moritzm: disable Puppet fleet-wide for puppetserver2001 maintenance
- 15:25 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-presto1016.eqiad.wmnet with OS bullseye
- 15:25 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:24 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:23 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1017.eqiad.wmnet with OS bullseye
- 15:07 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-presto1017.eqiad.wmnet with OS bullseye
- 15:06 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on an-presto1020.eqiad.wmnet with reason: reimaging the hosts to bullseye
- 15:06 stevemunene@cumin1002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on an-presto1020.eqiad.wmnet with reason: reimaging the hosts to bullseye
- 15:05 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on an-presto[1017-1019].eqiad.wmnet with reason: reimaging the hosts to bullseye
- 15:05 stevemunene@cumin1002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on an-presto[1017-1019].eqiad.wmnet with reason: reimaging the hosts to bullseye
- 15:02 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1016.eqiad.wmnet with OS bullseye
- 15:01 elukey@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1002.eqiad.wmnet
- 15:00 elukey@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1002.eqiad.wmnet
- 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on puppetserver2003.codfw.wmnet with reason: RAM expansion
- 14:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on puppetserver2003.codfw.wmnet with reason: RAM expansion
- 14:58 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-ctrl1002.eqiad.wmnet with OS bookworm
- 14:56 fabfur: importing haproxykafka 0.2 package into apt repository (T377613)
- 14:43 joal@deploy2002: Finished deploy [airflow-dags/analytics@ec02629]: Regular analytics weekly train SECOND [airflow-dags/analytics@ec02629d] (duration: 00m 55s)
- 14:42 joal@deploy2002: Started deploy [airflow-dags/analytics@ec02629]: Regular analytics weekly train SECOND [airflow-dags/analytics@ec02629d]
- 14:41 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-ctrl1002.eqiad.wmnet with reason: host reimage
- 14:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2016.codfw.wmnet
- 14:37 dcausse@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
- 14:37 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-ctrl1002.eqiad.wmnet with reason: host reimage
- 14:37 dcausse@deploy2002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
- 14:34 dcausse@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
- 14:34 dcausse@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
- 14:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T376905)', diff saved to https://phabricator.wikimedia.org/P70712 and previous config saved to /var/cache/conftool/dbconfig/20241030-143303-ladsgroup.json
- 14:32 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
- 14:32 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
- 14:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T376905)', diff saved to https://phabricator.wikimedia.org/P70711 and previous config saved to /var/cache/conftool/dbconfig/20241030-143236-ladsgroup.json
- 14:30 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 14:30 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 14:28 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [BlockManager] Don't assume autoblocks have ::getParentBlockId (T378563), [GlobalBlocking] Enable global autoblocks on all WMF wikis (T377760) (duration: 09m 10s)
- 14:23 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 14:23 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host aux-k8s-ctrl1002.eqiad.wmnet with OS bookworm
- 14:22 elukey@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1002.eqiad.wmnet
- 14:22 elukey@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1002.eqiad.wmnet
- 14:21 dreamyjazz@deploy2002: dreamyjazz: Backport for [BlockManager] Don't assume autoblocks have ::getParentBlockId (T378563), [GlobalBlocking] Enable global autoblocks on all WMF wikis (T377760) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on puppetserver2002.codfw.wmnet with reason: RAM expansion
- 14:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on puppetserver2002.codfw.wmnet with reason: RAM expansion
- 14:19 dreamyjazz@deploy2002: Started scap sync-world: Backport for [BlockManager] Don't assume autoblocks have ::getParentBlockId (T378563), [GlobalBlocking] Enable global autoblocks on all WMF wikis (T377760)
- 14:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P70710 and previous config saved to /var/cache/conftool/dbconfig/20241030-141729-ladsgroup.json
- 14:11 urbanecm: mwmaint2002: kill all running instances of `refreshLinkRecommendations.php` (T377150)
- 14:06 urbanecm@deploy2002: Finished scap sync-world: Backport for [BlockManager] Don't assume autoblocks have ::getParentBlockId (T378563), CirrusSearch: Enable offloading weighted tags via EventBus (T377150), cswiki: Add celebration logo (T378597) (duration: 15m 30s)
- 14:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P70709 and previous config saved to /var/cache/conftool/dbconfig/20241030-140222-ladsgroup.json
- 14:01 urbanecm@deploy2002: dreamyjazz, pfischer, urbanecm: Continuing with sync
- 13:58 joal@deploy2002: Finished deploy [airflow-dags/analytics@ec4746b]: Regular analytics weekly train [airflow-dags/analytics@ec4746b5] (duration: 00m 41s)
- 13:57 joal@deploy2002: Started deploy [airflow-dags/analytics@ec4746b]: Regular analytics weekly train [airflow-dags/analytics@ec4746b5]
- 13:53 urbanecm@deploy2002: dreamyjazz, pfischer, urbanecm: Backport for [BlockManager] Don't assume autoblocks have ::getParentBlockId (T378563), CirrusSearch: Enable offloading weighted tags via EventBus (T377150), cswiki: Add celebration logo (T378597) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:50 urbanecm@deploy2002: Started scap sync-world: Backport for [BlockManager] Don't assume autoblocks have ::getParentBlockId (T378563), CirrusSearch: Enable offloading weighted tags via EventBus (T377150), cswiki: Add celebration logo (T378597)
- 13:48 urbanecm@deploy2002: Finished scap sync-world: Backport for Growth [test2wiki]: enable community updates module (T376952), [Growth] beta: configure the A/B test experiment variants (T377233) (duration: 29m 00s)
- 13:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T376905)', diff saved to https://phabricator.wikimedia.org/P70707 and previous config saved to /var/cache/conftool/dbconfig/20241030-134715-ladsgroup.json
- 13:43 urbanecm@deploy2002: sgimeno, urbanecm: Continuing with sync
- 13:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P70704 and previous config saved to /var/cache/conftool/dbconfig/20241030-132204-ladsgroup.json
- 13:20 moritzm: upgrade PHP 7.4 on mwdebug* to 1:7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u2+icu67u3 T378173
- 13:19 urbanecm@deploy2002: Started scap sync-world: Backport for Growth [test2wiki]: enable community updates module (T376952), [Growth] beta: configure the A/B test experiment variants (T377233)
- 13:18 andrewtavis-wmde@deploy2002: Finished deploy [airflow-dags/wmde@ec4746b]: (no justification provided) (duration: 00m 07s)
- 13:18 andrewtavis-wmde@deploy2002: Started deploy [airflow-dags/wmde@ec4746b]: (no justification provided)
- 13:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P70703 and previous config saved to /var/cache/conftool/dbconfig/20241030-130657-ladsgroup.json
- 12:55 andrewtavis-wmde@deploy2002: Finished deploy [airflow-dags/wmde@ec4746b]: (no justification provided) (duration: 00m 11s)
- 12:54 andrewtavis-wmde@deploy2002: Started deploy [airflow-dags/wmde@ec4746b]: (no justification provided)
- 12:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T376905)', diff saved to https://phabricator.wikimedia.org/P70702 and previous config saved to /var/cache/conftool/dbconfig/20241030-125150-ladsgroup.json
- 12:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T376905)', diff saved to https://phabricator.wikimedia.org/P70701 and previous config saved to /var/cache/conftool/dbconfig/20241030-124316-ladsgroup.json
- 12:43 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 12:43 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 12:43 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 12:43 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 12:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T376905)', diff saved to https://phabricator.wikimedia.org/P70700 and previous config saved to /var/cache/conftool/dbconfig/20241030-124256-ladsgroup.json
- 12:30 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Handle a missing parent block in GlobalBlockLookup::getUserBlock (T378447), Handle a missing parent block in GlobalBlockLookup::getUserBlock (T378447), globalblocks API: Hide autoblocks when target param has username and IP (T377855) (duration: 10m 28s)
- 12:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P70699 and previous config saved to /var/cache/conftool/dbconfig/20241030-122749-ladsgroup.json
- 12:25 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 12:22 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 12:22 dreamyjazz@deploy2002: dreamyjazz: Backport for Handle a missing parent block in GlobalBlockLookup::getUserBlock (T378447), Handle a missing parent block in GlobalBlockLookup::getUserBlock (T378447), globalblocks API: Hide autoblocks when target param has username and IP (T377855) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 12:22 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 12:21 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 12:21 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 12:20 dreamyjazz@deploy2002: Started scap sync-world: Backport for Handle a missing parent block in GlobalBlockLookup::getUserBlock (T378447), Handle a missing parent block in GlobalBlockLookup::getUserBlock (T378447), globalblocks API: Hide autoblocks when target param has username and IP (T377855)
- 12:19 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
- 12:19 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
- 12:18 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 12:17 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 12:17 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 12:16 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
- 12:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P70698 and previous config saved to /var/cache/conftool/dbconfig/20241030-121242-ladsgroup.json
- 12:12 moritzm: installing podman security updates
- 12:11 joal@deploy2002: Finished deploy [analytics/refinery@0855ce2] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@0855ce28] (duration: 03m 41s)
- 12:07 joal@deploy2002: Started deploy [analytics/refinery@0855ce2] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@0855ce28]
- 12:04 joal@deploy2002: Finished deploy [analytics/refinery@0855ce2] (thin): Regular analytics weekly train THIN [analytics/refinery@0855ce28] (duration: 06m 54s)
- 11:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T376905)', diff saved to https://phabricator.wikimedia.org/P70697 and previous config saved to /var/cache/conftool/dbconfig/20241030-115735-ladsgroup.json
- 11:57 joal@deploy2002: Started deploy [analytics/refinery@0855ce2] (thin): Regular analytics weekly train THIN [analytics/refinery@0855ce28]
- 11:55 joal@deploy2002: Finished deploy [analytics/refinery@0855ce2]: Regular analytics weekly train [analytics/refinery@0855ce28] (duration: 08m 14s)
- 11:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T376905)', diff saved to https://phabricator.wikimedia.org/P70696 and previous config saved to /var/cache/conftool/dbconfig/20241030-114808-ladsgroup.json
- 11:48 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 11:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 11:47 joal@deploy2002: Started deploy [analytics/refinery@0855ce2]: Regular analytics weekly train [analytics/refinery@0855ce28]
- 11:44 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 11:43 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 11:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 11:41 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 11:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2016.codfw.wmnet
- 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ml-etcd2003.codfw.wmnet to plain
- 11:38 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ml-etcd2003.codfw.wmnet to plain
- 11:38 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1011.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2016.codfw.wmnet
- 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2016.codfw.wmnet
- 11:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ml-etcd2003.codfw.wmnet to drbd
- 11:33 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ml-serve1011.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:28 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1010.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:26 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ml-etcd2003.codfw.wmnet to drbd
- 11:23 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ml-serve1010.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2016.codfw.wmnet
- 11:19 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2016.codfw.wmnet
- 11:19 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-presto1016.eqiad.wmnet with OS bullseye
- 11:17 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1016.eqiad.wmnet with OS bullseye
- 11:14 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ml-serve1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:09 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:09 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ml-serve2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:06 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-presto1016.eqiad.wmnet with OS bullseye
- 11:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2044.codfw.wmnet to cluster codfw and group D
- 11:01 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2044.codfw.wmnet to cluster codfw and group D
- 10:40 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 16347
- 10:40 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 16347
- 10:39 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 16347
- 10:39 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 16347
- 10:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 852
- 10:32 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 852
- 10:31 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 14593
- 10:29 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 14593
- 10:21 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 6461
- 10:18 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 6461
- 10:04 moritzm: installing python-idna security updates
- 09:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P70694 and previous config saved to /var/cache/conftool/dbconfig/20241030-095904-arnaudb.json
- 09:50 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling reboot on A:docker-registry
- 09:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P70693 and previous config saved to /var/cache/conftool/dbconfig/20241030-094357-arnaudb.json
- 09:41 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 40676
- 09:40 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 40676
- 09:38 fabfur: importing haproxykafka package into apt repository (T377613)
- 09:33 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling reboot on A:docker-registry
- 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
- 09:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P70692 and previous config saved to /var/cache/conftool/dbconfig/20241030-092850-arnaudb.json
- 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
- 09:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
- 09:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
- 09:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P70691 and previous config saved to /var/cache/conftool/dbconfig/20241030-091343-arnaudb.json
- 09:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P70690 and previous config saved to /var/cache/conftool/dbconfig/20241030-091131-arnaudb.json
- 09:11 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 09:11 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 09:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P70689 and previous config saved to /var/cache/conftool/dbconfig/20241030-091108-arnaudb.json
- 09:08 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2043.codfw.wmnet to cluster codfw and group D
- 09:07 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2043.codfw.wmnet to cluster codfw and group D
- 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
- 09:00 arnaudb@cumin1002: dbctl commit (dc=all): 'db2223 (re)pooling @ 100%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70688 and previous config saved to /var/cache/conftool/dbconfig/20241030-090002-arnaudb.json
- 08:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
- 08:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P70687 and previous config saved to /var/cache/conftool/dbconfig/20241030-085601-arnaudb.json
- 08:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2223 (re)pooling @ 75%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70685 and previous config saved to /var/cache/conftool/dbconfig/20241030-084457-arnaudb.json
- 08:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P70684 and previous config saved to /var/cache/conftool/dbconfig/20241030-084054-arnaudb.json
- 08:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2223 (re)pooling @ 50%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70683 and previous config saved to /var/cache/conftool/dbconfig/20241030-082952-arnaudb.json
- 08:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2239.codfw.wmnet with reason: host in preparation
- 08:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2239.codfw.wmnet with reason: host in preparation
- 08:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P70682 and previous config saved to /var/cache/conftool/dbconfig/20241030-082547-arnaudb.json
- 08:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2223 (re)pooling @ 25%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70680 and previous config saved to /var/cache/conftool/dbconfig/20241030-081446-arnaudb.json
- 07:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2223 (re)pooling @ 10%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70678 and previous config saved to /var/cache/conftool/dbconfig/20241030-075941-arnaudb.json
- 07:57 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 07:52 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 07:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2223 (re)pooling @ 5%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70677 and previous config saved to /var/cache/conftool/dbconfig/20241030-074436-arnaudb.json
- 07:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2223 (re)pooling @ 4%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70676 and previous config saved to /var/cache/conftool/dbconfig/20241030-072930-arnaudb.json
- 07:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P70675 and previous config saved to /var/cache/conftool/dbconfig/20241030-072520-arnaudb.json
- 07:25 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 07:25 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 07:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 07:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 07:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2223 (re)pooling @ 2%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70674 and previous config saved to /var/cache/conftool/dbconfig/20241030-071425-arnaudb.json
- 06:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2223 (re)pooling @ 1%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70673 and previous config saved to /var/cache/conftool/dbconfig/20241030-065920-arnaudb.json
- 06:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.sanitize-pii (exit_code=0) Managing PII for wikis tcywikisource, tcywiktionary in section s5
- 06:47 arnaudb@cumin1002: START - Cookbook sre.mysql.sanitize-pii Managing PII for wikis tcywikisource, tcywiktionary in section s5
- 06:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.sanitize-pii (exit_code=0) Checking PII for wikis tcywikisource in section s5
- 06:46 arnaudb@cumin1002: START - Cookbook sre.mysql.sanitize-pii Checking PII for wikis tcywikisource in section s5
- 00:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T376905)', diff saved to https://phabricator.wikimedia.org/P70672 and previous config saved to /var/cache/conftool/dbconfig/20241030-003847-ladsgroup.json
- 00:28 zabe@deploy2002: Finished scap sync-world: update interwiki cache (duration: 09m 01s)
- 00:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P70671 and previous config saved to /var/cache/conftool/dbconfig/20241030-002340-ladsgroup.json
- 00:19 zabe@deploy2002: Started scap sync-world: update interwiki cache
- 00:14 zabe: zabe@mwmaint2002:~$ mwscript extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php --wiki=tcywikisource --cluster=all 2>&1 | tee /tmp/tcywikisource.UpdateSearchIndexConfig.log # T377919
- 00:11 zabe@deploy2002: Finished scap sync-world: Creating tcywikisource (T377919) (duration: 08m 13s)
- 00:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P70670 and previous config saved to /var/cache/conftool/dbconfig/20241030-000833-ladsgroup.json
- 00:03 zabe@deploy2002: Started scap sync-world: Creating tcywikisource (T377919)
2024-10-29
- 23:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T376905)', diff saved to https://phabricator.wikimedia.org/P70669 and previous config saved to /var/cache/conftool/dbconfig/20241029-235326-ladsgroup.json
- 23:53 zabe: zabe@mwmaint2002:~$ mwscript extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php --wiki=tcywiktionary --cluster=all 2>&1 | tee /tmp/tcywiktionary.UpdateSearchIndexConfig.log # T377922
- 23:48 zabe@deploy2002: Finished scap sync-world: Creating tcywiktionary (T377922) (duration: 07m 26s)
- 23:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2216 (T376905)', diff saved to https://phabricator.wikimedia.org/P70668 and previous config saved to /var/cache/conftool/dbconfig/20241029-234608-ladsgroup.json
- 23:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
- 23:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
- 23:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T376905)', diff saved to https://phabricator.wikimedia.org/P70667 and previous config saved to /var/cache/conftool/dbconfig/20241029-234541-ladsgroup.json
- 23:41 zabe@deploy2002: Started scap sync-world: Creating tcywiktionary (T377922)
- 23:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P70666 and previous config saved to /var/cache/conftool/dbconfig/20241029-233034-ladsgroup.json
- 23:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P70665 and previous config saved to /var/cache/conftool/dbconfig/20241029-231527-ladsgroup.json
- 23:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T376905)', diff saved to https://phabricator.wikimedia.org/P70664 and previous config saved to /var/cache/conftool/dbconfig/20241029-230020-ladsgroup.json
- 22:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2202.codfw.wmnet with reason: Maintenance
- 22:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2202.codfw.wmnet with reason: Maintenance
- 22:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T376905)', diff saved to https://phabricator.wikimedia.org/P70662 and previous config saved to /var/cache/conftool/dbconfig/20241029-224717-ladsgroup.json
- 22:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P70661 and previous config saved to /var/cache/conftool/dbconfig/20241029-223210-ladsgroup.json
- 22:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P70660 and previous config saved to /var/cache/conftool/dbconfig/20241029-221703-ladsgroup.json
- 22:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T376905)', diff saved to https://phabricator.wikimedia.org/P70659 and previous config saved to /var/cache/conftool/dbconfig/20241029-220156-ladsgroup.json
- 21:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T376905)', diff saved to https://phabricator.wikimedia.org/P70658 and previous config saved to /var/cache/conftool/dbconfig/20241029-215443-ladsgroup.json
- 21:54 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
- 21:54 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
- 21:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T376905)', diff saved to https://phabricator.wikimedia.org/P70657 and previous config saved to /var/cache/conftool/dbconfig/20241029-215417-ladsgroup.json
- 21:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P70656 and previous config saved to /var/cache/conftool/dbconfig/20241029-213910-ladsgroup.json
- 21:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P70655 and previous config saved to /var/cache/conftool/dbconfig/20241029-212402-ladsgroup.json
- 21:09 eileen: civicrm upgraded from 0b7f3b47 to 0eb881ca
- 21:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T376905)', diff saved to https://phabricator.wikimedia.org/P70654 and previous config saved to /var/cache/conftool/dbconfig/20241029-210855-ladsgroup.json
- 20:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T376905)', diff saved to https://phabricator.wikimedia.org/P70653 and previous config saved to /var/cache/conftool/dbconfig/20241029-205718-ladsgroup.json
- 20:57 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 20:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 20:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T376905)', diff saved to https://phabricator.wikimedia.org/P70652 and previous config saved to /var/cache/conftool/dbconfig/20241029-205652-ladsgroup.json
- 20:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P70651 and previous config saved to /var/cache/conftool/dbconfig/20241029-204145-ladsgroup.json
- 20:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P70650 and previous config saved to /var/cache/conftool/dbconfig/20241029-202638-ladsgroup.json
- 20:14 kostajh: UTC late deploys done
- 20:12 kharlan@deploy2002: Finished scap sync-world: Backport for QuickSurveys: Undeploy safety survey (T376517), Missing.php: redirect wikisources to localized main page (duration: 09m 16s)
- 20:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T376905)', diff saved to https://phabricator.wikimedia.org/P70649 and previous config saved to /var/cache/conftool/dbconfig/20241029-201131-ladsgroup.json
- 20:08 kharlan@deploy2002: pppery, kharlan: Continuing with sync
- 20:05 kharlan@deploy2002: pppery, kharlan: Backport for QuickSurveys: Undeploy safety survey (T376517), Missing.php: redirect wikisources to localized main page synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:03 kharlan@deploy2002: Started scap sync-world: Backport for QuickSurveys: Undeploy safety survey (T376517), Missing.php: redirect wikisources to localized main page
- 20:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T376905)', diff saved to https://phabricator.wikimedia.org/P70648 and previous config saved to /var/cache/conftool/dbconfig/20241029-200056-ladsgroup.json
- 20:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 20:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 20:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T376905)', diff saved to https://phabricator.wikimedia.org/P70647 and previous config saved to /var/cache/conftool/dbconfig/20241029-200029-ladsgroup.json
- 19:56 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on an-worker1165.eqiad.wmnet with reason: T378454
- 19:55 bking@cumin2002: START - Cookbook sre.hosts.downtime for 6 days, 0:00:00 on an-worker1165.eqiad.wmnet with reason: T378454
- 19:48 eileen: civicrm upgraded from 8f5c8b33 to 0b7f3b47
- 19:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P70646 and previous config saved to /var/cache/conftool/dbconfig/20241029-194522-ladsgroup.json
- 19:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P70645 and previous config saved to /var/cache/conftool/dbconfig/20241029-193015-ladsgroup.json
- 19:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T376905)', diff saved to https://phabricator.wikimedia.org/P70644 and previous config saved to /var/cache/conftool/dbconfig/20241029-191508-ladsgroup.json
- 19:05 eileen: civicrm upgraded from 1c6c4e08 to 8f5c8b33
- 19:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2173 (T376905)', diff saved to https://phabricator.wikimedia.org/P70643 and previous config saved to /var/cache/conftool/dbconfig/20241029-190442-ladsgroup.json
- 19:04 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 19:04 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 19:04 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 19:04 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 19:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T376905)', diff saved to https://phabricator.wikimedia.org/P70642 and previous config saved to /var/cache/conftool/dbconfig/20241029-190359-ladsgroup.json
- 18:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P70641 and previous config saved to /var/cache/conftool/dbconfig/20241029-184852-ladsgroup.json
- 18:37 swfrench-wmf: shellbox-syntaxhighlight updated to shellbox 2024-10-15-214239 - T375243
- 18:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P70640 and previous config saved to /var/cache/conftool/dbconfig/20241029-183345-ladsgroup.json
- 18:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:31 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T376905)', diff saved to https://phabricator.wikimedia.org/P70639 and previous config saved to /var/cache/conftool/dbconfig/20241029-181838-ladsgroup.json
- 18:10 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.44.0-wmf.1 refs T375660
- 18:10 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2170 (T376905)', diff saved to https://phabricator.wikimedia.org/P70638 and previous config saved to /var/cache/conftool/dbconfig/20241029-180816-ladsgroup.json
- 18:08 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 18:08 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 18:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T376905)', diff saved to https://phabricator.wikimedia.org/P70637 and previous config saved to /var/cache/conftool/dbconfig/20241029-180750-ladsgroup.json
- 17:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P70636 and previous config saved to /var/cache/conftool/dbconfig/20241029-175243-ladsgroup.json
- 17:51 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:50 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:49 brett: Remove RSA cert support from Icinga, librenms (T375569)
- 17:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P70635 and previous config saved to /var/cache/conftool/dbconfig/20241029-173735-ladsgroup.json
- 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 17:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
- 17:31 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
- 17:30 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
- 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
- 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 17:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 17:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T376905)', diff saved to https://phabricator.wikimedia.org/P70634 and previous config saved to /var/cache/conftool/dbconfig/20241029-172228-ladsgroup.json
- 17:17 sergi0: Running `foreachwiki userOptions.php --delete --old=A --old=D --old=C --old=null --old=imagerecommendation --old=linkrecommendation growthexperiments-homepage-variant`
- 17:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T376905)', diff saved to https://phabricator.wikimedia.org/P70633 and previous config saved to /var/cache/conftool/dbconfig/20241029-171258-ladsgroup.json
- 17:13 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 17:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 17:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 17:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 17:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T376905)', diff saved to https://phabricator.wikimedia.org/P70632 and previous config saved to /var/cache/conftool/dbconfig/20241029-170657-ladsgroup.json
- 17:05 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1043.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:00 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1043.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:58 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
- 16:58 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
- 16:57 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
- 16:56 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
- 16:55 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
- 16:55 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1042.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:54 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
- 16:54 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
- 16:53 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
- 16:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P70631 and previous config saved to /var/cache/conftool/dbconfig/20241029-165150-ladsgroup.json
- 16:49 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1042.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:47 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1041.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:47 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1016.eqiad.wmnet with OS bullseye
- 16:42 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1041.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:40 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1040.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P70630 and previous config saved to /var/cache/conftool/dbconfig/20241029-163643-ladsgroup.json
- 16:35 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1040.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:31 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1039.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:26 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1039.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:26 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 16:26 elukey@cumin1002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 16:25 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2041.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 16:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T376905)', diff saved to https://phabricator.wikimedia.org/P70629 and previous config saved to /var/cache/conftool/dbconfig/20241029-162136-ladsgroup.json
- 16:21 rzl@deploy2002: Finished scap sync-world: 1079056 T376923 (duration: 11m 47s)
- 16:19 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti2041.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 16:18 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2044.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 16:16 rzl@deploy2002: rzl: Continuing with sync
- 16:16 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:15 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:14 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti2044.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 16:13 rzl@deploy2002: rzl: 1079056 T376923 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 16:12 rzl@deploy2002: Started scap sync-world: 1079056 T376923
- 16:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T376905)', diff saved to https://phabricator.wikimedia.org/P70627 and previous config saved to /var/cache/conftool/dbconfig/20241029-161103-ladsgroup.json
- 16:11 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 16:10 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 16:07 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2043.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 16:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 16:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 16:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T376905)', diff saved to https://phabricator.wikimedia.org/P70626 and previous config saved to /var/cache/conftool/dbconfig/20241029-160607-ladsgroup.json
- 16:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
- 16:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
- 16:03 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:02 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:01 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti2043.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 16:00 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2040.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 15:56 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
- 15:56 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
- 15:55 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
- 15:55 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti2040.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 15:54 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
- 15:54 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
- 15:54 stevemunene@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 15:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P70625 and previous config saved to /var/cache/conftool/dbconfig/20241029-155101-ladsgroup.json
- 15:47 moritzm: installing libheif security updates
- 15:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P70624 and previous config saved to /var/cache/conftool/dbconfig/20241029-153554-ladsgroup.json
- 15:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
- 15:25 XioNoX: test preferring lumen-ATT path in eqiad
- 15:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
- 15:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T376905)', diff saved to https://phabricator.wikimedia.org/P70623 and previous config saved to /var/cache/conftool/dbconfig/20241029-152047-ladsgroup.json
- 15:17 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2039.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 15:14 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1003.eqiad.wmnet with OS bookworm
- 15:12 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti2039.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 15:10 claime: Running `/usr/bin/systemd-cat -t "import-wikitech.sh" /wikitech-static/wikitechsync/import-wikitech.sh &` on wikitech-static - T348503
- 15:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2130 (T376905)', diff saved to https://phabricator.wikimedia.org/P70622 and previous config saved to /var/cache/conftool/dbconfig/20241029-150953-ladsgroup.json
- 15:10 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance
- 15:09 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance
- 15:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T376905)', diff saved to https://phabricator.wikimedia.org/P70621 and previous config saved to /var/cache/conftool/dbconfig/20241029-150926-ladsgroup.json
- 15:08 claime: Running `find /srv/mediawiki/images/wikitech/archive -type f | xargs rm` on wikitech-static - T374114 T348503
- 15:00 claime: Running php maintenance/deleteArchivedFiles.php --delete on wikitech-static - T374114
- 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
- 14:55 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1003.eqiad.wmnet with reason: host reimage
- 14:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P70619 and previous config saved to /var/cache/conftool/dbconfig/20241029-145419-ladsgroup.json
- 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
- 14:52 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1003.eqiad.wmnet with reason: host reimage
- 14:52 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2038.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:47 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti2038.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:44 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 14:40 reedy@deploy2002: Finished scap sync-world: 1.44.0-wmf.1 backports to fix deprecated logspam T375660 T377521 (duration: 07m 21s)
- 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
- 14:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P70616 and previous config saved to /var/cache/conftool/dbconfig/20241029-143912-ladsgroup.json
- 14:39 herron: centrallog1002:~# systemctl restart rsyslogd
- 14:38 elukey@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 14:35 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
- 14:35 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2037.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:34 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1003.eqiad.wmnet with OS bookworm
- 14:32 reedy@deploy2002: Started scap sync-world: 1.44.0-wmf.1 backports to fix deprecated logspam T375660 T377521
- 14:29 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2037.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:29 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2037.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:25 MichaelG_WMF: T372337 clearing dangling database-records for link suggestions by running `mwscript extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php --wiki=eswiki --db-table --force`
- 14:24 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2037.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T376905)', diff saved to https://phabricator.wikimedia.org/P70615 and previous config saved to /var/cache/conftool/dbconfig/20241029-142405-ladsgroup.json
- 14:20 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 14:19 elukey: restart rsyslog on centrallog1002 - connection errors, failing prometheus probes
- 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
- 14:18 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
- 14:17 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ml-lab1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 14:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T376905)', diff saved to https://phabricator.wikimedia.org/P70614 and previous config saved to /var/cache/conftool/dbconfig/20241029-141532-ladsgroup.json
- 14:16 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance
- 14:15 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance
- 14:14 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2036.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:09 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2036.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:07 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 14:06 kostajh: UTC afternoon deploys done
- 14:05 kharlan@deploy2002: Finished scap sync-world: Backport for AuthManagerStatsdHandler: Add label for wiki (T375505), AuthManagerStatsdHandler: Add label for wiki (T375505) (duration: 07m 53s)
- 14:01 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 14:00 kharlan@deploy2002: kharlan: Continuing with sync
- 13:59 kharlan@deploy2002: kharlan: Backport for AuthManagerStatsdHandler: Add label for wiki (T375505), AuthManagerStatsdHandler: Add label for wiki (T375505) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
- 13:57 kharlan@deploy2002: Started scap sync-world: Backport for AuthManagerStatsdHandler: Add label for wiki (T375505), AuthManagerStatsdHandler: Add label for wiki (T375505)
- 13:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
- 13:48 jforrester@deploy2002: Finished scap sync-world: Backport for fix ibawiki's tagline svg path (duration: 07m 41s)
- 13:47 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 16347
- 13:46 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 16347
- 13:45 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 16347
- 13:45 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 16347
- 13:43 jforrester@deploy2002: jforrester, hamishz: Continuing with sync
- 13:42 jforrester@deploy2002: jforrester, hamishz: Backport for fix ibawiki's tagline svg path synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:42 moritzm: installing ghostscript security updates
- 13:40 jforrester@deploy2002: Started scap sync-world: Backport for fix ibawiki's tagline svg path
- 13:38 jforrester@deploy2002: Finished scap sync-world: Backport for Allow admins on testwiki to grant and remove upwizcampeditors (T378067), nlwiki, commonswiki, wikidata: lift IP cap for edit-a-thon (T377930) (duration: 08m 03s)
- 13:34 jforrester@deploy2002: dreamrimmer, superzerocool, jforrester: Continuing with sync
- 13:33 jforrester@deploy2002: dreamrimmer, superzerocool, jforrester: Backport for Allow admins on testwiki to grant and remove upwizcampeditors (T378067), nlwiki, commonswiki, wikidata: lift IP cap for edit-a-thon (T377930) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:31 jforrester@deploy2002: Started scap sync-world: Backport for Allow admins on testwiki to grant and remove upwizcampeditors (T378067), nlwiki, commonswiki, wikidata: lift IP cap for edit-a-thon (T377930)
- 13:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 100%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70612 and previous config saved to /var/cache/conftool/dbconfig/20241029-132956-arnaudb.json
- 13:30 mszabo@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 13:30 mszabo@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 13:29 mszabo@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 13:28 jforrester@deploy2002: Finished scap sync-world: Backport for annwiki: Add logo (T377535), kgewiki: Add logo (T377075), shnwikinews: Add logo (T377543), gorwikiquote: Add logo (T377542), moswiki: Add logo (T377539), ibawiki: Add logo (T377538), rskwiki: Add logo (T377536), [[gerrit:10840
- 13:28 mszabo@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 13:27 mszabo@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 13:26 mszabo@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
- 13:23 jforrester@deploy2002: jforrester, hamishz: Continuing with sync
- 13:22 jforrester@deploy2002: jforrester, hamishz: Backport for annwiki: Add logo (T377535), kgewiki: Add logo (T377075), shnwikinews: Add logo (T377543), gorwikiquote: Add logo (T377542), moswiki: Add logo (T377539), ibawiki: Add logo (T377538), rskwiki: Add logo (T377536), [[gerrit:1084079|td
- 13:20 jforrester@deploy2002: Started scap sync-world: Backport for annwiki: Add logo (T377535), kgewiki: Add logo (T377075), shnwikinews: Add logo (T377543), gorwikiquote: Add logo (T377542), moswiki: Add logo (T377539), ibawiki: Add logo (T377538), rskwiki: Add logo (T377536), [[gerrit:108407
- 13:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
- 13:17 jforrester@deploy2002: Finished scap sync-world: Backport for ExtensionDistributor: Mark 1.43 as beta (T372322), ExtensionDistributor: Remove EOL 1.40 (T364989), enwiktionary: Enable mobile page tabs for non logged in users (T377648) (duration: 12m 41s)
- 13:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 75%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70610 and previous config saved to /var/cache/conftool/dbconfig/20241029-131451-arnaudb.json
- 13:11 jforrester@deploy2002: zabe, macfan4000, hamishz, jforrester: Continuing with sync
- 13:10 jforrester@deploy2002: zabe, macfan4000, hamishz, jforrester: Backport for ExtensionDistributor: Mark 1.43 as beta (T372322), ExtensionDistributor: Remove EOL 1.40 (T364989), enwiktionary: Enable mobile page tabs for non logged in users (T377648) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
- 13:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
- 13:05 jforrester@deploy2002: Started scap sync-world: Backport for ExtensionDistributor: Mark 1.43 as beta (T372322), ExtensionDistributor: Remove EOL 1.40 (T364989), enwiktionary: Enable mobile page tabs for non logged in users (T377648)
- 12:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 50%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70607 and previous config saved to /var/cache/conftool/dbconfig/20241029-125945-arnaudb.json
- 12:50 moritzm: installing Apache security updates
- 12:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 25%: post clone repool', diff saved to https://phabricator.wikimedia.org/P70606 and previous config saved to /var/cache/conftool/dbconfig/20241029-124440-arnaudb.json
- 12:43 claime: Manually relaunched import-wikitech.sh on wikitech-static - T374114
- 12:42 claime: Killed dead and stacked import-wikitech.sh processes on wikitech-static - T374114
- 12:28 andrewtavis-wmde@deploy2002: Finished deploy [airflow-dags/wmde@d85a93c]: (no justification provided) (duration: 00m 30s)
- 12:27 andrewtavis-wmde@deploy2002: Started deploy [airflow-dags/wmde@d85a93c]: (no justification provided)
- 12:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2015.codfw.wmnet
- 12:04 cgoubert@deploy2002: Finished scap sync-world: T377958 - full mediawiki image rebuild and deployment to add helper scripts for mwcron, mwscript (duration: 29m 44s)
- 11:39 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2044.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:39 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2044.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:36 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2041.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:36 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2041.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:36 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2040.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:36 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2040.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:35 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2039.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:35 cgoubert@deploy2002: Started scap sync-world: T377958 - full mediawiki image rebuild and deployment to add helper scripts for mwcron, mwscript
- 11:35 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2039.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:35 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2038.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:34 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2038.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:33 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2037.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:33 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2037.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:32 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2036.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:32 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2036.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:30 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1052.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:30 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1052.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:29 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1051.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:29 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1051.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:29 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1050.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:29 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1050.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:28 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1049.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:28 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1049.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:27 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1048.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:27 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1048.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:27 claime: Rebuilding php{7.4,8.1}-fpm-multiversion-base - T377958
- 11:26 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1047.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:26 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1047.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:25 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1046.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:25 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1046.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:24 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1045.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:24 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1045.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:23 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:23 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:23 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1043.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:22 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1043.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:21 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1042.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:21 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1042.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:18 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1041.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:18 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1041.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:16 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1040.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:16 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1040.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:15 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1039.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:15 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1039.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:11 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:11 elukey@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:10 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:10 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ml-lab1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:10 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:09 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:09 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1011.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:09 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve1011.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:07 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1010.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:07 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve1010.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:05 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:05 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 11:02 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:01 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 10:59 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2044.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 10:59 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2043.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 10:58 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2043.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 10:53 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2042.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 10:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 10:50 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2042.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 10:14 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2015.codfw.wmnet
- 10:08 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 10:07 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-presto1016.eqiad.wmnet with OS bullseye
- 10:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2015.codfw.wmnet
- 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2015.codfw.wmnet
- 09:56 moritzm: installing wireshark security updates
- 09:41 kostajh: UTC morning deploys done
- 09:23 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 09:23 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 09:22 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 09:21 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 09:20 kharlan@deploy2002: Finished scap sync-world: Backport for temp accounts: Enable temp account autocreation on five pilot wikis (T378334), beta: enable "Surfacing structured tasks" for an early beta-wiki (T376677) (duration: 24m 42s)
- 09:20 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 09:20 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 09:16 kharlan@deploy2002: migr, kharlan: Continuing with sync
- 09:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: maintenance, host is not pooled
- 09:13 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: maintenance, host is not pooled
- 09:07 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 09:07 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 08:58 kharlan@deploy2002: migr, kharlan: Backport for temp accounts: Enable temp account autocreation on five pilot wikis (T378334), beta: enable "Surfacing structured tasks" for an early beta-wiki (T376677) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:56 kharlan@deploy2002: Started scap sync-world: Backport for temp accounts: Enable temp account autocreation on five pilot wikis (T378334), beta: enable "Surfacing structured tasks" for an early beta-wiki (T376677)
- 08:55 moritzm: upgrade irc.wikimedia.org to ircstream 1.0+wmf12u1 T376014
- 08:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: T378320', diff saved to https://phabricator.wikimedia.org/P70604 and previous config saved to /var/cache/conftool/dbconfig/20241029-085507-arnaudb.json
- 08:53 kharlan@deploy2002: Finished scap sync-world: Backport for Unblock CI (T377947), StatsLib: Set label for wiki ID (T375496) (duration: 13m 06s)
- 08:52 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 08:52 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 08:51 moritzm: uploaded ircstream 1.0+wmf12u1 to apt.wikimedia.org T376014
- 08:49 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 56258
- 08:48 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 56258
- 08:47 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 264567
- 08:47 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 264567
- 08:47 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 16591
- 08:46 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 16591
- 08:46 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 200478
- 08:46 kharlan@deploy2002: kharlan: Continuing with sync
- 08:45 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 200478
- 08:45 kharlan@deploy2002: kharlan: Backport for Unblock CI (T377947), StatsLib: Set label for wiki ID (T375496) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:44 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 56258
- 08:44 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 56258
- 08:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 8966
- 08:42 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 8966
- 08:42 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9038
- 08:41 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 9038
- 08:41 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 16347
- 08:41 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 08:41 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2211.codfw.wmnet onto db2223.codfw.wmnet
- 08:41 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 16347
- 08:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: T378320', diff saved to https://phabricator.wikimedia.org/P70603 and previous config saved to /var/cache/conftool/dbconfig/20241029-084002-arnaudb.json
- 08:40 kharlan@deploy2002: Started scap sync-world: Backport for Unblock CI (T377947), StatsLib: Set label for wiki ID (T375496)
- 08:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2211 in db2223 for T373579', diff saved to https://phabricator.wikimedia.org/P70602 and previous config saved to /var/cache/conftool/dbconfig/20241029-083035-arnaudb.json
- 08:29 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 28306
- 08:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2211 - depooling db2211 to clone on db2223
- 08:29 arnaudb@cumin1002: START - Cookbook sre.mysql.depool db2211 - depooling db2211 to clone on db2223
- 08:29 arnaudb@cumin1002: dbctl commit (dc=all): 'depool preshot db2211', diff saved to https://phabricator.wikimedia.org/P70601 and previous config saved to /var/cache/conftool/dbconfig/20241029-082903-arnaudb.json
- 08:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: provisioning db2223.codfw.wmnet - T373579
- 08:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: provisioning db2223.codfw.wmnet - T373579
- 08:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: provisioning db2223.codfw.wmnet - T373579
- 08:28 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 28306
- 08:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: provisioning db2223.codfw.wmnet - T373579
- 08:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: T378320', diff saved to https://phabricator.wikimedia.org/P70600 and previous config saved to /var/cache/conftool/dbconfig/20241029-082456-arnaudb.json
- 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts irc2004.wikimedia.org
- 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: irc2004.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: irc2004.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:15 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:09 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts irc2004.wikimedia.org
- 08:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: T378320', diff saved to https://phabricator.wikimedia.org/P70599 and previous config saved to /var/cache/conftool/dbconfig/20241029-080951-arnaudb.json
- 08:08 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) db1169 quickly with 2 steps - index rebuilt
- 08:08 arnaudb@cumin1002: START - Cookbook sre.mysql.pool db1169 quickly with 2 steps - index rebuilt
- 08:08 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) db1169 gradually with 4 steps - index rebuilt
- 08:08 arnaudb@cumin1002: START - Cookbook sre.mysql.pool db1169 gradually with 4 steps - index rebuilt
- 08:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts irc1004.wikimedia.org
- 08:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: irc1004.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:06 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) db1169 gradually with 4 steps - index rebuilt
- 08:06 arnaudb@cumin1002: START - Cookbook sre.mysql.pool db1169 gradually with 4 steps - index rebuilt
- 08:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: irc1004.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:03 moritzm: installing qemu security updates
- 07:58 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 07:53 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts irc1004.wikimedia.org
- 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.43.0-wmf.26 (duration: 01m 04s)
- 03:53 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.44.0-wmf.1 refs T375660 (duration: 49m 51s)
- 03:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.44.0-wmf.1 refs T375660
2024-10-28
- 23:08 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 23:08 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 23:06 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 23:06 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 23:05 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 23:05 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 23:04 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 23:03 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 23:03 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 23:02 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 23:01 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 23:01 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
- 22:28 ebernhardson@deploy2002: Finished deploy [airflow-dags/search@d85a93c]: add missing comma (duration: 00m 36s)
- 22:27 ebernhardson@deploy2002: Started deploy [airflow-dags/search@d85a93c]: add missing comma
- 22:10 ebernhardson@deploy2002: Finished deploy [airflow-dags/search@99eb6f3]: T375387: update discolytics to 0.27.0 (duration: 00m 50s)
- 22:09 ebernhardson@deploy2002: Started deploy [airflow-dags/search@99eb6f3]: T375387: update discolytics to 0.27.0
- 22:00 ryankemper: T372074 `sudo requestctl delete ipblock abuse/wdqs` && `sudo requestctl delete pattern ua/wdqs_sparql` to clean up objects removed in commit `d26fc1e910579d33d33ec3d5a192d137045eba4b` ( <-- this occurred before the requestctl commit; i just missed making the irc log)
- 21:48 ryankemper: T372074 `sudo requestctl commit`
- 21:29 kostajh: UTC late deploys done, for real
- 21:26 ryankemper: T372074 `sudo requestctl delete action cache-text/T372074` && `sudo requestctl delete action cache-text/T372074_wdqs_codfw_flap`
- 21:26 kharlan@deploy2002: Finished scap sync-world: Backport for GlobalContributionsPager: Make article link redirect to the page (T378155) (duration: 09m 01s)
- 21:21 kharlan@deploy2002: kharlan: Continuing with sync
- 21:19 kharlan@deploy2002: kharlan: Backport for GlobalContributionsPager: Make article link redirect to the page (T378155) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:17 kharlan@deploy2002: Started scap sync-world: Backport for GlobalContributionsPager: Make article link redirect to the page (T378155)
- 20:44 kostajh: UTC late deploys done
- 20:42 kharlan@deploy2002: Finished scap sync-world: Backport for Partial Revert "Make sure contributor's name is on its line" (T378142), Restore missing second argument to "mapState" in QuickView.vue (T378204), GlobalContributionsPager: Use Special:PermanentLink to construct link (T378155), [[gerrit:1083886|GlobalContributionsPager: Don't display external namespace in
- 20:37 kharlan@deploy2002: jdlrobson, kharlan: Continuing with sync
- 20:33 kharlan@deploy2002: jdlrobson, kharlan: Backport for Partial Revert "Make sure contributor's name is on its line" (T378142), Restore missing second argument to "mapState" in QuickView.vue (T378204), GlobalContributionsPager: Use Special:PermanentLink to construct link (T378155), [[gerrit:1083886|GlobalContributionsPager: Don't display external namespace in artic
- 20:30 kharlan@deploy2002: Started scap sync-world: Backport for Partial Revert "Make sure contributor's name is on its line" (T378142), Restore missing second argument to "mapState" in QuickView.vue (T378204), GlobalContributionsPager: Use Special:PermanentLink to construct link (T378155), [[gerrit:1083886|GlobalContributionsPager: Don't display external namespace in
- 19:52 brett: Removed RSA certificate support from tlsproxy (T375569)
- 19:33 brett: Removed RSA certificate support from mirrors, dumps (T375569)
- 19:27 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 19:26 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 19:24 gmodena@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 19:24 gmodena@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
- 19:23 brett: Removed RSA certificate support from ldap, archiva, durum
- 19:21 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 19:21 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 19:18 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich-next: apply
- 19:17 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich-next: apply
- 19:15 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
- 19:15 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
- 19:14 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich-next: apply
- 19:13 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich-next: apply
- 17:43 jiawang@deploy2002: Finished deploy [airflow-dags/analytics_product@a7456f9]: deploy tsp pipelines (duration: 01m 33s)
- 17:42 jiawang@deploy2002: Started deploy [airflow-dags/analytics_product@a7456f9]: deploy tsp pipelines
- 17:04 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on cloudvirt1063.eqiad.wmnet with reason: cloudvirt1063 needs maintenance T375223
- 17:03 fnegri@cumin1002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on cloudvirt1063.eqiad.wmnet with reason: cloudvirt1063 needs maintenance T375223
- 16:55 vgutierrez@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1013.eqiad.wmnet
- 16:50 vgutierrez@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs1013.eqiad.wmnet
- 16:50 vgutierrez@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1014.eqiad.wmnet
- 16:44 vgutierrez@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs1014.eqiad.wmnet
- 16:44 vgutierrez@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1015.eqiad.wmnet
- 16:38 vgutierrez@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs1015.eqiad.wmnet
- 16:38 vgutierrez@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1016.eqiad.wmnet
- 16:32 vgutierrez@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs1016.eqiad.wmnet
- 16:26 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
- 16:20 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
- 16:20 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
- 16:16 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
- 15:51 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 02m 25s)
- 15:49 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 07m 35s)
- 15:48 XioNoX: re-enable IX BGP sessions in eqiad
- 15:30 jan_drewniak: starting portals deployment
- 15:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 15:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 15:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 15:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:51 MichaelG_WMF: T372337 - run `mwscript extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php --wiki=eswiki --search-index` to fix the remaining ca. 10K dangling search index records
- 14:37 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2042.codfw.wmnet to cluster codfw and group D
- 14:36 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2042.codfw.wmnet to cluster codfw and group D
- 14:08 urbanecm@deploy2002: Finished scap sync-world: Backport for knwiktionary: update logo, wordmark (T360022), hewikisource: add project namespace alias (T378303), Add config for testing T375264 on beta (T377988) (duration: 10m 43s)
- 14:04 urbanecm@deploy2002: anzx, cparle, urbanecm: Continuing with sync
- 14:01 urbanecm@deploy2002: anzx, cparle, urbanecm: Backport for knwiktionary: update logo, wordmark (T360022), hewikisource: add project namespace alias (T378303), Add config for testing T375264 on beta (T377988) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:58 urbanecm@deploy2002: Started scap sync-world: Backport for knwiktionary: update logo, wordmark (T360022), hewikisource: add project namespace alias (T378303), Add config for testing T375264 on beta (T377988)
- 13:57 urbanecm@deploy2002: Sync cancelled.
- 13:54 urbanecm@deploy2002: anzx, urbanecm: Backport for knwiktionary: update logo, wordmark (T360022) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:52 urbanecm@deploy2002: Started scap sync-world: Backport for knwiktionary: update logo, wordmark (T360022)
- 13:49 arnaudb@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2211 quickly with 2 steps - test fast pool
- 13:41 urbanecm@deploy2002: Finished scap sync-world: Backport for Enable CampaignEvents collaboration list by default (T375141), beta: Drop $wgCampaignEventsShowEventInvitationSpecialPages (T373442), prod: Drop $wgCampaignEventsShowEventInvitationSpecialPages (T373442) (duration: 13m 43s)
- 13:38 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2041.codfw.wmnet to cluster codfw and group D
- 13:37 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2041.codfw.wmnet to cluster codfw and group D
- 13:36 urbanecm@deploy2002: urbanecm, daimona: Continuing with sync
- 13:33 arnaudb@cumin2002: START - Cookbook sre.mysql.pool db2211 quickly with 2 steps - test fast pool
- 13:31 arnaudb@cumin2002: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2211 - test depool
- 13:31 arnaudb@cumin2002: START - Cookbook sre.mysql.depool db2211 - test depool
- 13:29 urbanecm@deploy2002: urbanecm, daimona: Backport for Enable CampaignEvents collaboration list by default (T375141), beta: Drop $wgCampaignEventsShowEventInvitationSpecialPages (T373442), prod: Drop $wgCampaignEventsShowEventInvitationSpecialPages (T373442) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:27 urbanecm@deploy2002: Started scap sync-world: Backport for Enable CampaignEvents collaboration list by default (T375141), beta: Drop $wgCampaignEventsShowEventInvitationSpecialPages (T373442), prod: Drop $wgCampaignEventsShowEventInvitationSpecialPages (T373442)
- 13:16 moritzm: installing bash/zsh updates from bookworm point release
- 12:12 moritzm: upgrade irc.wikimedia.org to ircstream 0.13.0+wmf12u3 T376014
- 12:06 _joe_: uploaded conftool 4.0.0-1 to reprepro T376877
- 11:30 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Specify wiki ID to ::getId call in GlobalBlockingHandler (T378085) (duration: 07m 44s)
- 11:25 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 11:25 dreamyjazz@deploy2002: dreamyjazz: Backport for Specify wiki ID to ::getId call in GlobalBlockingHandler (T378085) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 11:23 dreamyjazz@deploy2002: Started scap sync-world: Backport for Specify wiki ID to ::getId call in GlobalBlockingHandler (T378085)
- 11:05 volans: updated spicerack to v8.15.1 on cumin1002
- 10:58 Dreamy_Jazz: Ran `DROP TABLE /*_*/globalblocks` on all beta wikis (excluding the centralauth DB) - T377742
- 10:51 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
- 10:50 elukey: elukey@puppetmaster1001:~$ sudo puppet cert destroy puppetboard.discovery.wmnet
- 10:46 cmooney@cumin1002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
- 10:46 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
- 10:39 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
- 10:36 dcausse: T378227: rebuilding dewiki_titlesuggest
- 10:35 moritzm: uploaded ircstream 0.13.0+wmf12u3 to apt.wikimedia.org (includes a fix which should hopefully reduce connection errors with bots using smart4irc)
- 10:34 cmooney@cumin1002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
- 10:34 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
- 10:34 cmooney@cumin1002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
- 10:29 cmooney@cumin1002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
- 10:28 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
- 10:12 volans: updated spicerack to v8.15.1 on cumin2002
- 09:21 cmooney@cumin1002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
- 09:11 hashar: Restarted CI Jenkins for plugin update - T378327
- 08:42 dcausse: T378227: deleting broken cirrus titlesuggest index dewiki_titlesuggest_1729824440
- 08:38 kostajh: UTC morning deploys done
- 08:38 kharlan@deploy2002: Finished scap sync-world: Backport for ContributionsPager: Fix getTemplateParams() parameter (T378132), Fix getTemplateParams() $classes parameter (T378132) (duration: 09m 38s)
- 08:33 kharlan@deploy2002: kharlan: Continuing with sync
- 08:31 kharlan@deploy2002: kharlan: Backport for ContributionsPager: Fix getTemplateParams() parameter (T378132), Fix getTemplateParams() $classes parameter (T378132) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:28 kharlan@deploy2002: Started scap sync-world: Backport for ContributionsPager: Fix getTemplateParams() parameter (T378132), Fix getTemplateParams() $classes parameter (T378132)
- 08:27 hashar: Pushed https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CheckUser/+/1083592 and https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1083591 for wmf/1.43.0-wmf.28 / T378132 due to a dependency loop
- 08:24 hashar: Pushed https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CheckUser/+/1083592 and https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CheckUser/+/1083592 for wmf/1.43.0-wmf.28 / T378132 due to a dependency loop
- 08:19 hashar: Changed UTC morning backport window from 00:00 SF to 09:00 CET (aka 08:00 UTC) | UTC morning backport window
- 08:07 kartik@deploy2002: Finished scap sync-world: Backport for Disable MT in Content Translation on Lithuanian Wikipedia (T364073) (duration: 22m 24s)
- 08:01 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1234.eqiad.wmnet with reason: maintenance T378267
- 08:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1234.eqiad.wmnet with reason: maintenance T378267
- 08:01 hashar: Restarted CI Jenkins to update the Collapsible Sections plugin | T378327
- 07:57 kartik@deploy2002: kartik: Continuing with sync
- 07:56 kartik@deploy2002: kartik: Backport for Disable MT in Content Translation on Lithuanian Wikipedia (T364073) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 07:45 kartik@deploy2002: Started scap sync-world: Backport for Disable MT in Content Translation on Lithuanian Wikipedia (T364073)
- 07:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[1169,1234].eqiad.wmnet with reason: maintenance
- 07:14 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db[1169,1234].eqiad.wmnet with reason: maintenance
- 06:07 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: replication broken T378320
- 06:06 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: replication broken T378320
- 06:03 taavi@cumin1002: dbctl commit (dc=all): 'depool db1169', diff saved to https://phabricator.wikimedia.org/P70590 and previous config saved to /var/cache/conftool/dbconfig/20241028-060327-taavi.json
2024-10-27
- 13:41 Dreamy_Jazz: Starting MediaModeration scanning on group1 wikis
- 13:37 Dreamy_Jazz: Starting MediaModeration scanning on group2 wikis
2024-10-26
- 16:29 mvernon@cumin1002: dbctl commit (dc=all): 'Depool db1234', diff saved to https://phabricator.wikimedia.org/P70589 and previous config saved to /var/cache/conftool/dbconfig/20241026-162946-mvernon.json
- 16:29 mvernon@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1234.eqiad.wmnet with reason: spontaneous reboot, depooling 'til Monday
- 16:28 mvernon@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1234.eqiad.wmnet with reason: spontaneous reboot, depooling 'til Monday
- 02:03 tzatziki: removing 9 files for legal compliance
2024-10-25
- 18:29 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2012.codfw.wmnet with OS bookworm
- 18:28 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 18:18 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 17:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2012.codfw.wmnet with reason: host reimage
- 17:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2012.codfw.wmnet with reason: host reimage
- 16:42 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host backup2012.codfw.wmnet with OS bookworm
- 16:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host backup2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:28 JustHannah: T378170 Ran mwscript-k8s extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=trwiki --logwiki=metawiki 'Peter.kerepesi' 'Peakbagger77' @ 11:57:19 UTC
- 15:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host backup2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host backup2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:56 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host backup2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:53 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host backup2012
- 15:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host backup2012
- 15:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding backup2012 to codfw - jhancock@cumin2002"
- 15:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding backup2012 to codfw - jhancock@cumin2002"
- 15:47 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 15:07 dancy@deploy2002: Installation of scap version "4.118.0" completed for 209 hosts
- 15:03 dancy@deploy2002: Installing scap version "4.118.0" for 209 hosts
- 14:31 herron: alert1002: manually killed stunnel4 process to clear puppet failure T375143
- 14:02 sukhe: running authdns-update for CR 1082548
- 10:31 arnaudb@cumin1002: dbctl commit (dc=all): 'maintenance', diff saved to https://phabricator.wikimedia.org/P70588 and previous config saved to /var/cache/conftool/dbconfig/20241025-103157-arnaudb.json
- 10:21 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 10:18 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 10:17 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 10:16 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 10:15 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 10:12 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2014.codfw.wmnet
- 09:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2012.codfw.wmnet
- 09:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2012.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:17 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2012.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:12 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:06 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2012.codfw.wmnet
- 09:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2011.codfw.wmnet
- 09:05 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2011.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2011.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:54 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
- 08:53 jelto@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
- 08:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:42 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2011.codfw.wmnet
- 08:27 moritzm: installing wireshark security updates
- 08:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2014.codfw.wmnet
- 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2004.codfw.wmnet to plain
- 08:22 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2004.codfw.wmnet to plain
- 08:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2014.codfw.wmnet
- 08:21 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2014.codfw.wmnet
- 08:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2004.codfw.wmnet to drbd
- 08:11 moritzm: imported openjdk-8 8u422-b05-1~deb12u1 to component/jdk for bookworm-wikimedia
- 08:04 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2004.codfw.wmnet to drbd
- 08:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2014.codfw.wmnet
- 08:02 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2014.codfw.wmnet
- 08:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2004.codfw.wmnet to plain
- 08:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2004.codfw.wmnet to plain
- 07:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2004.codfw.wmnet to drbd
- 07:42 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2004.codfw.wmnet to drbd
- 06:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2014.codfw.wmnet
- 06:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2014.codfw.wmnet
- 06:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast1003.wikimedia.org
- 06:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1003.wikimedia.org
- 06:27 jmm@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
- 06:19 jmm@cumin1002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
2024-10-24
- 23:09 tzatziki: removing 3 files for legal compliance
- 22:27 zabe@deploy2002: Finished scap sync-world: Backport for s8: Reduce revision-slots cache expiry to 60 seconds (T183490) (duration: 07m 03s)
- 22:23 zabe@deploy2002: zabe: Continuing with sync
- 22:23 zabe@deploy2002: zabe: Backport for s8: Reduce revision-slots cache expiry to 60 seconds (T183490) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 22:20 zabe@deploy2002: Started scap sync-world: Backport for s8: Reduce revision-slots cache expiry to 60 seconds (T183490)
- 21:37 legoktm@deploy2002: Finished scap sync-world: Backport for Update interwiki cache (duration: 07m 51s)
- 21:32 legoktm@deploy2002: legoktm: Continuing with sync
- 21:31 legoktm@deploy2002: legoktm: Backport for Update interwiki cache synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:29 legoktm@deploy2002: Started scap sync-world: Backport for Update interwiki cache
- 21:25 tzatziki: removing 1 file for legal compliance
- 21:24 Dreamy_Jazz: Ran `foreachwiki emptyUserGroup.php checkuser-temporary-account-viewer` on the beta wikis.
- 21:14 thcipriani@deploy2002: Finished scap sync-world: Backport for Enable edit check on nlwiki (T377551) (duration: 09m 07s)
- 21:09 thcipriani@deploy2002: thcipriani, kemayo: Continuing with sync
- 21:07 thcipriani@deploy2002: thcipriani, kemayo: Backport for Enable edit check on nlwiki (T377551) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:05 thcipriani@deploy2002: Started scap sync-world: Backport for Enable edit check on nlwiki (T377551)
- 21:02 thcipriani@deploy2002: Finished scap sync-world: Backport for chore: Move authevents logging into AuthManager (T341650 T375510 T375505), chore: AuthManager::autoCreateUser log authevents now (T341650 T375510 T375505) (duration: 18m 10s)
- 20:58 thcipriani@deploy2002: tgr, thcipriani: Continuing with sync
- 20:53 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Apply openjdk upgrade (11.0.25+9-1~deb11u1) - eevans@cumin1002
- 20:46 thcipriani@deploy2002: tgr, thcipriani: Backport for chore: Move authevents logging into AuthManager (T341650 T375510 T375505), chore: AuthManager::autoCreateUser log authevents now (T341650 T375510 T375505) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:44 thcipriani@deploy2002: Started scap sync-world: Backport for chore: Move authevents logging into AuthManager (T341650 T375510 T375505), chore: AuthManager::autoCreateUser log authevents now (T341650 T375510 T375505)
- 20:40 thcipriani@deploy2002: Finished scap sync-world: Backport for Configure settings for annwiki, nrwiki, mywikisource (T375102 T377160 T363270) (duration: 11m 09s)
- 20:35 thcipriani@deploy2002: thcipriani, pppery: Continuing with sync
- 20:31 thcipriani@deploy2002: thcipriani, pppery: Backport for Configure settings for annwiki, nrwiki, mywikisource (T375102 T377160 T363270) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:29 thcipriani@deploy2002: Started scap sync-world: Backport for Configure settings for annwiki, nrwiki, mywikisource (T375102 T377160 T363270)
- 20:24 thcipriani@deploy2002: Finished scap sync-world: Backport for Deploy missing.php redirects for Alemannic German (T376923) (duration: 14m 08s)
- 20:20 thcipriani@deploy2002: thcipriani, pppery: Continuing with sync
- 20:13 thcipriani@deploy2002: thcipriani, pppery: Backport for Deploy missing.php redirects for Alemannic German (T376923) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:10 thcipriani@deploy2002: Started scap sync-world: Backport for Deploy missing.php redirects for Alemannic German (T376923)
- 19:34 dancy@deploy2002: Finished scap sync-world: Backport for Use SpecialPage::getRobotPolicy to set robot policy (T378108) (duration: 07m 08s)
- 19:29 dancy@deploy2002: dancy: Continuing with sync
- 19:29 dancy@deploy2002: dancy: Backport for Use SpecialPage::getRobotPolicy to set robot policy (T378108) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 19:26 dancy@deploy2002: Started scap sync-world: Backport for Use SpecialPage::getRobotPolicy to set robot policy (T378108)
- 18:46 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Apply openjdk upgrade (11.0.25+9-1~deb11u1) - eevans@cumin1002
- 18:09 dancy@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.43.0-wmf.28 refs T375659
- 17:42 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 17:42 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 17:42 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 17:41 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 17:38 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 17:38 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 17:13 dancy@deploy2002: Finished scap sync-world: Backport for AbuseLogPager: Fix passing `false` as message parameter (T377917) (duration: 07m 18s)
- 17:09 dancy@deploy2002: dancy: Continuing with sync
- 17:09 dancy@deploy2002: dancy: Backport for AbuseLogPager: Fix passing `false` as message parameter (T377917) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 17:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2013.codfw.wmnet
- 17:06 dancy@deploy2002: Started scap sync-world: Backport for AbuseLogPager: Fix passing `false` as message parameter (T377917)
- 17:04 urbanecm: `mwscript-k8s -f extensions/Flow/maintenance/FlowMoveBoardsToSubpages.php -- --wiki=nowiki` (running as `mw-script.codfw.ui7285yu`; T376749)
- 16:56 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Apply openjdk upgrade (11.0.25+9-1~deb11u1) - eevans@cumin1002
- 16:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2088.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:45 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Fix encoding of usernames with non-ascii letters - oblivian@cumin1002"
- 16:44 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix encoding of usernames with non-ascii letters - oblivian@cumin1002
- 16:43 oblivian@cumin1002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix encoding of usernames with non-ascii letters - oblivian@cumin1002
- 16:43 oblivian@cumin1002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Fix encoding of usernames with non-ascii letters - oblivian@cumin1002"
- 16:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2087.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2088.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2087.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2087
- 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2088
- 16:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2088
- 16:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2087
- 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:12 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2087 to codfw - jhancock@cumin2002"
- 16:12 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2087 to codfw - jhancock@cumin2002"
- 16:06 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 16:05 dancy@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.43.0-wmf.28 refs T375659
- 16:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2086.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:51 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2016.codfw.wmnet
- 15:51 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2016.codfw.wmnet
- 15:50 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes2016.codfw.wmnet
- 15:48 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes2016.codfw.wmnet
- 15:47 xcollazo@deploy2002: Finished deploy [airflow-dags/analytics@325d943]: Deploy latest DAGs to analytics Airflow instance. T377999. (duration: 01m 07s)
- 15:46 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2016.codfw.wmnet
- 15:46 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2016.codfw.wmnet
- 15:46 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2015.codfw.wmnet
- 15:46 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2015.codfw.wmnet
- 15:45 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes2015.codfw.wmnet
- 15:45 xcollazo@deploy2002: Started deploy [airflow-dags/analytics@325d943]: Deploy latest DAGs to analytics Airflow instance. T377999.
- 15:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2086.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:43 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes2015.codfw.wmnet
- 15:42 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2086
- 15:42 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2086
- 15:41 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:41 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2086 to codfw - jhancock@cumin2002"
- 15:41 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2086 to codfw - jhancock@cumin2002"
- 15:41 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2015.codfw.wmnet
- 15:41 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2015.codfw.wmnet
- 15:41 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2006.codfw.wmnet
- 15:40 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2006.codfw.wmnet
- 15:40 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes2006.codfw.wmnet
- 15:38 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes2006.codfw.wmnet
- 15:37 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 15:37 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2006.codfw.wmnet
- 15:37 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2006.codfw.wmnet
- 15:36 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2005.codfw.wmnet
- 15:36 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2005.codfw.wmnet
- 15:35 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes2005.codfw.wmnet
- 15:34 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes1016.eqiad.wmnet
- 15:34 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes1016.eqiad.wmnet
- 15:33 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes1016.eqiad.wmnet
- 15:32 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes2005.codfw.wmnet
- 15:31 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes1016.eqiad.wmnet
- 15:30 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2005.codfw.wmnet
- 15:30 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2005.codfw.wmnet
- 15:30 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes1016.eqiad.wmnet
- 15:29 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2085.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:29 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes1016.eqiad.wmnet
- 15:29 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes1015.eqiad.wmnet
- 15:29 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes1015.eqiad.wmnet
- 15:28 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes1015.eqiad.wmnet
- 15:26 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes1015.eqiad.wmnet
- 15:25 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes1015.eqiad.wmnet
- 15:24 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes1015.eqiad.wmnet
- 15:24 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes1006.eqiad.wmnet
- 15:23 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes1006.eqiad.wmnet
- 15:23 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes1006.eqiad.wmnet
- 15:21 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes1006.eqiad.wmnet
- 15:19 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes1006.eqiad.wmnet
- 15:18 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes1006.eqiad.wmnet
- 15:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2085.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:16 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes1005.eqiad.wmnet
- 15:16 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes1005.eqiad.wmnet
- 15:16 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes1005.eqiad.wmnet
- 15:15 ihurbain@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 15:15 ihurbain@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 15:13 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes1005.eqiad.wmnet
- 15:13 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes1005.eqiad.wmnet
- 15:13 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes1005.eqiad.wmnet
- 15:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2085.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:09 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes1005.eqiad.wmnet
- 15:08 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes1005.eqiad.wmnet
- 15:08 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes1005.eqiad.wmnet
- 15:08 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2085.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:04 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes1005.eqiad.wmnet
- 15:03 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes1005.eqiad.wmnet
- 15:02 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes1005.eqiad.wmnet
- 14:54 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1286,1288-1289].eqiad.wmnet
- 14:53 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1286,1288-1289].eqiad.wmnet
- 14:50 ihurbain@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 14:48 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Apply openjdk upgrade (11.0.25+9-1~deb11u1) - eevans@cumin1002
- 14:42 hashar: Restarting CI Jenkins
- 14:33 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2085
- 14:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2085
- 14:32 gmodena@deploy2002: Finished deploy [analytics/refinery@413e5d9] (hadoop-test): 2024-10-24 refinery hotfix deployment TEST [analytics/refinery@413e5d91] (duration: 04m 03s)
- 14:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2085 to codfw - jhancock@cumin2002"
- 14:28 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2085 to codfw - jhancock@cumin2002"
- 14:27 gmodena@deploy2002: Started deploy [analytics/refinery@413e5d9] (hadoop-test): 2024-10-24 refinery hotfix deployment TEST [analytics/refinery@413e5d91]
- 14:24 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 14:24 gmodena@deploy2002: Finished deploy [analytics/refinery@413e5d9] (thin): 2024-10-24 refinery hotfix deployment THIN [analytics/refinery@413e5d91] (duration: 04m 59s)
- 14:22 urbanecm@deploy2002: Finished scap sync-world: Backport for Add maintenance script to move all flow boards on a wiki to a subpage (T371738), Add maintenance script to move all flow boards on a wiki to a subpage (T371738) (duration: 07m 28s)
- 14:22 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2083.codfw.wmnet with OS bullseye
- 14:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2013.codfw.wmnet
- 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2013.codfw.wmnet
- 14:19 gmodena@deploy2002: Started deploy [analytics/refinery@413e5d9] (thin): 2024-10-24 refinery hotfix deployment THIN [analytics/refinery@413e5d91]
- 14:18 sukhe: running authdns-update for CR 1042919
- 14:16 gmodena@deploy2002: Finished deploy [analytics/refinery@413e5d9]: 2024-10-24 refinery hotfix deployment [analytics/refinery@413e5d91] (duration: 07m 48s)
- 14:15 urbanecm@deploy2002: Started scap sync-world: Backport for Add maintenance script to move all flow boards on a wiki to a subpage (T371738), Add maintenance script to move all flow boards on a wiki to a subpage (T371738)
- 14:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2013.codfw.wmnet
- 14:08 gmodena@deploy2002: Started deploy [analytics/refinery@413e5d9]: 2024-10-24 refinery hotfix deployment [analytics/refinery@413e5d91]
- 14:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
- 14:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
- 14:00 mvernon@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be1066.eqiad.wmnet
- 13:59 mvernon@cumin1002: START - Cookbook sre.hosts.remove-downtime for ms-be1066.eqiad.wmnet
- 13:57 Emperor: restarting swift after vacuum on ms-be1066 T377827
- 13:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
- 13:53 mvernon@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1066.eqiad.wmnet with reason: vacuum an overlarge container db
- 13:52 mvernon@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on ms-be1066.eqiad.wmnet with reason: vacuum an overlarge container db
- 13:49 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1286.eqiad.wmnet with OS bookworm
- 13:47 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1288.eqiad.wmnet with OS bookworm
- 13:45 oblivian@cumin2002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool mw-web-ro in codfw: maintenance
- 13:43 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1289.eqiad.wmnet with OS bookworm
- 13:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
- 13:40 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1285.eqiad.wmnet with OS bookworm
- 13:40 oblivian@cumin2002: START - Cookbook sre.discovery.service-route pool mw-web-ro in codfw: maintenance
- 13:29 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1286.eqiad.wmnet with reason: host reimage
- 13:26 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1288.eqiad.wmnet with reason: host reimage
- 13:23 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1289.eqiad.wmnet with reason: host reimage
- 13:23 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 13:22 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 13:20 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1285.eqiad.wmnet with reason: host reimage
- 13:19 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1288.eqiad.wmnet with reason: host reimage
- 13:18 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1289.eqiad.wmnet with reason: host reimage
- 13:18 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1286.eqiad.wmnet with reason: host reimage
- 13:16 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1285.eqiad.wmnet with reason: host reimage
- 13:15 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 13:14 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 13:07 elukey@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2083.codfw.wmnet with OS bullseye
- 12:59 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1288.eqiad.wmnet with OS bookworm
- 12:59 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1289.eqiad.wmnet with OS bookworm
- 12:58 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1286.eqiad.wmnet with OS bookworm
- 12:57 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1285.eqiad.wmnet with OS bookworm
- 12:55 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-main-eqiad cluster: Roll restart of jvm daemons.
- 12:46 mvernon@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
- 12:45 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-main-eqiad cluster: Roll restart of jvm daemons.
- 12:40 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1289.eqiad.wmnet with OS bookworm
- 12:38 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1286.eqiad.wmnet with OS bookworm
- 12:38 mvernon@cumin1002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
- 12:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
- 12:37 mvernon@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1002.eqiad.wmnet
- 12:34 moritzm: bump qemu migration speed to 1000 for esams, ulsfo, eqsin, drmrs, magru Ganeti clusters
- 12:33 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1288.eqiad.wmnet with OS bookworm
- 12:30 mvernon@cumin1002: START - Cookbook sre.hosts.reboot-single for host moss-be1002.eqiad.wmnet
- 12:29 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1285.eqiad.wmnet with OS bookworm
- 12:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
- 12:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2002.codfw.wmnet
- 12:22 mvernon@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1001.eqiad.wmnet
- 12:21 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1289.eqiad.wmnet with reason: host reimage
- 12:21 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2002.codfw.wmnet
- 12:21 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2001.codfw.wmnet
- 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
- 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet
- 12:17 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1286.eqiad.wmnet with reason: host reimage
- 12:15 mvernon@cumin1002: START - Cookbook sre.hosts.reboot-single for host moss-be1001.eqiad.wmnet
- 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet
- 12:14 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1288.eqiad.wmnet with reason: host reimage
- 12:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2001.codfw.wmnet
- 12:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
- 12:10 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1285.eqiad.wmnet with reason: host reimage
- 12:08 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1288.eqiad.wmnet with reason: host reimage
- 12:08 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1289.eqiad.wmnet with reason: host reimage
- 12:07 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1286.eqiad.wmnet with reason: host reimage
- 12:07 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1285.eqiad.wmnet with reason: host reimage
- 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
- 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
- 11:48 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1289.eqiad.wmnet with OS bookworm
- 11:48 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1288.eqiad.wmnet with OS bookworm
- 11:48 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1286.eqiad.wmnet with OS bookworm
- 11:47 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1285.eqiad.wmnet with OS bookworm
- 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
- 11:23 oblivian@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool mw-web-ro in codfw: maintenance
- 11:18 oblivian@cumin1002: START - Cookbook sre.discovery.service-route depool mw-web-ro in codfw: maintenance
- 11:14 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
- 11:07 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
- 11:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
- 10:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
- 10:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 10:43 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2083.codfw.wmnet with OS bookworm
- 10:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
- 10:30 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host an-redacteddb1001.eqiad.wmnet
- 10:27 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1285.eqiad.wmnet with OS bookworm
- 10:26 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1017.eqiad.wmnet with reason: stopped being the active one, stopping replication
- 10:26 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on pc1017.eqiad.wmnet with reason: stopped being the active one, stopping replication
- 10:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 10:22 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1288.eqiad.wmnet with OS bookworm
- 10:22 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
- 10:21 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 10:21 Emperor: reboot apus frontends T376800
- 10:19 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1289.eqiad.wmnet with OS bookworm
- 10:18 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-redacteddb1001.eqiad.wmnet
- 10:17 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1286.eqiad.wmnet with OS bookworm
- 10:11 jynus@cumin1002: dbctl commit (dc=all): 'promoting pc1014 as the master of pc5 T378068', diff saved to https://phabricator.wikimedia.org/P70584 and previous config saved to /var/cache/conftool/dbconfig/20241024-101150-jynus.json
- 10:08 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1285.eqiad.wmnet with reason: host reimage
- 10:03 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1288.eqiad.wmnet with reason: host reimage
- 10:03 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1014.eqiad.wmnet with reason: moved pc number
- 10:03 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on pc1014.eqiad.wmnet with reason: moved pc number
- 10:00 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1289.eqiad.wmnet with reason: host reimage
- 09:59 jynus: restart pc1014 T378068
- 09:57 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1286.eqiad.wmnet with reason: host reimage
- 09:57 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1289.eqiad.wmnet with reason: host reimage
- 09:55 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1288.eqiad.wmnet with reason: host reimage
- 09:54 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1285.eqiad.wmnet with reason: host reimage
- 09:54 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1286.eqiad.wmnet with reason: host reimage
- 09:37 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1289.eqiad.wmnet with OS bookworm
- 09:35 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1288.eqiad.wmnet with OS bookworm
- 09:35 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1286.eqiad.wmnet with OS bookworm
- 09:34 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1285.eqiad.wmnet with OS bookworm
- 09:28 elukey@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2083.codfw.wmnet with OS bookworm
- 09:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
- 09:25 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1285-1286,1288-1289].eqiad.wmnet
- 09:23 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
- 09:22 jelto@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
- 09:22 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1285-1286,1288-1289].eqiad.wmnet
- 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
- 09:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
- 09:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
- 09:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on pc[1014,1017].eqiad.wmnet with reason: pc maintenance T378068
- 09:12 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on pc[1014,1017].eqiad.wmnet with reason: pc maintenance T378068
- 08:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P70582 and previous config saved to /var/cache/conftool/dbconfig/20241024-083027-arnaudb.json
- 08:30 kevinbazira@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
- 08:27 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 08:23 moritzm: installing bash/zsh updates from bookworm point release
- 08:23 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
- 08:22 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
- 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
- 08:17 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
- 08:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P70581 and previous config saved to /var/cache/conftool/dbconfig/20241024-081520-arnaudb.json
- 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
- 08:13 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
- 08:13 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
- 08:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
- 08:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
- 08:01 moritzm: installing libssh2 security updates
- 08:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
- 08:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P70580 and previous config saved to /var/cache/conftool/dbconfig/20241024-080013-arnaudb.json
- 08:00 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 07:59 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 07:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
- 07:56 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 07:56 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 07:55 elukey: restart ircstream on irc.wikimedia.org to remove a performance experiment
- 07:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P70579 and previous config saved to /var/cache/conftool/dbconfig/20241024-074506-arnaudb.json
- 07:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.sanitize-pii (exit_code=0) Checking PII for wikis annwiki in section s5
- 07:33 arnaudb@cumin1002: START - Cookbook sre.mysql.sanitize-pii Checking PII for wikis annwiki in section s5
- 07:32 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.sanitize-pii (exit_code=0) Setting up permissions and view database PII for wikis annwiki in section s5
- 07:32 arnaudb@cumin1002: START - Cookbook sre.mysql.sanitize-pii Setting up permissions and view database PII for wikis annwiki in section s5
- 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti2039.codfw.wmnet
- 07:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
- 06:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P70578 and previous config saved to /var/cache/conftool/dbconfig/20241024-064440-arnaudb.json
- 06:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 06:44 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 06:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P70577 and previous config saved to /var/cache/conftool/dbconfig/20241024-064418-arnaudb.json
- 06:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P70576 and previous config saved to /var/cache/conftool/dbconfig/20241024-062910-arnaudb.json
- 06:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P70575 and previous config saved to /var/cache/conftool/dbconfig/20241024-061403-arnaudb.json
- 05:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P70574 and previous config saved to /var/cache/conftool/dbconfig/20241024-055856-arnaudb.json
- 04:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P70573 and previous config saved to /var/cache/conftool/dbconfig/20241024-045830-arnaudb.json
- 04:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 04:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 04:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 04:57 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 03:57 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 03:57 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 03:57 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 03:56 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 03:56 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 03:55 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 03:55 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 03:55 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 03:54 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 03:54 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 03:54 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 03:53 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 02:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2083.codfw.wmnet with OS bullseye
- 01:01 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2083.codfw.wmnet with OS bullseye
- 00:46 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 00:44 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on gerrit2003.wikimedia.org with reason: in setup and T338470
- 00:44 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on gerrit2003.wikimedia.org with reason: in setup and T338470
- 00:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: reboot
- 00:26 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 0:10:00 on gerrit2003.wikimedia.org with reason: reboot
- 00:26 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 00:23 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 00:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 00:22 mutante: gerrit2003 rebooting for T338470
- 00:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 00:14 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 00:05 dzahn@cumin2002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: security release 20241023
2024-10-23
- 23:47 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-eqiad: Apply openjdk upgrade (11.0.25+9-1~deb11u1) - eevans@cumin1002
- 23:46 reedy@deploy2002: Finished scap sync-world: T378006 (duration: 07m 09s)
- 23:39 reedy@deploy2002: Started scap sync-world: T378006
- 22:21 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-eqiad: Apply openjdk upgrade (11.0.25+9-1~deb11u1) - eevans@cumin1002
- 22:08 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-codfw: Apply openjdk upgrade (11.0.25+9-1~deb11u1) - eevans@cumin1002
- 21:59 urbanecm@deploy2002: Finished scap sync-world: Backport for throttle: Add exemption for WikiArabia (T377957) (duration: 07m 06s)
- 21:52 urbanecm@deploy2002: Started scap sync-world: Backport for throttle: Add exemption for WikiArabia (T377957)
- 21:22 dzahn@cumin2002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: security release 20241023
- 21:16 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2083.codfw.wmnet with OS bullseye
- away: UTC late deploys done
- 21:05 tgr@deploy2002: Finished scap sync-world: Backport for SessionManager: Add more logging when unpersisting invalid sessions (T372702), Log unexpected central session lookup misses (T372702) (duration: 15m 07s)
- 21:02 dzahn@cumin2002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: security release 20241023
- 21:00 tgr@deploy2002: tgr: Continuing with sync
- 20:55 dzahn@cumin2002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: security release 20241023
- 20:53 dzahn@cumin2002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: security release 20241023
- 20:52 tgr@deploy2002: tgr: Backport for SessionManager: Add more logging when unpersisting invalid sessions (T372702), Log unexpected central session lookup misses (T372702) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:50 tgr@deploy2002: Started scap sync-world: Backport for SessionManager: Add more logging when unpersisting invalid sessions (T372702), Log unexpected central session lookup misses (T372702)
- 20:46 dzahn@cumin2002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: security release 20241023
- 20:41 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Apply openjdk upgrade (11.0.25+9-1~deb11u1) - eevans@cumin1002
- 20:40 eileen: civicrm upgraded from e787e5f2 to 1c6c4e08
- 20:06 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:05 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:02 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2083.codfw.wmnet with OS bullseye
- 19:56 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:55 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:46 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:46 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:35 eileen: civicrm upgraded from ce44ce45 to e787e5f2
- 19:18 dancy@deploy2002: Finished scap sync-world: Backport for Adjust return type documentation on SuggestedEdits (T378003) (duration: 13m 20s)
- 19:13 dancy@deploy2002: dancy: Continuing with sync
- 19:13 dancy@deploy2002: dancy: Backport for Adjust return type documentation on SuggestedEdits (T378003) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 19:09 sukhe: dummy authdns-update run
- 19:04 dancy@deploy2002: Started scap sync-world: Backport for Adjust return type documentation on SuggestedEdits (T378003)
- 18:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 18:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 18:46 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:28 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:26 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.28 refs T375659
- 18:09 sukhe: running agent on A:dnsbox
- 17:43 urbanecm@deploy2002: Finished scap sync-world: Backport for StructuredTaskMobileArticleTarget: Fix history hacks to avoid firing events (T377907) (duration: 11m 56s)
- 17:38 urbanecm@deploy2002: urbanecm: Continuing with sync
- 17:33 urbanecm@deploy2002: urbanecm: Backport for StructuredTaskMobileArticleTarget: Fix history hacks to avoid firing events (T377907) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 17:31 urbanecm@deploy2002: Started scap sync-world: Backport for StructuredTaskMobileArticleTarget: Fix history hacks to avoid firing events (T377907)
- 17:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 17:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 17:01 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 17:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 16:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 16:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 16:52 sukhe: restart ircecho on alerting hosts
- 16:35 sukhe: sudo cumin 'O:alerting_host or O:dnsbox' 'run-puppet-agent'
- 16:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 16:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 16:30 hnowlan@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool sessionstore in codfw: sessionstore mesh migration T363996
- 16:25 hnowlan@cumin1002: START - Cookbook sre.discovery.service-route pool sessionstore in codfw: sessionstore mesh migration T363996
- 16:22 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
- 16:22 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/sessionstore: apply
- 16:21 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
- 16:20 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/sessionstore: apply
- 16:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 16:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 16:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 16:15 hnowlan@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool sessionstore in codfw: sessionstore mesh migration T363996
- 16:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 16:09 hnowlan@cumin1002: START - Cookbook sre.discovery.service-route depool sessionstore in codfw: sessionstore mesh migration T363996
- 15:57 btullis@deploy2002: Finished deploy [airflow-dags/analytics_product@ba61f77]: T351388 (duration: 01m 15s)
- 15:56 btullis@deploy2002: Started deploy [airflow-dags/analytics_product@ba61f77]: T351388
- 15:55 btullis@deploy2002: Finished deploy [airflow-dags/platform_eng@ba61f77]: T351388 (duration: 00m 31s)
- 15:55 btullis@deploy2002: Started deploy [airflow-dags/platform_eng@ba61f77]: T351388
- 15:55 btullis@deploy2002: Finished deploy [airflow-dags/research@ba61f77]: T351388 (duration: 00m 45s)
- 15:54 btullis@deploy2002: Started deploy [airflow-dags/research@ba61f77]: T351388
- 15:53 btullis@deploy2002: Finished deploy [airflow-dags/search@ba61f77]: T351388 (duration: 00m 29s)
- 15:53 btullis@deploy2002: Started deploy [airflow-dags/search@ba61f77]: T351388
- 15:52 btullis@deploy2002: Finished deploy [airflow-dags/analytics@ba61f77]: T351388 (duration: 01m 08s)
- 15:51 btullis@deploy2002: Started deploy [airflow-dags/analytics@ba61f77]: T351388
- 15:51 btullis@deploy2002: Finished deploy [airflow-dags/analytics_test@ba61f77]: T351388 (duration: 00m 31s)
- 15:51 btullis@deploy2002: Started deploy [airflow-dags/analytics_test@ba61f77]: T351388
- 15:42 dduvall@deploy2002: Finished deploy [releng/jenkins-deploy@e1c56d1] (releasing): Deploying https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/95 (duration: 00m 53s)
- 15:42 dduvall@deploy2002: Started deploy [releng/jenkins-deploy@e1c56d1] (releasing): Deploying https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/95
- 15:35 hashar: Restarted CI Jenkins
- 15:28 moritzm: uploaded openjdk-8 8u422-b05-1~deb12u0 for component/jdk for bookworm-wikimedia (bootstrap build since openjdk-8 needs openjdk-8 to build)
- 15:20 dduvall@deploy2002: Finished deploy [releng/jenkins-deploy@d8e345f] (releasing): Deploying https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/94 (duration: 01m 05s)
- 15:19 dduvall@deploy2002: Started deploy [releng/jenkins-deploy@d8e345f] (releasing): Deploying https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/94
- 15:17 Lucas_WMDE: UTC afternoon backport+config window done
- 15:15 logmsgbot: lucaswerkmeister-wmde Deployed security patch for T377912
- 15:10 volans: uploaded spicerack_8.15.1 to apt.wikimedia.org bullseye-wikimedia
- 15:04 stran@deploy2002: Finished scap sync-world: Backport for Support template overrides in ContributionsPager (T356292), Add source wiki to contributions on Special:GlobalContributions (T356292) (duration: 10m 53s)
- 14:59 stran@deploy2002: stran: Continuing with sync
- 14:55 stran@deploy2002: stran: Backport for Support template overrides in ContributionsPager (T356292), Add source wiki to contributions on Special:GlobalContributions (T356292) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:53 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on rdb1014.eqiad.wmnet with reason: Hardware issue
- 14:53 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on rdb1014.eqiad.wmnet with reason: Hardware issue
- 14:53 stran@deploy2002: Started scap sync-world: Backport for Support template overrides in ContributionsPager (T356292), Add source wiki to contributions on Special:GlobalContributions (T356292)
- 14:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:43 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Enable CampaignEvents collaboration list in testwiki and test2wiki (v2) (T376055) (duration: 17m 47s)
- 14:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:39 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, daimona: Continuing with sync
- 14:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:28 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, daimona: Backport for Enable CampaignEvents collaboration list in testwiki and test2wiki (v2) (T376055) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:25 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Enable CampaignEvents collaboration list in testwiki and test2wiki (v2) (T376055)
- 14:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:21 tgr@deploy2002: Finished scap sync-world: Backport for Auth: pass accountType to authevents log stream (T341650 T375510 T375505), Auth: pass accountType to authevents log stream (T341650 T375510 T375505) (duration: 13m 23s)
- 14:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:18 sukhe: sudo cumin 'O:alerting_host' 'run-puppet-agent'
- 14:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:16 tgr@deploy2002: tgr: Continuing with sync
- 14:14 sukhe: sudo cumin 'A:dnsbox' 'run-puppet-agent'
- 14:10 tgr@deploy2002: tgr: Backport for Auth: pass accountType to authevents log stream (T341650 T375510 T375505), Auth: pass accountType to authevents log stream (T341650 T375510 T375505) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:07 tgr@deploy2002: Started scap sync-world: Backport for Auth: pass accountType to authevents log stream (T341650 T375510 T375505), Auth: pass accountType to authevents log stream (T341650 T375510 T375505)
- 13:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
- 13:54 sukhe@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs and not A:ulsfo and A:lvs
- 13:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
- 13:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 13:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 13:37 moritzm: installing gdk-pixbuf security updates
- 13:34 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
- 13:34 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
- 13:34 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
- 13:33 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop: apply
- 13:33 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
- 13:33 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
- 13:32 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
- 13:31 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for WikiProjectIDLookup: use SparqlClient and make endpoint configurable (T377746) (duration: 07m 15s)
- 13:27 lucaswerkmeister-wmde@deploy2002: daimona, lucaswerkmeister-wmde: Continuing with sync
- 13:27 lucaswerkmeister-wmde@deploy2002: daimona, lucaswerkmeister-wmde: Backport for WikiProjectIDLookup: use SparqlClient and make endpoint configurable (T377746) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:26 sukhe@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs and not A:ulsfo and A:lvs
- 13:24 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for WikiProjectIDLookup: use SparqlClient and make endpoint configurable (T377746)
- 13:18 sukhe@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs and A:ulsfo and A:lvs
- 13:15 sukhe@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs and A:ulsfo and A:lvs
- 13:13 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 13:13 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 13:12 moritzm: installing qemu security updates
- 13:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 13:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 13:09 sukhe: running agent on A:lvs to roll out CR 1082238
- 13:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2040.codfw.wmnet to cluster codfw and group C
- 12:53 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2040.codfw.wmnet to cluster codfw and group C
- 12:26 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2039.codfw.wmnet to cluster codfw and group C
- 12:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2039.codfw.wmnet to cluster codfw and group C
- 12:16 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2039.codfw.wmnet to cluster codfw and group C
- 12:16 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2039.codfw.wmnet to cluster codfw and group C
- 11:49 dreamyjazz@deploy2002: Finished scap sync-world: Backport for recentchanges: Use current time for imported revision category changes (T377932) (duration: 07m 26s)
- 11:44 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 11:44 dreamyjazz@deploy2002: dreamyjazz: Backport for recentchanges: Use current time for imported revision category changes (T377932) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 11:41 dreamyjazz@deploy2002: Started scap sync-world: Backport for recentchanges: Use current time for imported revision category changes (T377932)
- 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2012.codfw.wmnet
- 11:11 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
- 11:11 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
- 11:09 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
- 11:09 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/zotero: apply
- 11:05 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
- 11:05 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
- 10:53 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
- 10:51 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
- 10:45 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
- 10:45 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop: apply
- 10:43 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
- 10:43 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
- 10:14 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
- 10:13 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
- 10:13 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
- 10:13 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
- 10:12 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
- 10:12 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
- 10:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
- 10:05 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
- 10:03 Dreamy_Jazz: Restarted MediaModeration scanning script for commonswiki - https://wikitech.wikimedia.org/wiki/MediaModeration
- 09:59 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
- 09:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
- 09:42 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
- 09:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
- 09:34 volans@cumin1002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1185 gradually with 4 steps - Testing new cookbook
- 09:31 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
- 09:30 volans@cumin1002: START - Cookbook sre.mysql.pool db1185 gradually with 4 steps - Testing new cookbook
- 09:29 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
- 09:29 volans@cumin1002: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1185 - Testing new cookbook
- 09:29 volans@cumin1002: START - Cookbook sre.mysql.depool db1185 - Testing new cookbook
- 09:24 volans@cumin1002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1185 gradually with 4 steps - Testing new cookbook
- 09:24 volans@cumin1002: START - Cookbook sre.mysql.pool db1185 gradually with 4 steps - Testing new cookbook
- 09:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab-runner2004.codfw.wmnet with OS bullseye
- 09:02 Tran: UTC morning deploys done
- 08:48 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:48 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab-runner2004.codfw.wmnet with reason: host reimage
- 08:29 moritzm: installing Java 11 security updates
- 08:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab-runner2004.codfw.wmnet with reason: host reimage
- 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
- 08:26 jmm@cumin2002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: new JDK - jmm@cumin2002
- 08:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
- 08:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
- 08:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
- 08:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
- 08:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host gitlab-runner2004
- 08:12 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host gitlab-runner2004
- 08:12 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host gitlab-runner2004
- 08:12 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) gitlab-runner2004.codfw.wmnet 71.48.192.10.in-addr.arpa 1.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 08:12 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache gitlab-runner2004.codfw.wmnet 71.48.192.10.in-addr.arpa 1.7.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 08:12 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:11 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner2004 - jelto@cumin1002"
- 08:11 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner2004 - jelto@cumin1002"
- 08:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
- 08:08 jelto@cumin1002: START - Cookbook sre.dns.netbox
- 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
- 08:07 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host gitlab-runner2004
- 08:07 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host gitlab-runner2004.codfw.wmnet with OS bullseye
- 08:06 jmm@cumin2002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: new JDK - jmm@cumin2002
- 08:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
- 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: reboot
- 07:56 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: reboot
- 07:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab-runner2003.codfw.wmnet with OS bullseye
- 07:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab-runner2003.codfw.wmnet with reason: host reimage
- 07:33 moritzm: installing perf updates on bookworm nodes
- 07:32 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab-runner2003.codfw.wmnet with reason: host reimage
- 07:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2012.codfw.wmnet
- 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ml-etcd2002.codfw.wmnet to plain
- 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ml-etcd2002.codfw.wmnet to plain
- 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2012.codfw.wmnet
- 07:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2012.codfw.wmnet
- 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ml-etcd2002.codfw.wmnet to drbd
- 07:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host gitlab-runner2003
- 07:15 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host gitlab-runner2003
- 07:15 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host gitlab-runner2003
- 07:15 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) gitlab-runner2003.codfw.wmnet 93.32.192.10.in-addr.arpa 3.9.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 07:15 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache gitlab-runner2003.codfw.wmnet 93.32.192.10.in-addr.arpa 3.9.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 07:15 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:15 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner2003 - jelto@cumin1002"
- 07:15 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner2003 - jelto@cumin1002"
- 07:12 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ml-etcd2002.codfw.wmnet to drbd
- 07:11 jelto@cumin1002: START - Cookbook sre.dns.netbox
- 07:11 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host gitlab-runner2003
- 07:10 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host gitlab-runner2003.codfw.wmnet with OS bullseye
- 06:48 kart_: Updated cxserver to 2024-10-23-055433-production
- 06:47 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 06:47 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 06:45 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 06:44 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 06:44 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 06:44 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 06:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2012.codfw.wmnet
- 06:35 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2012.codfw.wmnet
- 05:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
- 05:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
- 04:18 eileen: civicrm upgraded from de642bea to ce44ce45
- 00:01 ejegg: fundraising civicrm upgraded from 5463f37b to de642bea
2024-10-22
- 23:32 ejegg: fundraising civicrm upgraded from d9e85c3d to 5463f37b
- 22:59 ejegg: fundraising civicrm upgraded from 36660cb3 to d9e85c3d
- 22:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P70562 and previous config saved to /var/cache/conftool/dbconfig/20241022-223858-ladsgroup.json
- 22:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P70561 and previous config saved to /var/cache/conftool/dbconfig/20241022-222352-ladsgroup.json
- 22:11 zabe@deploy2002: Finished scap sync-world: Backport for s1: Reduce revision-slots cache expiry to 60 seconds (T183490) (duration: 07m 17s)
- 22:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P70560 and previous config saved to /var/cache/conftool/dbconfig/20241022-220847-ladsgroup.json
- 22:07 zabe@deploy2002: zabe: Continuing with sync
- 22:06 zabe@deploy2002: zabe: Backport for s1: Reduce revision-slots cache expiry to 60 seconds (T183490) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 22:03 zabe@deploy2002: Started scap sync-world: Backport for s1: Reduce revision-slots cache expiry to 60 seconds (T183490)
- 21:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T367856)', diff saved to https://phabricator.wikimedia.org/P70559 and previous config saved to /var/cache/conftool/dbconfig/20241022-215137-ladsgroup.json
- 21:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
- 21:44 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
- 21:44 dancy@deploy2002: Installation of scap version "4.117.0" completed for 209 hosts
- 21:40 dancy@deploy2002: Installing scap version "4.117.0" for 209 hosts
- 21:01 dduvall@deploy2002: Finished deploy [releng/jenkins-deploy@b08d130] (releasing): Deploying changes to single-version MediaWiki image build (duration: 01m 44s)
- 21:00 dduvall@deploy2002: Started deploy [releng/jenkins-deploy@b08d130] (releasing): Deploying changes to single-version MediaWiki image build
- 20:33 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 20:32 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 20:28 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 20:27 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 20:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T376905)', diff saved to https://phabricator.wikimedia.org/P70558 and previous config saved to /var/cache/conftool/dbconfig/20241022-202717-ladsgroup.json
- 20:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P70557 and previous config saved to /var/cache/conftool/dbconfig/20241022-201210-ladsgroup.json
- 19:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P70556 and previous config saved to /var/cache/conftool/dbconfig/20241022-195703-ladsgroup.json
- 19:54 swfrench-wmf: running puppet on A:cp-text (-b11) after validating ATS Lua changes on cp4040 - T372605
- 19:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T376905)', diff saved to https://phabricator.wikimedia.org/P70555 and previous config saved to /var/cache/conftool/dbconfig/20241022-194156-ladsgroup.json
- 19:40 swfrench-wmf: disabling puppet on A:cp-text before merging ATS Lua changes - T372605
- 19:39 ladsgroup@deploy2002: Finished scap sync-world: Backport for Fix duplicated key in wgVectorNightMode (duration: 07m 51s)
- 19:36 ladsgroup@deploy2002: ladsgroup, ebrahim: Continuing with sync
- 19:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1235 (T376905)', diff saved to https://phabricator.wikimedia.org/P70554 and previous config saved to /var/cache/conftool/dbconfig/20241022-193352-ladsgroup.json
- 19:35 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
- 19:34 ladsgroup@deploy2002: ladsgroup, ebrahim: Backport for Fix duplicated key in wgVectorNightMode synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 19:34 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
- 19:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T376905)', diff saved to https://phabricator.wikimedia.org/P70553 and previous config saved to /var/cache/conftool/dbconfig/20241022-193327-ladsgroup.json
- 19:31 ladsgroup@deploy2002: Started scap sync-world: Backport for Fix duplicated key in wgVectorNightMode
- 19:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P70552 and previous config saved to /var/cache/conftool/dbconfig/20241022-191820-ladsgroup.json
- 19:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P70551 and previous config saved to /var/cache/conftool/dbconfig/20241022-190313-ladsgroup.json
- 19:00 dduvall@deploy2002: Installation of scap version "4.116.0" completed for 209 hosts
- 18:56 dduvall@deploy2002: Installing scap version "4.116.0" for 209 hosts
- 18:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P70550 and previous config saved to /var/cache/conftool/dbconfig/20241022-184946-arnaudb.json
- 18:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T376905)', diff saved to https://phabricator.wikimedia.org/P70549 and previous config saved to /var/cache/conftool/dbconfig/20241022-184806-ladsgroup.json
- 18:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T376905)', diff saved to https://phabricator.wikimedia.org/P70548 and previous config saved to /var/cache/conftool/dbconfig/20241022-183955-ladsgroup.json
- 18:39 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
- 18:39 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
- 18:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T376905)', diff saved to https://phabricator.wikimedia.org/P70547 and previous config saved to /var/cache/conftool/dbconfig/20241022-183930-ladsgroup.json
- 18:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P70546 and previous config saved to /var/cache/conftool/dbconfig/20241022-183440-arnaudb.json
- 18:26 dancy@deploy2002: sync-world aborted: Refreshing (duration: 01m 33s)
- 18:24 dancy@deploy2002: Started scap sync-world: Refreshing
- 18:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P70544 and previous config saved to /var/cache/conftool/dbconfig/20241022-182423-ladsgroup.json
- 18:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P70543 and previous config saved to /var/cache/conftool/dbconfig/20241022-181933-arnaudb.json
- 18:17 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.28 refs T375659
- 18:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P70542 and previous config saved to /var/cache/conftool/dbconfig/20241022-180916-ladsgroup.json
- 18:09 dancy@deploy2002: Finished scap sync-world: Backport for Prevent blocked users from being able to review/unreview articles (T366991) (duration: 07m 26s)
- 18:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P70541 and previous config saved to /var/cache/conftool/dbconfig/20241022-180426-arnaudb.json
- 18:04 dancy@deploy2002: dancy, sbassett: Continuing with sync
- 18:04 dancy@deploy2002: dancy, sbassett: Backport for Prevent blocked users from being able to review/unreview articles (T366991) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 18:01 dancy@deploy2002: Started scap sync-world: Backport for Prevent blocked users from being able to review/unreview articles (T366991)
- 17:54 sukhe: sudo cumin -b4 "A:cp-upload" 'run-puppet-agent --enable "merging CR 1078994"': T375761
- 17:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T376905)', diff saved to https://phabricator.wikimedia.org/P70540 and previous config saved to /var/cache/conftool/dbconfig/20241022-175409-ladsgroup.json
- 17:50 dduvall@deploy2002: Finished deploy [releng/jenkins-deploy@16eb792] (releasing): Deploying https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/90 (duration: 01m 21s)
- 17:49 dduvall@deploy2002: Started deploy [releng/jenkins-deploy@16eb792] (releasing): Deploying https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/90
- 17:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T376905)', diff saved to https://phabricator.wikimedia.org/P70539 and previous config saved to /var/cache/conftool/dbconfig/20241022-174555-ladsgroup.json
- 17:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
- 17:45 sukhe: sudo cumin "A:cp-upload" 'disable-puppet "merging CR 1078994"': T375761
- 17:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
- 17:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T376905)', diff saved to https://phabricator.wikimedia.org/P70538 and previous config saved to /var/cache/conftool/dbconfig/20241022-174530-ladsgroup.json
- 17:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P70537 and previous config saved to /var/cache/conftool/dbconfig/20241022-173022-ladsgroup.json
- 17:30 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2014.codfw.wmnet
- 17:23 cmooney@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
- 17:18 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2014.codfw.wmnet with reason: rebooting to test changes rolled out in CR 1006063
- 17:17 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on lvs2014.codfw.wmnet with reason: rebooting to test changes rolled out in CR 1006063
- 17:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P70536 and previous config saved to /var/cache/conftool/dbconfig/20241022-171515-ladsgroup.json
- 17:14 sukhe: re-enable Puppet on A:lvs [change merged on lvs2014]: T358260
- 17:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool sessionstore in eqiad: repooling sessionstore post mesh migration T363996
- 17:04 hnowlan@cumin1002: START - Cookbook sre.discovery.service-route pool sessionstore in eqiad: repooling sessionstore post mesh migration T363996
- 17:04 sukhe: disable Puppet on A:lvs to merge 1006063: T358260
- 17:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P70535 and previous config saved to /var/cache/conftool/dbconfig/20241022-170400-arnaudb.json
- 17:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 17:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 17:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T367781)', diff saved to https://phabricator.wikimedia.org/P70534 and previous config saved to /var/cache/conftool/dbconfig/20241022-170337-arnaudb.json
- 17:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T376905)', diff saved to https://phabricator.wikimedia.org/P70533 and previous config saved to /var/cache/conftool/dbconfig/20241022-170008-ladsgroup.json
- 16:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T376905)', diff saved to https://phabricator.wikimedia.org/P70532 and previous config saved to /var/cache/conftool/dbconfig/20241022-165211-ladsgroup.json
- 16:52 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1176.eqiad.wmnet
- 16:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 16:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 16:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T376905)', diff saved to https://phabricator.wikimedia.org/P70531 and previous config saved to /var/cache/conftool/dbconfig/20241022-165147-ladsgroup.json
- 16:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P70530 and previous config saved to /var/cache/conftool/dbconfig/20241022-164830-arnaudb.json
- 16:47 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
- 16:46 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
- 16:46 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2083.codfw.wmnet with OS bullseye
- 16:44 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
- 16:44 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
- 16:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P70529 and previous config saved to /var/cache/conftool/dbconfig/20241022-163639-ladsgroup.json
- 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P70528 and previous config saved to /var/cache/conftool/dbconfig/20241022-163323-arnaudb.json
- 16:31 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-worker1176.eqiad.wmnet
- 16:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P70527 and previous config saved to /var/cache/conftool/dbconfig/20241022-162132-ladsgroup.json
- 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T367781)', diff saved to https://phabricator.wikimedia.org/P70526 and previous config saved to /var/cache/conftool/dbconfig/20241022-161816-arnaudb.json
- 16:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T367781)', diff saved to https://phabricator.wikimedia.org/P70525 and previous config saved to /var/cache/conftool/dbconfig/20241022-161604-arnaudb.json
- 16:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 16:15 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 16:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P70524 and previous config saved to /var/cache/conftool/dbconfig/20241022-161552-arnaudb.json
- 16:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool sessionstore in eqiad: testing sessionstore mesh migration
- 16:08 hnowlan@cumin1002: START - Cookbook sre.discovery.service-route depool sessionstore in eqiad: testing sessionstore mesh migration
- 16:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T376905)', diff saved to https://phabricator.wikimedia.org/P70523 and previous config saved to /var/cache/conftool/dbconfig/20241022-160625-ladsgroup.json
- 16:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P70522 and previous config saved to /var/cache/conftool/dbconfig/20241022-160045-arnaudb.json
- 15:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5004.wikimedia.org
- 15:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T376905)', diff saved to https://phabricator.wikimedia.org/P70521 and previous config saved to /var/cache/conftool/dbconfig/20241022-155824-ladsgroup.json
- 15:58 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 15:58 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 15:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T376905)', diff saved to https://phabricator.wikimedia.org/P70520 and previous config saved to /var/cache/conftool/dbconfig/20241022-155759-ladsgroup.json
- 15:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2011.codfw.wmnet
- 15:54 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
- 15:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5004.wikimedia.org
- 15:53 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
- 15:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) check sessionstore: maintenance
- 15:53 hnowlan@cumin1002: START - Cookbook sre.discovery.service-route check sessionstore: maintenance
- 15:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
- 15:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
- 15:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P70519 and previous config saved to /var/cache/conftool/dbconfig/20241022-154538-arnaudb.json
- 15:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P70518 and previous config saved to /var/cache/conftool/dbconfig/20241022-154251-ladsgroup.json
- 15:39 sbassett@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 15:38 sbassett@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 15:38 sbassett@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 15:38 sbassett@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 15:38 sbassett@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 15:38 sbassett@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 15:37 sbassett@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 15:37 sbassett@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 15:36 sbassett@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 15:36 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
- 15:36 sbassett@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 15:35 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
- 15:32 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2083.codfw.wmnet with OS bullseye
- 15:31 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
- 15:30 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
- 15:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P70517 and previous config saved to /var/cache/conftool/dbconfig/20241022-153031-arnaudb.json
- 15:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P70516 and previous config saved to /var/cache/conftool/dbconfig/20241022-152743-ladsgroup.json
- 15:19 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 15:19 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 15:18 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 15:18 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 15:15 aqu: Deployed refinery using scap, then deployed onto hdfs
- 15:14 cgoubert@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) check for host kubestagemaster2003.codfw.wmnet
- 15:14 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node check for host kubestagemaster2003.codfw.wmnet
- 15:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T376905)', diff saved to https://phabricator.wikimedia.org/P70515 and previous config saved to /var/cache/conftool/dbconfig/20241022-151237-ladsgroup.json
- 15:11 gmodena@deploy2002: Finished deploy [airflow-dags/analytics@7c2d65f]: DPE 2024-10-22 deployment train (duration: 01m 16s)
- 15:10 gmodena@deploy2002: Started deploy [airflow-dags/analytics@7c2d65f]: DPE 2024-10-22 deployment train
- 15:09 brennen@deploy2002: Finished deploy [phabricator/deployment@582cde5]: deploy phab1004 for T377850 (duration: 01m 04s)
- 15:08 brennen@deploy2002: Started deploy [phabricator/deployment@582cde5]: deploy phab1004 for T377850
- 15:07 brennen@deploy2002: Finished deploy [phabricator/deployment@582cde5]: test deploy phab2002 for T377850 (may fail, expected) (duration: 00m 24s)
- 15:07 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:07 eoghan@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:30:00 on phab.wmfusercontent.org with reason: Phabricator deployment
- 15:07 eoghan@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab.wmfusercontent.org with reason: Phabricator deployment
- 15:07 brennen@deploy2002: Started deploy [phabricator/deployment@582cde5]: test deploy phab2002 for T377850 (may fail, expected)
- 15:06 eoghan@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:30:00 on phabricator.wikimedia.org with reason: Phabricator deployment
- 15:06 eoghan@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phabricator.wikimedia.org with reason: Phabricator deployment
- 15:06 eoghan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deployment
- 15:06 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:06 eoghan@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deployment
- 15:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T376905)', diff saved to https://phabricator.wikimedia.org/P70514 and previous config saved to /var/cache/conftool/dbconfig/20241022-150435-ladsgroup.json
- 15:04 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 15:04 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 15:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T376905)', diff saved to https://phabricator.wikimedia.org/P70513 and previous config saved to /var/cache/conftool/dbconfig/20241022-150409-ladsgroup.json
- 14:57 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 14:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 100%: T377718', diff saved to https://phabricator.wikimedia.org/P70512 and previous config saved to /var/cache/conftool/dbconfig/20241022-145653-arnaudb.json
- 14:53 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2084.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 14:52 hashar@deploy2002: Finished deploy [gerrit/gerrit@30691f2]: Update patch demo to recognize both legacy and new URLs - T374954 (duration: 00m 10s)
- 14:52 hashar@deploy2002: Started deploy [gerrit/gerrit@30691f2]: Update patch demo to recognize both legacy and new URLs - T374954
- 14:50 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
- 14:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P70511 and previous config saved to /var/cache/conftool/dbconfig/20241022-144902-ladsgroup.json
- 14:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 75%: T377718', diff saved to https://phabricator.wikimedia.org/P70510 and previous config saved to /var/cache/conftool/dbconfig/20241022-144148-arnaudb.json
- 14:40 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
- 14:40 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
- 14:37 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:37 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2084 to codfw - jhancock@cumin2002"
- 14:37 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2084 to codfw - jhancock@cumin2002"
- 14:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 100%: post clone', diff saved to https://phabricator.wikimedia.org/P70509 and previous config saved to /var/cache/conftool/dbconfig/20241022-143628-arnaudb.json
- 14:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
- 14:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
- 14:34 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
- 14:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P70507 and previous config saved to /var/cache/conftool/dbconfig/20241022-143355-ladsgroup.json
- 14:32 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Fix performer link on Special:GlobalBlockList (T377398) (duration: 07m 43s)
- 14:31 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 14:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P70506 and previous config saved to /var/cache/conftool/dbconfig/20241022-143005-arnaudb.json
- 14:30 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 14:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 14:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 14:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 14:27 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 14:27 dreamyjazz@deploy2002: dreamyjazz: Backport for Fix performer link on Special:GlobalBlockList (T377398) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 50%: T377718', diff saved to https://phabricator.wikimedia.org/P70505 and previous config saved to /var/cache/conftool/dbconfig/20241022-142642-arnaudb.json
- 14:24 dreamyjazz@deploy2002: Started scap sync-world: Backport for Fix performer link on Special:GlobalBlockList (T377398)
- 14:21 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 75%: post clone', diff saved to https://phabricator.wikimedia.org/P70504 and previous config saved to /var/cache/conftool/dbconfig/20241022-142123-arnaudb.json
- 14:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T376905)', diff saved to https://phabricator.wikimedia.org/P70503 and previous config saved to /var/cache/conftool/dbconfig/20241022-141848-ladsgroup.json
- 14:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 25%: T377718', diff saved to https://phabricator.wikimedia.org/P70502 and previous config saved to /var/cache/conftool/dbconfig/20241022-141137-arnaudb.json
- 14:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2011.codfw.wmnet
- 14:10 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2011.codfw.wmnet
- 14:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2011.codfw.wmnet
- 14:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T376905)', diff saved to https://phabricator.wikimedia.org/P70501 and previous config saved to /var/cache/conftool/dbconfig/20241022-140956-ladsgroup.json
- 14:09 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 14:09 ejegg: payments-wiki upgraded from 7ae3479f to a039cd50
- 14:09 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 14:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T376905)', diff saved to https://phabricator.wikimedia.org/P70500 and previous config saved to /var/cache/conftool/dbconfig/20241022-140931-ladsgroup.json
- 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2011.codfw.wmnet
- 14:06 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 50%: post clone', diff saved to https://phabricator.wikimedia.org/P70499 and previous config saved to /var/cache/conftool/dbconfig/20241022-140617-arnaudb.json
- 14:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2011.codfw.wmnet
- 13:59 moritzm: rebalance ganeti clusters in magru following reboots
- 13:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
- 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
- 13:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 10%: T377718', diff saved to https://phabricator.wikimedia.org/P70498 and previous config saved to /var/cache/conftool/dbconfig/20241022-135631-arnaudb.json
- 13:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P70497 and previous config saved to /var/cache/conftool/dbconfig/20241022-135424-ladsgroup.json
- 13:52 Lucas_WMDE: UTC afternoon backport+window done (a further GlobalBlocking fix will be backported out-of-window soon)
- 13:51 aqu@deploy2002: Finished deploy [analytics/refinery@ffc985a] (hadoop-test): Adding refinery/source 0.2.49.2 & 0.2.53 [analytics/refinery@ffc985a7] (duration: 03m 17s)
- 13:51 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 25%: post clone', diff saved to https://phabricator.wikimedia.org/P70496 and previous config saved to /var/cache/conftool/dbconfig/20241022-135112-arnaudb.json
- 13:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
- 13:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
- 13:48 aqu@deploy2002: Started deploy [analytics/refinery@ffc985a] (hadoop-test): Adding refinery/source 0.2.49.2 & 0.2.53 [analytics/refinery@ffc985a7]
- 13:48 aqu@deploy2002: Finished deploy [analytics/refinery@ffc985a] (thin): Adding refinery/source 0.2.49.2 & 0.2.53 [analytics/refinery@ffc985a7] (duration: 00m 07s)
- 13:48 aqu@deploy2002: Started deploy [analytics/refinery@ffc985a] (thin): Adding refinery/source 0.2.49.2 & 0.2.53 [analytics/refinery@ffc985a7]
- 13:47 aqu@deploy2002: Finished deploy [analytics/refinery@ffc985a] (thin): Adding refinery/source 0.2.49.2 & 0.2.53 [analytics/refinery@ffc985a7] (duration: 00m 57s)
- 13:47 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 13:46 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ml-serve2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 13:46 aqu@deploy2002: Started deploy [analytics/refinery@ffc985a] (thin): Adding refinery/source 0.2.49.2 & 0.2.53 [analytics/refinery@ffc985a7]
- 13:45 aqu@deploy2002: deploy aborted: Adding refinery/source 0.2.49.2 & 0.2.53 [analytics/refinery@ffc985a7] (duration: 03m 50s)
- 13:45 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 13:44 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Activate feature flag to default move wikibase sidebar link to other projects. (T66315) (duration: 08m 40s)
- 13:41 aqu@deploy2002: Started deploy [analytics/refinery@ffc985a] (thin): Adding refinery/source 0.2.49.2 & 0.2.53 [analytics/refinery@ffc985a7]
- 13:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 5%: T377718', diff saved to https://phabricator.wikimedia.org/P70495 and previous config saved to /var/cache/conftool/dbconfig/20241022-134126-arnaudb.json
- 13:40 lucaswerkmeister-wmde@deploy2002: joelyrookewmde, lucaswerkmeister-wmde: Continuing with sync
- 13:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab-runner2002.codfw.wmnet with OS bullseye
- 13:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P70494 and previous config saved to /var/cache/conftool/dbconfig/20241022-133916-ladsgroup.json
- 13:39 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ml-serve2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 13:37 lucaswerkmeister-wmde@deploy2002: joelyrookewmde, lucaswerkmeister-wmde: Backport for Activate feature flag to default move wikibase sidebar link to other projects. (T66315) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:35 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Activate feature flag to default move wikibase sidebar link to other projects. (T66315)
- 13:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2149.codfw.wmnet onto db2227.codfw.wmnet
- 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
- 13:32 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Don't escape performer link HTML in GlobalBlockDetailsRenderer (T377398) (duration: 15m 27s)
- 13:30 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2011.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 13:30 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ml-serve2011.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 13:29 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2011.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 13:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 100%: T377718', diff saved to https://phabricator.wikimedia.org/P70493 and previous config saved to /var/cache/conftool/dbconfig/20241022-132745-arnaudb.json
- 13:25 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, dreamyjazz: Continuing with sync
- 13:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
- 13:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T376905)', diff saved to https://phabricator.wikimedia.org/P70492 and previous config saved to /var/cache/conftool/dbconfig/20241022-132409-ladsgroup.json
- 13:23 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ml-serve2011.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2003.codfw.wmnet
- 13:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2003.codfw.wmnet
- 13:19 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2085-2086,2088-2089].codfw.wmnet
- 13:19 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, dreamyjazz: Backport for Don't escape performer link HTML in GlobalBlockDetailsRenderer (T377398) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:19 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2085-2086,2088-2089].codfw.wmnet
- 13:16 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Don't escape performer link HTML in GlobalBlockDetailsRenderer (T377398)
- 13:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T376905)', diff saved to https://phabricator.wikimedia.org/P70491 and previous config saved to /var/cache/conftool/dbconfig/20241022-131448-ladsgroup.json
- 13:14 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 13:14 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 13:14 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 13:14 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Release CampaignEvents to eswiki (T376786) (duration: 09m 35s)
- 13:14 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 13:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T376905)', diff saved to https://phabricator.wikimedia.org/P70490 and previous config saved to /var/cache/conftool/dbconfig/20241022-131415-ladsgroup.json
- 13:14 aqu@deploy2002: Finished deploy [analytics/refinery@ffc985a]: Adding refinery/source 0.2.49.2 & 0.2.53 [analytics/refinery@ffc985a7] (duration: 19m 41s)
- 13:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 75%: T377718', diff saved to https://phabricator.wikimedia.org/P70489 and previous config saved to /var/cache/conftool/dbconfig/20241022-131239-arnaudb.json
- 13:09 lucaswerkmeister-wmde@deploy2002: mhorsey, lucaswerkmeister-wmde: Continuing with sync
- 13:07 lucaswerkmeister-wmde@deploy2002: mhorsey, lucaswerkmeister-wmde: Backport for Release CampaignEvents to eswiki (T376786) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:04 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Release CampaignEvents to eswiki (T376786)
- 13:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab-runner2002.codfw.wmnet with reason: host reimage
- 12:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P70488 and previous config saved to /var/cache/conftool/dbconfig/20241022-125908-ladsgroup.json
- 12:58 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab-runner2002.codfw.wmnet with reason: host reimage
- 12:57 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 50%: T377718', diff saved to https://phabricator.wikimedia.org/P70487 and previous config saved to /var/cache/conftool/dbconfig/20241022-125734-arnaudb.json
- 12:55 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2089.codfw.wmnet with OS bookworm
- 12:54 aqu@deploy2002: Started deploy [analytics/refinery@ffc985a]: Adding refinery/source 0.2.49.2 & 0.2.53 [analytics/refinery@ffc985a7]
- 12:53 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2086.codfw.wmnet with OS bookworm
- 12:50 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2085.codfw.wmnet with OS bookworm
- 12:45 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2088.codfw.wmnet with OS bookworm
- 12:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P70486 and previous config saved to /var/cache/conftool/dbconfig/20241022-124401-ladsgroup.json
- 12:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 25%: T377718', diff saved to https://phabricator.wikimedia.org/P70485 and previous config saved to /var/cache/conftool/dbconfig/20241022-124228-arnaudb.json
- 12:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host gitlab-runner2002
- 12:42 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host gitlab-runner2002
- 12:41 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host gitlab-runner2002
- 12:41 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) gitlab-runner2002.codfw.wmnet 161.16.192.10.in-addr.arpa 1.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 12:41 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache gitlab-runner2002.codfw.wmnet 161.16.192.10.in-addr.arpa 1.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 12:41 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:41 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner2002 - jelto@cumin1002"
- 12:41 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner2002 - jelto@cumin1002"
- 12:37 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2089.codfw.wmnet with reason: host reimage
- 12:37 jelto@cumin1002: START - Cookbook sre.dns.netbox
- 12:36 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host gitlab-runner2002
- 12:36 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host gitlab-runner2002.codfw.wmnet with OS bullseye
- 12:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 12:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 12:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 12:34 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2086.codfw.wmnet with reason: host reimage
- 12:34 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2089.codfw.wmnet with reason: host reimage
- 12:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 12:31 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2085.codfw.wmnet with reason: host reimage
- 12:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T376905)', diff saved to https://phabricator.wikimedia.org/P70484 and previous config saved to /var/cache/conftool/dbconfig/20241022-122854-ladsgroup.json
- 12:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2010.codfw.wmnet
- 12:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 12:27 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2088.codfw.wmnet with reason: host reimage
- 12:27 Dreamy_Jazz: Running MediaModeration scan on all group2 wikis
- 12:27 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2086.codfw.wmnet with reason: host reimage
- 12:27 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2085.codfw.wmnet with reason: host reimage
- 12:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 10%: T377718', diff saved to https://phabricator.wikimedia.org/P70483 and previous config saved to /var/cache/conftool/dbconfig/20241022-122723-arnaudb.json
- 12:27 Dreamy_Jazz: Stopped MediaModeration scan on all group1 wikis
- 12:24 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2088.codfw.wmnet with reason: host reimage
- 12:23 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 12:20 Dreamy_Jazz: Running MediaModeration scan on all group1 wikis
- 12:20 klausman@cumin2002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Java 11 security updates - klausman@cumin2002
- 12:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1195 (T376905)', diff saved to https://phabricator.wikimedia.org/P70482 and previous config saved to /var/cache/conftool/dbconfig/20241022-121928-ladsgroup.json
- 12:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
- 12:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
- 12:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T376905)', diff saved to https://phabricator.wikimedia.org/P70481 and previous config saved to /var/cache/conftool/dbconfig/20241022-121903-ladsgroup.json
- 12:17 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 12:12 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2149.codfw.wmnet onto db2227.codfw.wmnet
- 12:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 5%: T377718', diff saved to https://phabricator.wikimedia.org/P70480 and previous config saved to /var/cache/conftool/dbconfig/20241022-121218-arnaudb.json
- 12:12 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2010.codfw.wmnet
- 12:09 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2089.codfw.wmnet with OS bookworm
- 12:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2149,2227].codfw.wmnet with reason: maintenance
- 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2009.codfw.wmnet
- 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 12:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db[2149,2227].codfw.wmnet with reason: maintenance
- 12:08 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2088.codfw.wmnet with OS bookworm
- 12:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2149 and db2227 - T377718', diff saved to https://phabricator.wikimedia.org/P70479 and previous config saved to /var/cache/conftool/dbconfig/20241022-120753-arnaudb.json
- 12:06 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2086.codfw.wmnet with OS bookworm
- 12:06 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2085.codfw.wmnet with OS bookworm
- 12:05 Dreamy_Jazz: Running MediaModeration scan on all group0 wikis
- 12:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P70478 and previous config saved to /var/cache/conftool/dbconfig/20241022-120356-ladsgroup.json
- 12:03 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for tests: Don't depend on Message implementation details (T377778), Update for Message/MessageValue changes (T377778) (duration: 15m 27s)
- 12:02 klausman@cumin2002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Java 11 security updates - klausman@cumin2002
- 11:57 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2085-2086,2088-2089].codfw.wmnet
- 11:57 klausman@cumin2002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Java 11 security updates - klausman@cumin2002
- 11:56 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Continuing with sync
- 11:55 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2085-2086,2088-2089].codfw.wmnet
- 11:55 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for tests: Don't depend on Message implementation details (T377778), Update for Message/MessageValue changes (T377778) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 11:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 11:48 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 11:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P70477 and previous config saved to /var/cache/conftool/dbconfig/20241022-114849-ladsgroup.json
- 11:48 kart_: Updated cxserver to 2024-10-22-112806-production (T357950)
- 11:47 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for tests: Don't depend on Message implementation details (T377778), Update for Message/MessageValue changes (T377778)
- 11:47 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 11:46 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 11:46 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 11:45 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 11:44 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 11:44 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2009.codfw.wmnet
- 11:43 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 11:43 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) check for host wikikube-worker2085.codfw.wmnet
- 11:43 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node check for host wikikube-worker2085.codfw.wmnet
- 11:41 akosiaris: remove faidon from WMCS projects maps, visualeditor, swift, testlabs per his request. Keep the bastion project. cc paravoid
- 11:39 klausman@cumin2002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Java 11 security updates - klausman@cumin2002
- 11:34 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) check for host kubestagemaster2005.codfw.wmnet
- 11:34 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node check for host kubestagemaster2005.codfw.wmnet
- 11:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T376905)', diff saved to https://phabricator.wikimedia.org/P70476 and previous config saved to /var/cache/conftool/dbconfig/20241022-113342-ladsgroup.json
- 11:27 moritzm: installing Java 11 security updates
- 11:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T376905)', diff saved to https://phabricator.wikimedia.org/P70475 and previous config saved to /var/cache/conftool/dbconfig/20241022-112408-ladsgroup.json
- 11:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 11:23 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 11:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T376905)', diff saved to https://phabricator.wikimedia.org/P70474 and previous config saved to /var/cache/conftool/dbconfig/20241022-112343-ladsgroup.json
- 11:21 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: sync
- 11:21 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: sync
- 11:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P70473 and previous config saved to /var/cache/conftool/dbconfig/20241022-110836-ladsgroup.json
- 11:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 100%: post clone', diff saved to https://phabricator.wikimedia.org/P70472 and previous config saved to /var/cache/conftool/dbconfig/20241022-110744-arnaudb.json
- 10:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P70471 and previous config saved to /var/cache/conftool/dbconfig/20241022-105329-ladsgroup.json
- 10:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 75%: post clone', diff saved to https://phabricator.wikimedia.org/P70470 and previous config saved to /var/cache/conftool/dbconfig/20241022-105238-arnaudb.json
- 10:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T376905)', diff saved to https://phabricator.wikimedia.org/P70469 and previous config saved to /var/cache/conftool/dbconfig/20241022-103822-ladsgroup.json
- 10:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 50%: post clone', diff saved to https://phabricator.wikimedia.org/P70468 and previous config saved to /var/cache/conftool/dbconfig/20241022-103733-arnaudb.json
- 10:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1184 (T376905)', diff saved to https://phabricator.wikimedia.org/P70467 and previous config saved to /var/cache/conftool/dbconfig/20241022-102907-ladsgroup.json
- 10:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 10:28 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 10:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T376905)', diff saved to https://phabricator.wikimedia.org/P70466 and previous config saved to /var/cache/conftool/dbconfig/20241022-102843-ladsgroup.json
- 10:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 25%: post clone', diff saved to https://phabricator.wikimedia.org/P70465 and previous config saved to /var/cache/conftool/dbconfig/20241022-102227-arnaudb.json
- 10:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P70464 and previous config saved to /var/cache/conftool/dbconfig/20241022-101336-ladsgroup.json
- 10:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
- 10:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production
- 10:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
- 10:04 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: sync
- 10:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
- 10:03 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: sync
- 10:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2149.codfw.wmnet onto db2205.codfw.wmnet
- 10:03 andrewtavis-wmde@deploy2002: Finished deploy [airflow-dags/wmde@dcf019d]: (no justification provided) (duration: 00m 11s)
- 10:02 andrewtavis-wmde@deploy2002: Started deploy [airflow-dags/wmde@dcf019d]: (no justification provided)
- 09:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P70463 and previous config saved to /var/cache/conftool/dbconfig/20241022-095829-ladsgroup.json
- 09:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
- 09:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T376905)', diff saved to https://phabricator.wikimedia.org/P70461 and previous config saved to /var/cache/conftool/dbconfig/20241022-094322-ladsgroup.json
- 09:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production
- 09:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T376905)', diff saved to https://phabricator.wikimedia.org/P70460 and previous config saved to /var/cache/conftool/dbconfig/20241022-093345-ladsgroup.json
- 09:33 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 09:33 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
- 09:28 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 09:27 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 09:22 hashar: Restarting CI Jenkins
- 09:06 hashar: Restarting Gerrit
- 08:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: maintenance
- 08:57 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: maintenance
- 08:37 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2149.codfw.wmnet onto db2205.codfw.wmnet
- 08:35 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:34 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:33 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:33 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:33 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:32 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 100%: post clone', diff saved to https://phabricator.wikimedia.org/P70459 and previous config saved to /var/cache/conftool/dbconfig/20241022-082545-arnaudb.json
- 08:24 moritzm: irc.wikimedia.org has been switched to ircstream T376014
- 08:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 75%: post clone', diff saved to https://phabricator.wikimedia.org/P70457 and previous config saved to /var/cache/conftool/dbconfig/20241022-081040-arnaudb.json
- 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1002.wikimedia.org
- 08:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1002.wikimedia.org
- 08:03 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:03 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:00 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 07:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2149,2205].codfw.wmnet with reason: db2205 reclone
- 07:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db[2149,2205].codfw.wmnet with reason: db2205 reclone
- 07:58 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 07:58 arnaudb@cumin1002: dbctl commit (dc=all): 'T377718', diff saved to https://phabricator.wikimedia.org/P70456 and previous config saved to /var/cache/conftool/dbconfig/20241022-075830-arnaudb.json
- 07:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 50%: post clone', diff saved to https://phabricator.wikimedia.org/P70455 and previous config saved to /var/cache/conftool/dbconfig/20241022-075534-arnaudb.json
- 07:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 28%: post clone', diff saved to https://phabricator.wikimedia.org/P70454 and previous config saved to /var/cache/conftool/dbconfig/20241022-074029-arnaudb.json
- 07:28 moritzm: installing Java 17 security updates
- 07:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 27%: post clone', diff saved to https://phabricator.wikimedia.org/P70453 and previous config saved to /var/cache/conftool/dbconfig/20241022-072523-arnaudb.json
- 07:23 moritzm: rearm keyholder on netmon1003
- 07:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 26%: post clone', diff saved to https://phabricator.wikimedia.org/P70452 and previous config saved to /var/cache/conftool/dbconfig/20241022-071018-arnaudb.json
- 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
- 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
- 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
- 06:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 25%: post clone', diff saved to https://phabricator.wikimedia.org/P70451 and previous config saved to /var/cache/conftool/dbconfig/20241022-065513-arnaudb.json
- 06:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
- 05:41 kart_: Remove servicerunner dependency for cxserver (T357950, T373777)
- 05:31 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 05:30 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 05:25 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 05:24 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.43.0-wmf.25 (duration: 00m 58s)
- 03:52 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.43.0-wmf.28 refs T375659 (duration: 49m 37s)
- 03:02 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.43.0-wmf.28 refs T375659
- 01:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T376905)', diff saved to https://phabricator.wikimedia.org/P70450 and previous config saved to /var/cache/conftool/dbconfig/20241022-010820-ladsgroup.json
- 00:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P70449 and previous config saved to /var/cache/conftool/dbconfig/20241022-005313-ladsgroup.json
- 00:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P70448 and previous config saved to /var/cache/conftool/dbconfig/20241022-003807-ladsgroup.json
- 00:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T376905)', diff saved to https://phabricator.wikimedia.org/P70447 and previous config saved to /var/cache/conftool/dbconfig/20241022-002259-ladsgroup.json
- 00:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2229 (T376905)', diff saved to https://phabricator.wikimedia.org/P70446 and previous config saved to /var/cache/conftool/dbconfig/20241022-001606-ladsgroup.json
- 00:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
- 00:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
- 00:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T376905)', diff saved to https://phabricator.wikimedia.org/P70445 and previous config saved to /var/cache/conftool/dbconfig/20241022-001539-ladsgroup.json
- 00:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P70444 and previous config saved to /var/cache/conftool/dbconfig/20241022-000032-ladsgroup.json
2024-10-21
- 23:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P70443 and previous config saved to /var/cache/conftool/dbconfig/20241021-234525-ladsgroup.json
- 23:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T376905)', diff saved to https://phabricator.wikimedia.org/P70442 and previous config saved to /var/cache/conftool/dbconfig/20241021-233018-ladsgroup.json
- 23:20 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2083.codfw.wmnet with OS bullseye
- 22:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2224 (T376905)', diff saved to https://phabricator.wikimedia.org/P70441 and previous config saved to /var/cache/conftool/dbconfig/20241021-222952-ladsgroup.json
- 22:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
- 22:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
- 22:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T376905)', diff saved to https://phabricator.wikimedia.org/P70440 and previous config saved to /var/cache/conftool/dbconfig/20241021-222926-ladsgroup.json
- 22:21 eileen: config revision changed from a1c7759c to 3bbf553d
- 22:18 zabe@deploy2002: Finished scap sync-world: Backport for group0: Increase revision-slots cache expiry back to default (T183490) (duration: 06m 58s)
- 22:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P70439 and previous config saved to /var/cache/conftool/dbconfig/20241021-221419-ladsgroup.json
- 22:13 zabe@deploy2002: zabe: Continuing with sync
- 22:13 zabe@deploy2002: zabe: Backport for group0: Increase revision-slots cache expiry back to default (T183490) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 22:11 zabe@deploy2002: Started scap sync-world: Backport for group0: Increase revision-slots cache expiry back to default (T183490)
- 21:59 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2083.codfw.wmnet with OS bullseye
- 21:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P70438 and previous config saved to /var/cache/conftool/dbconfig/20241021-215912-ladsgroup.json
- 21:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T376905)', diff saved to https://phabricator.wikimedia.org/P70437 and previous config saved to /var/cache/conftool/dbconfig/20241021-214405-ladsgroup.json
- 21:43 eileen: config revision changed from d240bcfb to a1c7759c
- 21:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2217 (T376905)', diff saved to https://phabricator.wikimedia.org/P70436 and previous config saved to /var/cache/conftool/dbconfig/20241021-213801-ladsgroup.json
- 21:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
- 21:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
- 21:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T376905)', diff saved to https://phabricator.wikimedia.org/P70435 and previous config saved to /var/cache/conftool/dbconfig/20241021-213733-ladsgroup.json
- 21:25 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
- 21:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P70434 and previous config saved to /var/cache/conftool/dbconfig/20241021-212226-ladsgroup.json
- 21:22 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production
- 21:16 swfrench-wmf: ran authdns-update to pick up mw-(web|api-ext)-next discovery records - T377040
- 21:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P70433 and previous config saved to /var/cache/conftool/dbconfig/20241021-210718-ladsgroup.json
- 21:00 sukhe: running authdns-update for CR 1081371
- away: UTC late deploys done
- 20:56 tgr@deploy2002: Finished scap sync-world: Backport for fix(AuthManagerStatsd): counters require static set of labels (T377476) (duration: 18m 43s)
- 20:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T376905)', diff saved to https://phabricator.wikimedia.org/P70431 and previous config saved to /var/cache/conftool/dbconfig/20241021-205211-ladsgroup.json
- 20:52 tgr@deploy2002: tgr: Continuing with sync
- 20:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T376905)', diff saved to https://phabricator.wikimedia.org/P70430 and previous config saved to /var/cache/conftool/dbconfig/20241021-204603-ladsgroup.json
- 20:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
- 20:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
- 20:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T376905)', diff saved to https://phabricator.wikimedia.org/P70429 and previous config saved to /var/cache/conftool/dbconfig/20241021-204536-ladsgroup.json
- 20:40 tgr@deploy2002: tgr: Backport for fix(AuthManagerStatsd): counters require static set of labels (T377476) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:37 tgr@deploy2002: Started scap sync-world: Backport for fix(AuthManagerStatsd): counters require static set of labels (T377476)
- 20:32 tgr@deploy2002: Finished scap sync-world: Backport for frwiki: switch clearing link recommendations to PageSaveComplete hook (T372337) (duration: 08m 19s)
- 20:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P70428 and previous config saved to /var/cache/conftool/dbconfig/20241021-203029-ladsgroup.json
- 20:28 tgr@deploy2002: migr, tgr: Continuing with sync
- 20:26 tgr@deploy2002: migr, tgr: Backport for frwiki: switch clearing link recommendations to PageSaveComplete hook (T372337) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:24 tgr@deploy2002: Started scap sync-world: Backport for frwiki: switch clearing link recommendations to PageSaveComplete hook (T372337)
- 20:22 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
- 20:21 tgr@deploy2002: Finished scap sync-world: Backport for Re-apply "Set special footer licence message for MediaWiki.org re. Help: pages" (T301483) (duration: 09m 48s)
- 20:19 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production
- 20:17 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
- 20:16 tgr@deploy2002: matmarex, tgr: Continuing with sync
- 20:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P70427 and previous config saved to /var/cache/conftool/dbconfig/20241021-201522-ladsgroup.json
- 20:13 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production
- 20:13 tgr@deploy2002: matmarex, tgr: Backport for Re-apply "Set special footer licence message for MediaWiki.org re. Help: pages" (T301483) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:11 tgr@deploy2002: Started scap sync-world: Backport for Re-apply "Set special footer licence message for MediaWiki.org re. Help: pages" (T301483)
- 20:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T376905)', diff saved to https://phabricator.wikimedia.org/P70426 and previous config saved to /var/cache/conftool/dbconfig/20241021-200015-ladsgroup.json
- 19:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T376905)', diff saved to https://phabricator.wikimedia.org/P70425 and previous config saved to /var/cache/conftool/dbconfig/20241021-195300-ladsgroup.json
- 19:53 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
- 19:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
- 19:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T376905)', diff saved to https://phabricator.wikimedia.org/P70424 and previous config saved to /var/cache/conftool/dbconfig/20241021-195233-ladsgroup.json
- 19:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P70423 and previous config saved to /var/cache/conftool/dbconfig/20241021-193726-ladsgroup.json
- 19:36 swfrench@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=mw-api-ext-next-ro,name=eqiad [reason: preparing mw-api-ext-next-ro (a/a) for discovery - T377040]
- 19:36 swfrench@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=mw-api-ext-next-ro,name=codfw [reason: preparing mw-api-ext-next-ro (a/a) for discovery - T377040]
- 19:36 dduvall@deploy2002: Finished deploy [releng/jenkins-deploy@b75c4aa] (releasing): Deploying changes to MediaWiki branch and publish WMF single-version image job (duration: 01m 20s)
- 19:36 swfrench@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=mw-web-next-ro,name=eqiad [reason: preparing mw-web-next-ro (a/a) for discovery - T377040]
- 19:35 swfrench@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=mw-web-next-ro,name=codfw [reason: preparing mw-web-next-ro (a/a) for discovery - T377040]
- 19:34 dduvall@deploy2002: Started deploy [releng/jenkins-deploy@b75c4aa] (releasing): Deploying changes to MediaWiki branch and publish WMF single-version image job
- 19:31 swfrench@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=mw-api-ext-next,name=codfw [reason: preparing mw-api-ext-next (a/p) for discovery - T377040]
- 19:30 swfrench@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=mw-web-next,name=codfw [reason: preparing mw-web-next (a/p) for discovery - T377040]
- 19:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P70422 and previous config saved to /var/cache/conftool/dbconfig/20241021-192219-ladsgroup.json
- 19:11 ejegg: re-enabled fundraising thank you mailer
- 19:10 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-codfw (T377040)
- 19:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T376905)', diff saved to https://phabricator.wikimedia.org/P70421 and previous config saved to /var/cache/conftool/dbconfig/20241021-190712-ladsgroup.json
- 19:04 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-codfw (T377040)
- 19:02 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-codfw (T377040)
- 19:02 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-codfw (T377040)
- 19:01 swfrench-wmf: ran and enabled puppet agent on 'A:lvs and A:codfw' - T377040
- 19:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2169 (T376905)', diff saved to https://phabricator.wikimedia.org/P70420 and previous config saved to /var/cache/conftool/dbconfig/20241021-185957-ladsgroup.json
- 19:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 19:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 18:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T376905)', diff saved to https://phabricator.wikimedia.org/P70419 and previous config saved to /var/cache/conftool/dbconfig/20241021-185931-ladsgroup.json
- 18:58 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-eqiad (T377040)
- 18:52 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-eqiad (T377040)
- 18:51 zabe@deploy2002: Finished scap sync-world: Backport for s4: Reduce revision-slots cache expiry to 60 seconds (T183490) (duration: 16m 09s)
- 18:51 ejegg: fundraising civicrm upgraded from cfb0def0 to 36660cb3
- 18:45 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup1012.eqiad.wmnet with OS bookworm
- 18:45 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1002"
- 18:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P70418 and previous config saved to /var/cache/conftool/dbconfig/20241021-184424-ladsgroup.json
- 18:43 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-eqiad (T377040)
- 18:42 zabe@deploy2002: zabe: Continuing with sync
- 18:42 zabe@deploy2002: zabe: Backport for s4: Reduce revision-slots cache expiry to 60 seconds (T183490) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 18:37 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1002"
- 18:37 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad (T377040)
- 18:36 swfrench-wmf: ran and enabled puppet agent on 'A:lvs and A:eqiad' - T377040
- 18:35 zabe@deploy2002: Started scap sync-world: Backport for s4: Reduce revision-slots cache expiry to 60 seconds (T183490)
- 18:32 swfrench-wmf: ran disable-puppet on 'A:lvs and (A:eqiad or A:codfw)' - T377040
- 18:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P70417 and previous config saved to /var/cache/conftool/dbconfig/20241021-182916-ladsgroup.json
- 18:23 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-codfw (T377040)
- 18:22 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-codfw (T377040)
- 18:20 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-codfw (T377040)
- 18:19 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-codfw (T377040)
- 18:19 swfrench-wmf: ran and enabled puppet agent on 'A:lvs and A:codfw' - T377040
- 18:15 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-eqiad (T377040)
- 18:14 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup1012.eqiad.wmnet with reason: host reimage
- 18:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T376905)', diff saved to https://phabricator.wikimedia.org/P70416 and previous config saved to /var/cache/conftool/dbconfig/20241021-181410-ladsgroup.json
- 18:11 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup1012.eqiad.wmnet with reason: host reimage
- 18:09 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-eqiad (T377040)
- 18:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T376905)', diff saved to https://phabricator.wikimedia.org/P70415 and previous config saved to /var/cache/conftool/dbconfig/20241021-180654-ladsgroup.json
- 18:09 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 18:08 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 18:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 18:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 18:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T376905)', diff saved to https://phabricator.wikimedia.org/P70414 and previous config saved to /var/cache/conftool/dbconfig/20241021-180612-ladsgroup.json
- 18:06 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-eqiad (T377040)
- 18:05 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad (T377040)
- 18:04 swfrench-wmf: ran and enabled puppet agent on 'A:lvs and A:eqiad' - T377040
- 17:59 swfrench-wmf: ran disable-puppet on 'A:lvs and (A:eqiad or A:codfw)' - T377040
- 17:56 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host backup1012.eqiad.wmnet with OS bookworm
- 17:53 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1012.eqiad.wmnet with OS bookworm
- 17:53 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host backup1012.eqiad.wmnet with OS bookworm
- 17:52 dduvall@deploy2002: Installing scap version "4.115.0" for 209 hosts
- 17:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P70413 and previous config saved to /var/cache/conftool/dbconfig/20241021-175105-ladsgroup.json
- 17:50 xcollazo@deploy2002: Finished deploy [airflow-dags/analytics@671896c]: Deploy T375402. (duration: 01m 04s)
- 17:48 xcollazo@deploy2002: Started deploy [airflow-dags/analytics@671896c]: Deploy T375402.
- 17:44 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:43 elukey@cumin1002: START - Cookbook sre.hosts.provision for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:42 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:41 elukey@cumin1002: START - Cookbook sre.hosts.provision for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P70412 and previous config saved to /var/cache/conftool/dbconfig/20241021-173558-ladsgroup.json
- 17:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T376905)', diff saved to https://phabricator.wikimedia.org/P70411 and previous config saved to /var/cache/conftool/dbconfig/20241021-172051-ladsgroup.json
- 17:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T376905)', diff saved to https://phabricator.wikimedia.org/P70410 and previous config saved to /var/cache/conftool/dbconfig/20241021-171138-ladsgroup.json
- 17:11 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 17:11 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 17:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T376905)', diff saved to https://phabricator.wikimedia.org/P70409 and previous config saved to /var/cache/conftool/dbconfig/20241021-171046-ladsgroup.json
- 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2172 (re)pooling @ 100%: post clone', diff saved to https://phabricator.wikimedia.org/P70408 and previous config saved to /var/cache/conftool/dbconfig/20241021-165624-arnaudb.json
- 16:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P70407 and previous config saved to /var/cache/conftool/dbconfig/20241021-165539-ladsgroup.json
- 16:44 herron@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 16:43 herron@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2172 (re)pooling @ 75%: post clone', diff saved to https://phabricator.wikimedia.org/P70406 and previous config saved to /var/cache/conftool/dbconfig/20241021-164119-arnaudb.json
- 16:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P70405 and previous config saved to /var/cache/conftool/dbconfig/20241021-164032-ladsgroup.json
- 16:33 volans@cumin1002: dbctl commit (dc=all): 'Fix db1185 weight', diff saved to https://phabricator.wikimedia.org/P70404 and previous config saved to /var/cache/conftool/dbconfig/20241021-163355-volans.json
- 16:32 volans@cumin1002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1185 quickly with 2 steps - Testing new cookbook
- 16:29 volans@cumin1002: START - Cookbook sre.mysql.pool db1185 quickly with 2 steps - Testing new cookbook
- 16:29 volans@cumin1002: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) db1185 quickly with 2 steps - Testing new cookbook
- 16:28 volans@cumin1002: START - Cookbook sre.mysql.pool db1185 quickly with 2 steps - Testing new cookbook
- 16:28 arnaudb@cumin1002: dbctl commit (dc=all): 'db2172 (re)pooling @ 50%: post clone', diff saved to https://phabricator.wikimedia.org/P70401 and previous config saved to /var/cache/conftool/dbconfig/20241021-162613-arnaudb.json
- 16:27 volans@cumin1002: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1185 - Testing new cookbook
- 16:26 volans@cumin1002: START - Cookbook sre.mysql.depool db1185 - Testing new cookbook
- 16:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T376905)', diff saved to https://phabricator.wikimedia.org/P70399 and previous config saved to /var/cache/conftool/dbconfig/20241021-162525-ladsgroup.json
- 16:22 volans@cumin1002: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) db1185 - Testing new cookbook
- 16:22 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:22 elukey@cumin1002: START - Cookbook sre.hosts.provision for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:21 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:21 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:19 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:19 elukey@cumin1002: START - Cookbook sre.hosts.provision for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:18 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:18 volans@cumin1002: START - Cookbook sre.mysql.depool db1185 - Testing new cookbook
- 16:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:17 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2213 (T376905)', diff saved to https://phabricator.wikimedia.org/P70398 and previous config saved to /var/cache/conftool/dbconfig/20241021-161701-ladsgroup.json
- 16:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2213.codfw.wmnet with reason: Maintenance
- 16:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2213.codfw.wmnet with reason: Maintenance
- 16:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T376905)', diff saved to https://phabricator.wikimedia.org/P70397 and previous config saved to /var/cache/conftool/dbconfig/20241021-161634-ladsgroup.json
- 16:15 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2172 (re)pooling @ 25%: post clone', diff saved to https://phabricator.wikimedia.org/P70396 and previous config saved to /var/cache/conftool/dbconfig/20241021-161108-arnaudb.json
- 16:04 ejegg: disabled fundraising Thank You mail send jobs
- 16:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P70395 and previous config saved to /var/cache/conftool/dbconfig/20241021-160127-ladsgroup.json
- 15:58 volans@cumin1002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1185 gradually with 4 steps - Testing new cookbook
- 15:56 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:56 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:55 volans@cumin1002: START - Cookbook sre.mysql.pool db1185 gradually with 4 steps - Testing new cookbook
- 15:53 volans@cumin1002: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1185 - Testing new cookbook
- 15:53 volans@cumin1002: START - Cookbook sre.mysql.depool db1185 - Testing new cookbook
- 15:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P70389 and previous config saved to /var/cache/conftool/dbconfig/20241021-154620-ladsgroup.json
- 15:39 Dreamy_Jazz: Starting MediaModeration scanning script for 12 hrs on enwiki - https://wikitech.wikimedia.org/wiki/MediaModeration
- 15:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
- 15:37 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2172.codfw.wmnet onto db2240.codfw.wmnet
- 15:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production
- 15:32 Dreamy_Jazz: Restarted MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
- 15:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T376905)', diff saved to https://phabricator.wikimedia.org/P70388 and previous config saved to /var/cache/conftool/dbconfig/20241021-153113-ladsgroup.json
- 15:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2211 (T376905)', diff saved to https://phabricator.wikimedia.org/P70387 and previous config saved to /var/cache/conftool/dbconfig/20241021-152408-ladsgroup.json
- 15:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
- 15:23 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
- 15:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T376905)', diff saved to https://phabricator.wikimedia.org/P70386 and previous config saved to /var/cache/conftool/dbconfig/20241021-152339-ladsgroup.json
- 15:20 moritzm: rearm keyholder on netmon2002
- 15:20 stran@deploy2002: Finished scap sync-world: Backport for Disable local IP view right group on meta (T377584) (duration: 20m 29s)
- 15:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
- 15:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P70385 and previous config saved to /var/cache/conftool/dbconfig/20241021-150832-ladsgroup.json
- 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
- 15:02 stran@deploy2002: stran: Continuing with sync
- 15:01 stran@deploy2002: stran: Backport for Disable local IP view right group on meta (T377584) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:59 stran@deploy2002: Started scap sync-world: Backport for Disable local IP view right group on meta (T377584)
- 14:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P70384 and previous config saved to /var/cache/conftool/dbconfig/20241021-145325-ladsgroup.json
- 14:53 ejegg: disabled failing CiviCRM contact dedupe job
- 14:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T376905)', diff saved to https://phabricator.wikimedia.org/P70383 and previous config saved to /var/cache/conftool/dbconfig/20241021-143818-ladsgroup.json
- 14:33 herron@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 14:32 herron@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 14:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2192 (T376905)', diff saved to https://phabricator.wikimedia.org/P70382 and previous config saved to /var/cache/conftool/dbconfig/20241021-143108-ladsgroup.json
- 14:31 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
- 14:31 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
- 14:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T376905)', diff saved to https://phabricator.wikimedia.org/P70381 and previous config saved to /var/cache/conftool/dbconfig/20241021-143042-ladsgroup.json
- 14:29 moritzm: installing PHP 8.2 security updates
- 14:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P70380 and previous config saved to /var/cache/conftool/dbconfig/20241021-141535-ladsgroup.json
- 14:15 Lucas_WMDE: UTC afternoon backport+config window done
- 14:10 stran@deploy2002: Finished scap sync-world: Backport for Disable IP reveal rights for local metawiki groups (T377584), Set redirect wiki for Special:GlobalContributions (T376612), temp accounts: Make temp accounts known on metawiki (T376132) (duration: 14m 55s)
- 14:05 stran@deploy2002: stran, kharlan: Continuing with sync
- 14:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P70379 and previous config saved to /var/cache/conftool/dbconfig/20241021-140028-ladsgroup.json
- 13:57 stran@deploy2002: stran, kharlan: Backport for Disable IP reveal rights for local metawiki groups (T377584), Set redirect wiki for Special:GlobalContributions (T376612), temp accounts: Make temp accounts known on metawiki (T376132) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:57 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2035.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 13:55 stran@deploy2002: Started scap sync-world: Backport for Disable IP reveal rights for local metawiki groups (T377584), Set redirect wiki for Special:GlobalContributions (T376612), temp accounts: Make temp accounts known on metawiki (T376132)
- 13:54 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti2035.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 13:53 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2035.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 13:50 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti2035.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 13:50 stran@deploy2002: Finished scap sync-world: Backport for Apply wmf-specific protected vars rights access (T369610) (duration: 08m 53s)
- 13:45 stran@deploy2002: stran: Continuing with sync
- 13:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T376905)', diff saved to https://phabricator.wikimedia.org/P70378 and previous config saved to /var/cache/conftool/dbconfig/20241021-134521-ladsgroup.json
- 13:43 stran@deploy2002: stran: Backport for Apply wmf-specific protected vars rights access (T369610) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:41 stran@deploy2002: Started scap sync-world: Backport for Apply wmf-specific protected vars rights access (T369610)
- 13:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T376905)', diff saved to https://phabricator.wikimedia.org/P70377 and previous config saved to /var/cache/conftool/dbconfig/20241021-133619-ladsgroup.json
- 13:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 13:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 13:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T376905)', diff saved to https://phabricator.wikimedia.org/P70376 and previous config saved to /var/cache/conftool/dbconfig/20241021-133552-ladsgroup.json
- 13:35 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 13:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
- 13:34 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Revert "Enable CampaignEvents collaboration list in testwiki and test2wiki" (duration: 08m 20s)
- 13:33 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 13:33 inflatador: bking@stat1009,stat1010.mgmt racadm>>racadm set BIOS.MemSettings.NodeInterleave Enabled && racadm jobqueue create BIOS.Setup.1-1 T376813
- 13:32 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 13:30 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2172.codfw.wmnet onto db2240.codfw.wmnet
- 13:29 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 13:29 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, trainbranchbot: Continuing with sync
- 13:28 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 13:28 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, trainbranchbot: Backport for Revert "Enable CampaignEvents collaboration list in testwiki and test2wiki" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:27 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 13:26 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Revert "Enable CampaignEvents collaboration list in testwiki and test2wiki"
- 13:25 inflatador: bking@stat1008.mgmt racadm>>racadm jobqueue create BIOS.Setup.1-1
- 13:24 inflatador: bking@stat1008.mgmt racadm>>racadm set BIOS.MemSettings.NodeInterleave Enabled T376813
- 13:24 lucaswerkmeister-wmde@deploy2002: Sync cancelled.
- 13:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2172 in db2240 for T373579', diff saved to https://phabricator.wikimedia.org/P70375 and previous config saved to /var/cache/conftool/dbconfig/20241021-132351-arnaudb.json
- 13:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: provisionning db2240.codfw.wmnet - T373579
- 13:22 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: provisionning db2240.codfw.wmnet - T373579
- 13:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: provisionning db2240.codfw.wmnet - T373579
- 13:22 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: provisionning db2240.codfw.wmnet - T373579
- 13:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P70374 and previous config saved to /var/cache/conftool/dbconfig/20241021-132045-ladsgroup.json
- 13:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2172 to clone on db2240 T373579', diff saved to https://phabricator.wikimedia.org/P70373 and previous config saved to /var/cache/conftool/dbconfig/20241021-131750-arnaudb.json
- 13:12 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: test Ide32aa with dummy upgrade
- 13:11 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: test Ide32aa with dummy upgrade
- 13:08 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, daimona: Backport for Enable CampaignEvents collaboration list in testwiki and test2wiki (T376055) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P70372 and previous config saved to /var/cache/conftool/dbconfig/20241021-130538-ladsgroup.json
- 13:05 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Enable CampaignEvents collaboration list in testwiki and test2wiki (T376055)
- 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
- 12:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
- 12:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T376905)', diff saved to https://phabricator.wikimedia.org/P70371 and previous config saved to /var/cache/conftool/dbconfig/20241021-125029-ladsgroup.json
- 12:45 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-lab1002.eqiad.wmnet with OS bookworm
- 12:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2171 (T376905)', diff saved to https://phabricator.wikimedia.org/P70370 and previous config saved to /var/cache/conftool/dbconfig/20241021-124217-ladsgroup.json
- 12:42 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 12:42 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 12:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T376905)', diff saved to https://phabricator.wikimedia.org/P70369 and previous config saved to /var/cache/conftool/dbconfig/20241021-124151-ladsgroup.json
- 12:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P70368 and previous config saved to /var/cache/conftool/dbconfig/20241021-122644-ladsgroup.json
- 12:24 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-lab1002.eqiad.wmnet with reason: host reimage
- 12:21 klausman@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-lab1002.eqiad.wmnet with reason: host reimage
- 12:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P70367 and previous config saved to /var/cache/conftool/dbconfig/20241021-121136-ladsgroup.json
- 12:09 klausman@cumin1002: START - Cookbook sre.hosts.reimage for host ml-lab1002.eqiad.wmnet with OS bookworm
- 12:01 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 12:01 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 12:00 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
- 11:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: sync on production
- 11:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T376905)', diff saved to https://phabricator.wikimedia.org/P70366 and previous config saved to /var/cache/conftool/dbconfig/20241021-115629-ladsgroup.json
- 11:52 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
- 11:52 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
- 11:52 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 11:51 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 11:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T376905)', diff saved to https://phabricator.wikimedia.org/P70365 and previous config saved to /var/cache/conftool/dbconfig/20241021-114723-ladsgroup.json
- 11:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 11:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 11:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T376905)', diff saved to https://phabricator.wikimedia.org/P70364 and previous config saved to /var/cache/conftool/dbconfig/20241021-114657-ladsgroup.json
- 11:40 moritzm: installing python-idna security updates
- 11:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P70363 and previous config saved to /var/cache/conftool/dbconfig/20241021-113150-ladsgroup.json
- 11:17 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
- 11:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P70362 and previous config saved to /var/cache/conftool/dbconfig/20241021-111643-ladsgroup.json
- 11:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: sync on production
- 11:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T376905)', diff saved to https://phabricator.wikimedia.org/P70361 and previous config saved to /var/cache/conftool/dbconfig/20241021-110136-ladsgroup.json
- 10:59 moritzm: installing curl security updates
- 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cloudcephosd1029.eqiad.wmnet
- 10:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T376905)', diff saved to https://phabricator.wikimedia.org/P70360 and previous config saved to /var/cache/conftool/dbconfig/20241021-105223-ladsgroup.json
- 10:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 10:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 10:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
- 10:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
- 10:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cloudcephosd1029.eqiad.wmnet
- 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2038.codfw.wmnet to cluster codfw and group C
- 10:31 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2038.codfw.wmnet to cluster codfw and group C
- 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
- 10:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
- 10:18 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1185.eqiad.wmnet with reason: testing depool/repool
- 10:18 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db1185.eqiad.wmnet with reason: testing depool/repool
- 10:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 10:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1213.eqiad.wmnet with reason: testing depool/repool
- 10:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db1213.eqiad.wmnet with reason: testing depool/repool
- 10:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1245.eqiad.wmnet with reason: testing depool/repool
- 10:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 10:14 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db1245.eqiad.wmnet with reason: testing depool/repool
- 10:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 10:10 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host cloudcephmon1006.eqiad.wmnet
- 10:08 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.reimage-stacked-control-plane (exit_code=0) Reimaging k8s control planes of cluster staging-eqiad: containerd migration
- 10:08 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster1005.eqiad.wmnet with OS bookworm
- 10:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 10:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 10:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cloudcephmon1006.eqiad.wmnet
- 09:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 09:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 09:52 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2002.codfw.wmnet
- 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2037.codfw.wmnet to cluster codfw and group C
- 09:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2037.codfw.wmnet to cluster codfw and group C
- 09:47 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-staging2002.codfw.wmnet
- 09:46 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster1005.eqiad.wmnet with reason: host reimage
- 09:45 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
- 09:42 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster1005.eqiad.wmnet with reason: host reimage
- 09:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
- 09:40 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
- 09:39 dcausse@deploy2002: Finished scap sync-world: Backport for Fix phan issue with getCounter returning NullMetric|CounterMetric, Do not pass null to DataSender::sendWeightedTagsUpdate $tagWeights (T376715) (duration: 23m 26s)
- 09:36 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1011.eqiad.wmnet
- 09:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
- 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
- 09:32 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-serve1011.eqiad.wmnet
- 09:31 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1010.eqiad.wmnet
- 09:29 dcausse@deploy2002: dcausse: Continuing with sync
- 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
- 09:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping2004.codfw.wmnet
- 09:27 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster1005.eqiad.wmnet with OS bookworm
- 09:27 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster1004.eqiad.wmnet with OS bookworm
- 09:27 dcausse@deploy2002: dcausse: Backport for Fix phan issue with getCounter returning NullMetric|CounterMetric, Do not pass null to DataSender::sendWeightedTagsUpdate $tagWeights (T376715) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 09:26 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-serve1010.eqiad.wmnet
- 09:24 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1009.eqiad.wmnet
- 09:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
- 09:19 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-serve1009.eqiad.wmnet
- 09:18 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
- 09:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
- 09:16 dcausse@deploy2002: Started scap sync-world: Backport for Fix phan issue with getCounter returning NullMetric|CounterMetric, Do not pass null to DataSender::sendWeightedTagsUpdate $tagWeights (T376715)
- 09:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
- 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
- 09:11 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
- 09:11 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:11 elukey@cumin1002: START - Cookbook sre.hosts.provision for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:10 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:10 elukey@cumin1002: START - Cookbook sre.hosts.provision for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:09 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:09 elukey@cumin1002: START - Cookbook sre.hosts.provision for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
- 09:06 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster1004.eqiad.wmnet with reason: host reimage
- 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
- 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
- 09:03 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1001.eqiad.wmnet
- 09:02 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster1004.eqiad.wmnet with reason: host reimage
- 09:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
- 08:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
- 08:57 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-lab1001.eqiad.wmnet
- 08:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
- 08:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
- 08:53 andrewtavis-wmde@deploy2002: Finished deploy [airflow-dags/wmde@d176c47]: (no justification provided) (duration: 00m 11s)
- 08:53 andrewtavis-wmde@deploy2002: Started deploy [airflow-dags/wmde@d176c47]: (no justification provided)
- 08:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
- 08:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
- 08:48 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster1004.eqiad.wmnet with OS bookworm
- 08:47 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster1003.eqiad.wmnet with OS bookworm
- 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
- 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
- 08:44 jnuche@deploy2002: Installing scap version "4.114.0" for 210 hosts
- 08:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
- 08:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
- 08:26 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster1003.eqiad.wmnet with reason: host reimage
- 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:23 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster1003.eqiad.wmnet with reason: host reimage
- 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:09 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster1003.eqiad.wmnet with OS bookworm
- 08:09 jayme@cumin1002: START - Cookbook sre.k8s.reimage-stacked-control-plane Reimaging k8s control planes of cluster staging-eqiad: containerd migration
- 07:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 07:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 07:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 07:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 07:36 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1013.eqiad.wmnet with OS bookworm
- 07:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 07:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 07:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 07:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 07:23 moritzm: installing python-reportlab security updates
- 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast7001.wikimedia.org
- 07:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 07:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 07:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast7001.wikimedia.org
- 07:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1013.eqiad.wmnet with reason: host reimage
- 07:10 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1013.eqiad.wmnet with reason: host reimage
- 07:09 kartik@deploy2002: scap failed: <CalledProcessError> Command '['/usr/bin/scap', 'mwshell', '--no-local-config', '--directory', '/srv/mediawiki-staging', '--user', 'www-data', '--', 'rm -f /srv/mediawiki-staging/php-1.43.0-wmf.27/cache/l10n/*.tmp.*']' returned non-zero exit status 126. (scap version: 4.113.0) (duration: 00m 01s)
- 07:09 kartik@deploy2002: Started scap sync-world: Backport for Enable Special:Contribute on bnwiki
- 07:05 kartik@deploy2002: scap failed: <CalledProcessError> Command '['/usr/bin/scap', 'mwshell', '--no-local-config', '--directory', '/srv/mediawiki-staging', '--user', 'www-data', '--', 'rm -f /srv/mediawiki-staging/php-1.43.0-wmf.27/cache/l10n/*.tmp.*']' returned non-zero exit status 126. (scap version: 4.113.0) (duration: 00m 01s)
- 07:05 kartik@deploy2002: Started scap sync-world: Backport for Enable Special:Contribute on bnwiki
- 06:58 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 153087
- 06:58 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 153087
- 06:58 ayounsi@cumin1002: END (ERROR) - Cookbook sre.network.peering (exit_code=97) with action 'email' for AS: 153087
- 06:58 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 153087
- 06:56 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host pc1013.eqiad.wmnet with OS bookworm
- 06:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2203.codfw.wmnet with reason: Maintenance
- 06:08 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2203.codfw.wmnet with reason: Maintenance
- 06:07 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 06:07 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 00:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1211 (T367856)', diff saved to https://phabricator.wikimedia.org/P70359 and previous config saved to /var/cache/conftool/dbconfig/20241021-000434-ladsgroup.json
- 00:04 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on db1211.eqiad.wmnet with reason: Maintenance
- 00:04 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on db1211.eqiad.wmnet with reason: Maintenance
2024-10-20
- 21:19 eileen: civicrm upgraded from 77ea54bc to cfb0def0
- 09:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T367856)', diff saved to https://phabricator.wikimedia.org/P70358 and previous config saved to /var/cache/conftool/dbconfig/20241020-095904-ladsgroup.json
- 09:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P70357 and previous config saved to /var/cache/conftool/dbconfig/20241020-094357-ladsgroup.json
- 09:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P70356 and previous config saved to /var/cache/conftool/dbconfig/20241020-092850-ladsgroup.json
- 09:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T367856)', diff saved to https://phabricator.wikimedia.org/P70355 and previous config saved to /var/cache/conftool/dbconfig/20241020-091344-ladsgroup.json
2024-10-19
- 00:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 00:13 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
2024-10-18
- 22:16 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:13 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:52 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:50 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 21:45 dduvall@deploy2002: Finished deploy [releng/jenkins-deploy@8c1070f] (releasing): deploying changes to publishMWSingleVersion job (duration: 01m 06s)
- 21:44 dduvall@deploy2002: Started deploy [releng/jenkins-deploy@8c1070f] (releasing): deploying changes to publishMWSingleVersion job
- 20:23 dduvall: deployed scap release 4.113.0 to releases{1003,2003} hosts
- 20:22 dduvall@deploy2002: Installing scap version "4.113.0" for 2 hosts
- 20:21 dduvall@deploy2002: install-world aborted: (no justification provided) (duration: 00m 52s)
- 20:20 dduvall@deploy2002: Installing scap version "latest" for 2 hosts
- 19:09 tzatziki: removing 3 files for legal compliance
- 18:56 tzatziki: removing 1 file for legal compliance
- 16:54 dzahn@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 16:54 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.reimage-stacked-control-plane (exit_code=0) Reimaging k8s control planes of cluster staging-codfw: containerd migration
- 16:54 dzahn@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
- 16:54 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster2005.codfw.wmnet with OS bookworm
- 16:32 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
- 16:28 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
- 16:10 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster2005.codfw.wmnet with OS bookworm
- 16:09 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster2004.codfw.wmnet with OS bookworm
- 15:46 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2004.codfw.wmnet with reason: host reimage
- 15:43 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2004.codfw.wmnet with reason: host reimage
- 15:26 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster2004.codfw.wmnet with OS bookworm
- 15:26 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster2003.codfw.wmnet with OS bookworm
- 15:02 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2003.codfw.wmnet with reason: host reimage
- 14:59 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:58 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2003.codfw.wmnet with reason: host reimage
- 14:57 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
- 14:53 akosiaris@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 14:53 akosiaris@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Removal of old mx records and api.svc records - akosiaris@cumin1002"
- 14:52 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Removal of old mx records and api.svc records - akosiaris@cumin1002"
- 14:48 milimetric@deploy2002: Finished deploy [airflow-dags/analytics@e44bacc]: Deploying updated dumps reconciliation (duration: 00m 31s)
- 14:47 milimetric@deploy2002: Started deploy [airflow-dags/analytics@e44bacc]: Deploying updated dumps reconciliation
- 14:39 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster2003.codfw.wmnet with OS bookworm
- 14:38 jayme@cumin1002: START - Cookbook sre.k8s.reimage-stacked-control-plane Reimaging k8s control planes of cluster staging-codfw: containerd migration
- 14:37 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1013.eqiad.wmnet
- 14:37 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for aqs1013.eqiad.wmnet
- 14:25 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
- 14:09 sergi0: Running `foreachwiki userOptions.php --delete-defaults growthexperiments-homepage-variant` (T374544, T375753)
- 13:47 isaranto@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
- 13:46 isaranto@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
- 13:32 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on aqs1013.eqiad.wmnet with reason: Hardware replacement
- 13:31 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on aqs1013.eqiad.wmnet with reason: Hardware replacement
- 13:22 milimetric@deploy2002: Finished deploy [airflow-dags/analytics@f020959]: Deploying updated dumps reconciliation (duration: 00m 31s)
- 13:22 milimetric@deploy2002: Started deploy [airflow-dags/analytics@f020959]: Deploying updated dumps reconciliation
- 13:03 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
- 12:22 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
- 12:22 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
- 12:22 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
- 12:21 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
- 11:43 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.reimage-stacked-control-plane (exit_code=0) Reimaging k8s control planes of cluster staging-codfw: containerd migration
- 11:43 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster2005.codfw.wmnet with OS bookworm
- 11:31 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbstore1009.eqiad.wmnet
- 11:31 btullis@cumin1002: START - Cookbook sre.hosts.remove-downtime for dbstore1009.eqiad.wmnet
- 11:21 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
- 11:17 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
- 11:00 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster2005.codfw.wmnet with OS bookworm
- 11:00 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
- 10:59 jelto@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
- 10:59 jayme@cumin1002: START - Cookbook sre.k8s.reimage-stacked-control-plane Reimaging k8s control planes of cluster staging-codfw: containerd migration
- 10:58 jayme@cumin1002: conftool action : set/pooled=yes; selector: name=kubestagemaster2005.codfw.wmnet
- 10:39 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.reimage-stacked-control-plane (exit_code=99) Reimaging k8s control planes of cluster staging-codfw: containerd migration
- 10:38 jayme@cumin1002: START - Cookbook sre.k8s.reimage-stacked-control-plane Reimaging k8s control planes of cluster staging-codfw: containerd migration
- 10:37 jayme@cumin1002: conftool action : set/pooled=yes; selector: name=kubestagemaster2005.codfw.wmnet
- 10:37 jayme@cumin1002: conftool action : set/pooled=inactive; selector: name=kubestagemaster2005.codfw.wmnet
- 10:37 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
- 10:26 jelto@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
- 09:47 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 09:45 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 09:45 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 09:43 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 09:42 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 09:41 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 09:36 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: sync
- 09:35 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/proton: sync
- 09:35 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: sync
- 09:33 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: sync
- 09:33 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: sync
- 09:33 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/proton: sync
- 09:14 Dreamy_Jazz: Restarted MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
- 09:11 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
- 09:10 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
- 08:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T367856)', diff saved to https://phabricator.wikimedia.org/P70348 and previous config saved to /var/cache/conftool/dbconfig/20241018-080343-ladsgroup.json
- 08:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 08:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 01:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T376905)', diff saved to https://phabricator.wikimedia.org/P70347 and previous config saved to /var/cache/conftool/dbconfig/20241018-015152-ladsgroup.json
- 01:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P70346 and previous config saved to /var/cache/conftool/dbconfig/20241018-013645-ladsgroup.json
- 01:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P70345 and previous config saved to /var/cache/conftool/dbconfig/20241018-012138-ladsgroup.json
- 01:16 eileen: civicrm upgraded from b0508a22 to 77ea54bc
- 01:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T376905)', diff saved to https://phabricator.wikimedia.org/P70344 and previous config saved to /var/cache/conftool/dbconfig/20241018-010631-ladsgroup.json
- 00:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2238 (T376905)', diff saved to https://phabricator.wikimedia.org/P70343 and previous config saved to /var/cache/conftool/dbconfig/20241018-005819-ladsgroup.json
- 00:58 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
- 00:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
- 00:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T376905)', diff saved to https://phabricator.wikimedia.org/P70342 and previous config saved to /var/cache/conftool/dbconfig/20241018-005752-ladsgroup.json
- 00:43 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 00:43 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove mgmt DNS entries for old frack switches - pt1979@cumin2002"
- 00:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P70341 and previous config saved to /var/cache/conftool/dbconfig/20241018-004245-ladsgroup.json
- 00:42 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove mgmt DNS entries for old frack switches - pt1979@cumin2002"
- 00:38 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 00:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P70340 and previous config saved to /var/cache/conftool/dbconfig/20241018-002738-ladsgroup.json
- 00:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T376905)', diff saved to https://phabricator.wikimedia.org/P70339 and previous config saved to /var/cache/conftool/dbconfig/20241018-001231-ladsgroup.json
- 00:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2225 (T376905)', diff saved to https://phabricator.wikimedia.org/P70338 and previous config saved to /var/cache/conftool/dbconfig/20241018-000422-ladsgroup.json
- 00:04 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
- 00:04 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
- 00:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T376905)', diff saved to https://phabricator.wikimedia.org/P70337 and previous config saved to /var/cache/conftool/dbconfig/20241018-000356-ladsgroup.json
2024-10-17
- 23:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P70336 and previous config saved to /var/cache/conftool/dbconfig/20241017-234849-ladsgroup.json
- 23:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P70335 and previous config saved to /var/cache/conftool/dbconfig/20241017-233342-ladsgroup.json
- 23:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T376905)', diff saved to https://phabricator.wikimedia.org/P70334 and previous config saved to /var/cache/conftool/dbconfig/20241017-231835-ladsgroup.json
- 23:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2207 (T376905)', diff saved to https://phabricator.wikimedia.org/P70333 and previous config saved to /var/cache/conftool/dbconfig/20241017-231037-ladsgroup.json
- 23:10 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
- 23:10 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
- 23:05 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
- 23:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
- 23:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T376905)', diff saved to https://phabricator.wikimedia.org/P70332 and previous config saved to /var/cache/conftool/dbconfig/20241017-230457-ladsgroup.json
- 22:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P70331 and previous config saved to /var/cache/conftool/dbconfig/20241017-224950-ladsgroup.json
- 22:42 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 22:42 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 22:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T376905)', diff saved to https://phabricator.wikimedia.org/P70330 and previous config saved to /var/cache/conftool/dbconfig/20241017-224209-ladsgroup.json
- 22:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P70329 and previous config saved to /var/cache/conftool/dbconfig/20241017-223443-ladsgroup.json
- 22:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P70328 and previous config saved to /var/cache/conftool/dbconfig/20241017-222702-ladsgroup.json
- 22:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T376905)', diff saved to https://phabricator.wikimedia.org/P70327 and previous config saved to /var/cache/conftool/dbconfig/20241017-221936-ladsgroup.json
- 22:15 eileen: civicrm upgraded from f980ace9 to b0508a22
- 22:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P70326 and previous config saved to /var/cache/conftool/dbconfig/20241017-221155-ladsgroup.json
- 22:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T376905)', diff saved to https://phabricator.wikimedia.org/P70325 and previous config saved to /var/cache/conftool/dbconfig/20241017-221123-ladsgroup.json
- 22:11 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
- 22:11 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
- 22:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T376905)', diff saved to https://phabricator.wikimedia.org/P70324 and previous config saved to /var/cache/conftool/dbconfig/20241017-221057-ladsgroup.json
- 21:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T376905)', diff saved to https://phabricator.wikimedia.org/P70323 and previous config saved to /var/cache/conftool/dbconfig/20241017-215648-ladsgroup.json
- 21:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P70322 and previous config saved to /var/cache/conftool/dbconfig/20241017-215550-ladsgroup.json
- 21:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1223 (T376905)', diff saved to https://phabricator.wikimedia.org/P70321 and previous config saved to /var/cache/conftool/dbconfig/20241017-215014-ladsgroup.json
- 21:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1223.eqiad.wmnet with reason: Maintenance
- 21:49 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1223.eqiad.wmnet with reason: Maintenance
- 21:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T376905)', diff saved to https://phabricator.wikimedia.org/P70320 and previous config saved to /var/cache/conftool/dbconfig/20241017-214949-ladsgroup.json
- 21:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P70319 and previous config saved to /var/cache/conftool/dbconfig/20241017-214043-ladsgroup.json
- 21:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P70318 and previous config saved to /var/cache/conftool/dbconfig/20241017-213442-ladsgroup.json
- 21:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T376905)', diff saved to https://phabricator.wikimedia.org/P70317 and previous config saved to /var/cache/conftool/dbconfig/20241017-212536-ladsgroup.json
- 21:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P70316 and previous config saved to /var/cache/conftool/dbconfig/20241017-211935-ladsgroup.json
- 21:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T376905)', diff saved to https://phabricator.wikimedia.org/P70315 and previous config saved to /var/cache/conftool/dbconfig/20241017-211458-ladsgroup.json
- 21:14 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 21:14 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 21:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T376905)', diff saved to https://phabricator.wikimedia.org/P70314 and previous config saved to /var/cache/conftool/dbconfig/20241017-211432-ladsgroup.json
- 21:11 kindrobot: UTC late backport window finished <3
- 21:08 kindrobot: results of de-duping: https://phabricator.wikimedia.org/P70313
- 21:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T376905)', diff saved to https://phabricator.wikimedia.org/P70312 and previous config saved to /var/cache/conftool/dbconfig/20241017-210428-ladsgroup.json
- 21:01 kindrobot: ran mwscript-k8s -f --comment="https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1080078/comments/02a9334e_cd3e7a0e" -- namespaceDupes.php on: bclwikisource, bewwiki, gorwikiquote, iglwiki, kaawiktionary, kgewiki, kuswiki, madwiktionary, moswiki, nrwiki, rskwiki, shnwikinews, and tddwiki
- 20:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P70311 and previous config saved to /var/cache/conftool/dbconfig/20241017-205925-ladsgroup.json
- 20:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T376905)', diff saved to https://phabricator.wikimedia.org/P70310 and previous config saved to /var/cache/conftool/dbconfig/20241017-205655-ladsgroup.json
- 20:57 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 20:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 20:56 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 20:56 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 20:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T376905)', diff saved to https://phabricator.wikimedia.org/P70309 and previous config saved to /var/cache/conftool/dbconfig/20241017-205612-ladsgroup.json
- 20:52 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 20:51 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 20:50 eileen: config revision changed from 150b02a9 to 0d019da0
- 20:50 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 20:50 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 20:49 eileen: config revision changed from 3b3e5cad to 0d019da0
- 20:48 kindrobot@deploy2002: Finished scap sync-world: Backport for Configure namespaces, sitenames, and timezones for new wikis (T377160 T375102 T375017 T375424 T376572 T377088 T374644 T375024 T374815 T375095 T375433 T360303 T363256 T360310) (duration: 31m 15s)
- 20:46 eileen: config revision changed from bf02494d to 3b3e5cad
- 20:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P70308 and previous config saved to /var/cache/conftool/dbconfig/20241017-204418-ladsgroup.json
- 20:43 kindrobot@deploy2002: pppery, kindrobot: Continuing with sync
- 20:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P70307 and previous config saved to /var/cache/conftool/dbconfig/20241017-204105-ladsgroup.json
- 20:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T376905)', diff saved to https://phabricator.wikimedia.org/P70306 and previous config saved to /var/cache/conftool/dbconfig/20241017-202911-ladsgroup.json
- 20:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P70305 and previous config saved to /var/cache/conftool/dbconfig/20241017-202558-ladsgroup.json
- 20:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T376905)', diff saved to https://phabricator.wikimedia.org/P70304 and previous config saved to /var/cache/conftool/dbconfig/20241017-201944-ladsgroup.json
- 20:20 kindrobot@deploy2002: pppery, kindrobot: Backport for Configure namespaces, sitenames, and timezones for new wikis (T377160 T375102 T375017 T375424 T376572 T377088 T374644 T375024 T374815 T375095 T375433 T360303 T363256 T360310) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:20 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 20:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 20:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T376905)', diff saved to https://phabricator.wikimedia.org/P70303 and previous config saved to /var/cache/conftool/dbconfig/20241017-201919-ladsgroup.json
- 20:17 kindrobot@deploy2002: Started scap sync-world: Backport for Configure namespaces, sitenames, and timezones for new wikis (T377160 T375102 T375017 T375424 T376572 T377088 T374644 T375024 T374815 T375095 T375433 T360303 T363256 T360310)
- 20:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T376905)', diff saved to https://phabricator.wikimedia.org/P70302 and previous config saved to /var/cache/conftool/dbconfig/20241017-201051-ladsgroup.json
- 20:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P70301 and previous config saved to /var/cache/conftool/dbconfig/20241017-200412-ladsgroup.json
- 20:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T376905)', diff saved to https://phabricator.wikimedia.org/P70300 and previous config saved to /var/cache/conftool/dbconfig/20241017-200147-ladsgroup.json
- 20:02 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 20:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 20:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T376905)', diff saved to https://phabricator.wikimedia.org/P70299 and previous config saved to /var/cache/conftool/dbconfig/20241017-200122-ladsgroup.json
- 19:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P70298 and previous config saved to /var/cache/conftool/dbconfig/20241017-194905-ladsgroup.json
- 19:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P70297 and previous config saved to /var/cache/conftool/dbconfig/20241017-194615-ladsgroup.json
- 19:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T376905)', diff saved to https://phabricator.wikimedia.org/P70296 and previous config saved to /var/cache/conftool/dbconfig/20241017-193358-ladsgroup.json
- 19:33 swfrench-wmf: ran authdns-update to pick up records for mw-(web|api-ext)-next in svc - T377040
- 19:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P70295 and previous config saved to /var/cache/conftool/dbconfig/20241017-193108-ladsgroup.json
- 19:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T376905)', diff saved to https://phabricator.wikimedia.org/P70294 and previous config saved to /var/cache/conftool/dbconfig/20241017-192424-ladsgroup.json
- 19:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 19:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 19:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
- 19:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
- 19:18 dancy@deploy2002: Finished scap sync-world: testing https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/484 (duration: 02m 46s)
- 19:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T376905)', diff saved to https://phabricator.wikimedia.org/P70293 and previous config saved to /var/cache/conftool/dbconfig/20241017-191601-ladsgroup.json
- 19:15 dancy@deploy2002: Started scap sync-world: testing https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/484
- 19:13 dancy@deploy2002: Installing scap version "4.112.0" for 1 hosts
- 19:07 dancy@deploy2002: Installing scap version "4.112.0" for 210 hosts
- 19:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T376905)', diff saved to https://phabricator.wikimedia.org/P70292 and previous config saved to /var/cache/conftool/dbconfig/20241017-190655-ladsgroup.json
- 19:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 19:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 19:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 19:02 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 18:54 ladsgroup@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 18:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 18:53 ladsgroup@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 18:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 18:49 dancy@deploy2002: Finished scap sync-world: testing scap 4.111.0 (duration: 02m 44s)
- 18:49 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 18:48 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 18:48 urbanecm: mwscript-k8s --comment=T377360 -f -- extensions/Flow/maintenance/FlowFixInconsistentBoards.php --wiki=wikidatawiki # T377360
- 18:47 dancy@deploy2002: Started scap sync-world: testing scap 4.111.0
- 18:45 dancy@deploy2002: Installation of scap version "4.111.0" completed for 210 hosts
- 18:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P70291 and previous config saved to /var/cache/conftool/dbconfig/20241017-184402-arnaudb.json
- 18:41 dancy@deploy2002: Installing scap version "4.111.0" for 210 hosts
- 18:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P70290 and previous config saved to /var/cache/conftool/dbconfig/20241017-182855-arnaudb.json
- 18:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 18:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 18:19 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.43.0-wmf.27 refs T375658
- 18:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2081.codfw.wmnet with OS bullseye
- 18:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P70289 and previous config saved to /var/cache/conftool/dbconfig/20241017-181348-arnaudb.json
- 17:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P70288 and previous config saved to /var/cache/conftool/dbconfig/20241017-175841-arnaudb.json
- 17:56 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 17:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 17:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 17:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 17:34 swfrench@deploy2002: Finished scap sync-world: Testing scap after mw-api-ext / mw-web next release bring up - T377040 (duration: 02m 54s)
- 17:31 swfrench@deploy2002: Started scap sync-world: Testing scap after mw-api-ext / mw-web next release bring up - T377040
- 17:20 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 17:19 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 17:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T376905)', diff saved to https://phabricator.wikimedia.org/P70287 and previous config saved to /var/cache/conftool/dbconfig/20241017-171844-ladsgroup.json
- 17:18 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 17:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 17:17 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 17:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 17:15 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 17:15 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 17:14 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 17:14 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 17:14 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2081.codfw.wmnet with OS bullseye
- 17:13 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 17:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 17:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 17:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 17:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P70286 and previous config saved to /var/cache/conftool/dbconfig/20241017-170337-ladsgroup.json
- 16:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P70285 and previous config saved to /var/cache/conftool/dbconfig/20241017-165814-arnaudb.json
- 16:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 16:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T367781)', diff saved to https://phabricator.wikimedia.org/P70284 and previous config saved to /var/cache/conftool/dbconfig/20241017-165803-arnaudb.json
- 16:55 mutante: phab2002 T377396 - reboot | in addition to /etc/passwd also fix aphlict GID in /etc/group | fixed puppet run which can now create group vcs. now equivalent to prod server phab1004.
- 16:53 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 16:52 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 16:52 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:52 dcausse@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
- 16:51 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:51 dcausse@deploy2002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
- 16:50 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 16:49 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 16:49 mutante: phab2002 T377396 - fix UIDs/GIDs for phab-related system users: vcs: uid 496 -> 497 | aphlict: uid 497 -> uid 496, gid 497 -> gid 496 | chown aphlict:aphlict /var/log/aphlict | chown aphlict:aphlict /run/aphlict
- 16:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P70283 and previous config saved to /var/cache/conftool/dbconfig/20241017-164830-ladsgroup.json
- 16:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P70282 and previous config saved to /var/cache/conftool/dbconfig/20241017-164256-arnaudb.json
- 16:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 16:40 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 16:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 16:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T376905)', diff saved to https://phabricator.wikimedia.org/P70281 and previous config saved to /var/cache/conftool/dbconfig/20241017-163324-ladsgroup.json
- 16:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 16:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P70280 and previous config saved to /var/cache/conftool/dbconfig/20241017-162749-arnaudb.json
- 16:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T367781)', diff saved to https://phabricator.wikimedia.org/P70279 and previous config saved to /var/cache/conftool/dbconfig/20241017-161242-arnaudb.json
- 16:02 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
- 16:01 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
- 16:00 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
- 16:00 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
- 15:59 dcausse@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
- 15:59 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
- 15:59 dcausse@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
- 15:58 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
- 15:58 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 15:58 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 15:57 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
- 15:57 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
- 15:56 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
- 15:56 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
- 15:52 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
- 15:51 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
- 15:51 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
- 15:50 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
- 15:48 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
- 15:48 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
- 15:47 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply
- 15:47 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply
- 15:45 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
- 15:45 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
- 15:44 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/changeprop: apply
- 15:44 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/changeprop: apply
- 15:41 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/changeprop: apply
- 15:40 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/changeprop: apply
- 15:39 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster2005.codfw.wmnet with OS bookworm
- 15:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P70278 and previous config saved to /var/cache/conftool/dbconfig/20241017-153546-ladsgroup.json
- 15:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2219 (T376905)', diff saved to https://phabricator.wikimedia.org/P70277 and previous config saved to /var/cache/conftool/dbconfig/20241017-153257-ladsgroup.json
- 15:33 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 15:33 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 15:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T376905)', diff saved to https://phabricator.wikimedia.org/P70276 and previous config saved to /var/cache/conftool/dbconfig/20241017-153238-ladsgroup.json
- 15:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P70275 and previous config saved to /var/cache/conftool/dbconfig/20241017-152040-ladsgroup.json
- 15:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P70274 and previous config saved to /var/cache/conftool/dbconfig/20241017-151731-ladsgroup.json
- 15:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T367781)', diff saved to https://phabricator.wikimedia.org/P70273 and previous config saved to /var/cache/conftool/dbconfig/20241017-151216-arnaudb.json
- 15:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 15:12 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 15:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P70272 and previous config saved to /var/cache/conftool/dbconfig/20241017-151204-arnaudb.json
- 15:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P70271 and previous config saved to /var/cache/conftool/dbconfig/20241017-150535-ladsgroup.json
- 15:05 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 15:05 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 15:04 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 15:03 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 15:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P70270 and previous config saved to /var/cache/conftool/dbconfig/20241017-150224-ladsgroup.json
- 15:01 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 15:00 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 15:00 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:59 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:57 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:57 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P70269 and previous config saved to /var/cache/conftool/dbconfig/20241017-145657-arnaudb.json
- 14:56 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:56 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:54 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:54 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:54 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:53 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:53 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:52 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:52 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P70268 and previous config saved to /var/cache/conftool/dbconfig/20241017-145030-ladsgroup.json
- 14:51 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:51 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:51 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:50 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:50 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:49 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:49 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:49 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:48 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:48 dcausse@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:47 dcausse@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T376905)', diff saved to https://phabricator.wikimedia.org/P70267 and previous config saved to /var/cache/conftool/dbconfig/20241017-144717-ladsgroup.json
- 14:43 dcausse@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:43 dcausse@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P70266 and previous config saved to /var/cache/conftool/dbconfig/20241017-144150-arnaudb.json
- 14:41 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:40 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:40 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:39 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:38 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 14:38 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 14:31 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
- 14:28 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
- 14:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P70265 and previous config saved to /var/cache/conftool/dbconfig/20241017-142643-arnaudb.json
- 14:09 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster2005.codfw.wmnet with OS bookworm
- 14:08 urbanecm@deploy2002: Finished scap sync-world: Backport for Bump wikimedia/parsoid to 0.20.0-a26 (T377287), Bump wikimedia/parsoid to 0.20.0-a26 (T377287) (duration: 09m 41s)
- 14:03 urbanecm@deploy2002: cscott, urbanecm: Continuing with sync
- 14:00 urbanecm@deploy2002: cscott, urbanecm: Backport for Bump wikimedia/parsoid to 0.20.0-a26 (T377287), Bump wikimedia/parsoid to 0.20.0-a26 (T377287) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:00 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 13:59 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 13:58 urbanecm@deploy2002: Started scap sync-world: Backport for Bump wikimedia/parsoid to 0.20.0-a26 (T377287), Bump wikimedia/parsoid to 0.20.0-a26 (T377287)
- 13:56 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 13:54 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 13:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T376905)', diff saved to https://phabricator.wikimedia.org/P70264 and previous config saved to /var/cache/conftool/dbconfig/20241017-134651-ladsgroup.json
- 13:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 13:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 13:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T376905)', diff saved to https://phabricator.wikimedia.org/P70263 and previous config saved to /var/cache/conftool/dbconfig/20241017-134636-ladsgroup.json
- 13:35 urbanecm@deploy2002: Finished scap sync-world: Backport for Set $wgAllowRawHtmlCopyrightMessages = false (T375789), tests: ensure maintenance base class has always been required (T377391 T357535) (duration: 08m 07s)
- 13:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P70261 and previous config saved to /var/cache/conftool/dbconfig/20241017-133129-ladsgroup.json
- 13:30 urbanecm@deploy2002: cscott, urbanecm, matmarex: Continuing with sync
- 13:29 urbanecm@deploy2002: cscott, urbanecm, matmarex: Backport for Set $wgAllowRawHtmlCopyrightMessages = false (T375789), tests: ensure maintenance base class has always been required (T377391 T357535) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:29 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript updateCollation.php --wiki=cswikivoyage --previous-collation=uppercase # T377446
- 13:27 urbanecm@deploy2002: Started scap sync-world: Backport for Set $wgAllowRawHtmlCopyrightMessages = false (T375789), tests: ensure maintenance base class has always been required (T377391 T357535)
- 13:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P70260 and previous config saved to /var/cache/conftool/dbconfig/20241017-132617-arnaudb.json
- 13:26 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 13:26 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 13:26 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 13:26 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 13:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2204.codfw.wmnet with reason: Maintenance
- 13:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2204.codfw.wmnet with reason: Maintenance
- 13:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 13:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 13:22 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 13:22 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 13:18 inflatador: bking@wdqs1015 depooling to catch up on lag
- 13:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P70258 and previous config saved to /var/cache/conftool/dbconfig/20241017-131622-ladsgroup.json
- 13:14 urbanecm@deploy2002: Finished scap sync-world: Backport for cswikivoyage: Set category collation to uca-cs-u-kn (T377446), QuickSurveys: Update safety survey coverage (T376517) (duration: 07m 23s)
- 13:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T376905)', diff saved to https://phabricator.wikimedia.org/P70257 and previous config saved to /var/cache/conftool/dbconfig/20241017-131012-ladsgroup.json
- 13:10 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 13:10 urbanecm@deploy2002: kharlan, urbanecm: Continuing with sync
- 13:09 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 13:09 urbanecm@deploy2002: kharlan, urbanecm: Backport for cswikivoyage: Set category collation to uca-cs-u-kn (T377446), QuickSurveys: Update safety survey coverage (T376517) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T376905)', diff saved to https://phabricator.wikimedia.org/P70256 and previous config saved to /var/cache/conftool/dbconfig/20241017-130947-ladsgroup.json
- 13:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 13:07 urbanecm@deploy2002: Started scap sync-world: Backport for cswikivoyage: Set category collation to uca-cs-u-kn (T377446), QuickSurveys: Update safety survey coverage (T376517)
- 13:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T376905)', diff saved to https://phabricator.wikimedia.org/P70255 and previous config saved to /var/cache/conftool/dbconfig/20241017-130115-ladsgroup.json
- 13:00 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster2005.codfw.wmnet with OS bookworm
- 12:59 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 12:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2209.codfw.wmnet with reason: Maintenance
- 12:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2209.codfw.wmnet with reason: Maintenance
- 12:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 12:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 12:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P70254 and previous config saved to /var/cache/conftool/dbconfig/20241017-125440-ladsgroup.json
- 12:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P70253 and previous config saved to /var/cache/conftool/dbconfig/20241017-123932-ladsgroup.json
- 12:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T376905)', diff saved to https://phabricator.wikimedia.org/P70252 and previous config saved to /var/cache/conftool/dbconfig/20241017-122425-ladsgroup.json
- 12:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1157 (T376905)', diff saved to https://phabricator.wikimedia.org/P70251 and previous config saved to /var/cache/conftool/dbconfig/20241017-121525-ladsgroup.json
- 12:15 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 12:15 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 12:10 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 12:10 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 12:07 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 12:07 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 12:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T376905)', diff saved to https://phabricator.wikimedia.org/P70250 and previous config saved to /var/cache/conftool/dbconfig/20241017-120049-ladsgroup.json
- 12:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 12:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 12:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 12:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 12:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T376905)', diff saved to https://phabricator.wikimedia.org/P70249 and previous config saved to /var/cache/conftool/dbconfig/20241017-120029-ladsgroup.json
- 11:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P70248 and previous config saved to /var/cache/conftool/dbconfig/20241017-114522-ladsgroup.json
- 11:39 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1177.eqiad.wmnet
- 11:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P70247 and previous config saved to /var/cache/conftool/dbconfig/20241017-113014-ladsgroup.json
- 11:29 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-worker1177.eqiad.wmnet
- 11:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T376905)', diff saved to https://phabricator.wikimedia.org/P70246 and previous config saved to /var/cache/conftool/dbconfig/20241017-111507-ladsgroup.json
- 11:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T376905)', diff saved to https://phabricator.wikimedia.org/P70245 and previous config saved to /var/cache/conftool/dbconfig/20241017-110527-ladsgroup.json
- 11:05 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 11:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 10:58 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 10:58 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 10:17 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on kubestagemaster2005.codfw.wmnet with reason: reimage
- 10:17 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on kubestagemaster2005.codfw.wmnet with reason: reimage
- 09:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 09:34 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host phab2002.codfw.wmnet with OS bullseye
- 09:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 09:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 09:09 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Add support for read-only users - oblivian@cumin1002"
- 09:09 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Add support for read-only users - oblivian@cumin1002
- 09:08 oblivian@cumin1002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Add support for read-only users - oblivian@cumin1002
- 09:08 oblivian@cumin1002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Add support for read-only users - oblivian@cumin1002"
- 09:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 100%: post clone', diff saved to https://phabricator.wikimedia.org/P70243 and previous config saved to /var/cache/conftool/dbconfig/20241017-090731-arnaudb.json
- 08:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 75%: post clone', diff saved to https://phabricator.wikimedia.org/P70242 and previous config saved to /var/cache/conftool/dbconfig/20241017-085226-arnaudb.json
- 08:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 50%: post clone', diff saved to https://phabricator.wikimedia.org/P70241 and previous config saved to /var/cache/conftool/dbconfig/20241017-083721-arnaudb.json
- 08:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 25%: post clone', diff saved to https://phabricator.wikimedia.org/P70240 and previous config saved to /var/cache/conftool/dbconfig/20241017-082215-arnaudb.json
- 08:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2149 to reclone on db2205 - T377276', diff saved to https://phabricator.wikimedia.org/P70239 and previous config saved to /var/cache/conftool/dbconfig/20241017-081822-arnaudb.json
- 08:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 25%: post clone', diff saved to https://phabricator.wikimedia.org/P70238 and previous config saved to /var/cache/conftool/dbconfig/20241017-081802-arnaudb.json
- 08:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2149.codfw.wmnet onto db2205.codfw.wmnet
- 08:11 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
- 08:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
- 07:55 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 07:55 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 07:51 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
- 07:48 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
- 07:37 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 07:37 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 07:37 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 07:36 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 07:28 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster2005.codfw.wmnet with OS bookworm
- 07:19 dcausse@deploy2002: Finished scap sync-world: Backport for cirrus: cleanup removed label_count field on next re-index (T377226) (duration: 10m 40s)
- 07:18 jayme@cumin1002: conftool action : set/pooled=inactive; selector: name=kubestagemaster2005.codfw.wmnet
- 07:14 dcausse@deploy2002: dcausse: Continuing with sync
- 07:13 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on kubestagemaster2005.codfw.wmnet with reason: reimage
- 07:13 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on kubestagemaster2005.codfw.wmnet with reason: reimage
- 07:13 dcausse@deploy2002: dcausse: Backport for cirrus: cleanup removed label_count field on next re-index (T377226) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 07:08 dcausse@deploy2002: Started scap sync-world: Backport for cirrus: cleanup removed label_count field on next re-index (T377226)
- 07:00 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2149.codfw.wmnet onto db2205.codfw.wmnet
- 07:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2149 to reclone on db2205 - T377276', diff saved to https://phabricator.wikimedia.org/P70237 and previous config saved to /var/cache/conftool/dbconfig/20241017-070015-arnaudb.json
- 06:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2205.codfw.wmnet with OS bookworm
- 06:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 100%: T367781', diff saved to https://phabricator.wikimedia.org/P70236 and previous config saved to /var/cache/conftool/dbconfig/20241017-063238-arnaudb.json
- 06:31 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2205.codfw.wmnet with reason: host reimage
- 06:26 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2205.codfw.wmnet with reason: host reimage
- 06:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 75%: T367781', diff saved to https://phabricator.wikimedia.org/P70235 and previous config saved to /var/cache/conftool/dbconfig/20241017-061732-arnaudb.json
- 06:07 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2205.codfw.wmnet with OS bookworm
- 06:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 50%: T367781', diff saved to https://phabricator.wikimedia.org/P70234 and previous config saved to /var/cache/conftool/dbconfig/20241017-060227-arnaudb.json
- 05:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 25%: T367781', diff saved to https://phabricator.wikimedia.org/P70233 and previous config saved to /var/cache/conftool/dbconfig/20241017-054722-arnaudb.json
- 05:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T376905)', diff saved to https://phabricator.wikimedia.org/P70231 and previous config saved to /var/cache/conftool/dbconfig/20241017-051700-ladsgroup.json
- 05:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P70230 and previous config saved to /var/cache/conftool/dbconfig/20241017-050153-ladsgroup.json
- 04:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P70229 and previous config saved to /var/cache/conftool/dbconfig/20241017-044646-ladsgroup.json
- 04:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T376905)', diff saved to https://phabricator.wikimedia.org/P70228 and previous config saved to /var/cache/conftool/dbconfig/20241017-043139-ladsgroup.json
- 04:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2222 (T376905)', diff saved to https://phabricator.wikimedia.org/P70227 and previous config saved to /var/cache/conftool/dbconfig/20241017-042440-ladsgroup.json
- 04:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
- 04:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
- 04:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T376905)', diff saved to https://phabricator.wikimedia.org/P70226 and previous config saved to /var/cache/conftool/dbconfig/20241017-042413-ladsgroup.json
- 04:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P70225 and previous config saved to /var/cache/conftool/dbconfig/20241017-040906-ladsgroup.json
- 03:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P70224 and previous config saved to /var/cache/conftool/dbconfig/20241017-035359-ladsgroup.json
- 03:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T376905)', diff saved to https://phabricator.wikimedia.org/P70223 and previous config saved to /var/cache/conftool/dbconfig/20241017-033852-ladsgroup.json
- 03:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2221 (T376905)', diff saved to https://phabricator.wikimedia.org/P70222 and previous config saved to /var/cache/conftool/dbconfig/20241017-033144-ladsgroup.json
- 03:31 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
- 03:31 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
- 03:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T376905)', diff saved to https://phabricator.wikimedia.org/P70221 and previous config saved to /var/cache/conftool/dbconfig/20241017-033118-ladsgroup.json
- 03:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P70220 and previous config saved to /var/cache/conftool/dbconfig/20241017-031611-ladsgroup.json
- 03:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P70219 and previous config saved to /var/cache/conftool/dbconfig/20241017-030104-ladsgroup.json
- 02:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T376905)', diff saved to https://phabricator.wikimedia.org/P70218 and previous config saved to /var/cache/conftool/dbconfig/20241017-024557-ladsgroup.json
- 02:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2220 (T376905)', diff saved to https://phabricator.wikimedia.org/P70217 and previous config saved to /var/cache/conftool/dbconfig/20241017-023857-ladsgroup.json
- 02:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2220.codfw.wmnet with reason: Maintenance
- 02:38 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2220.codfw.wmnet with reason: Maintenance
- 02:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T376905)', diff saved to https://phabricator.wikimedia.org/P70216 and previous config saved to /var/cache/conftool/dbconfig/20241017-023831-ladsgroup.json
- 02:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P70215 and previous config saved to /var/cache/conftool/dbconfig/20241017-022324-ladsgroup.json
- 02:18 tstarling@deploy2002: Synchronized wmf-config/InitialiseSettings.php: T4085 Enable en on Commons and Meta (duration: 06m 34s)
- 02:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P70214 and previous config saved to /var/cache/conftool/dbconfig/20241017-020817-ladsgroup.json
- 01:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T376905)', diff saved to https://phabricator.wikimedia.org/P70213 and previous config saved to /var/cache/conftool/dbconfig/20241017-015310-ladsgroup.json
- 01:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2208 (T376905)', diff saved to https://phabricator.wikimedia.org/P70212 and previous config saved to /var/cache/conftool/dbconfig/20241017-014500-ladsgroup.json
- 01:44 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
- 01:44 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
- 01:39 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
- 01:39 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
- 01:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T376905)', diff saved to https://phabricator.wikimedia.org/P70211 and previous config saved to /var/cache/conftool/dbconfig/20241017-013926-ladsgroup.json
- 01:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P70210 and previous config saved to /var/cache/conftool/dbconfig/20241017-012419-ladsgroup.json
- 01:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P70209 and previous config saved to /var/cache/conftool/dbconfig/20241017-010912-ladsgroup.json
- 00:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T376905)', diff saved to https://phabricator.wikimedia.org/P70208 and previous config saved to /var/cache/conftool/dbconfig/20241017-005405-ladsgroup.json
- 00:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T376905)', diff saved to https://phabricator.wikimedia.org/P70207 and previous config saved to /var/cache/conftool/dbconfig/20241017-004537-ladsgroup.json
- 00:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 00:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 00:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T376905)', diff saved to https://phabricator.wikimedia.org/P70206 and previous config saved to /var/cache/conftool/dbconfig/20241017-004511-ladsgroup.json
- 00:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P70204 and previous config saved to /var/cache/conftool/dbconfig/20241017-003004-ladsgroup.json
- 00:26 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
- 00:25 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
- 00:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P70203 and previous config saved to /var/cache/conftool/dbconfig/20241017-001457-ladsgroup.json
2024-10-16
- 23:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T376905)', diff saved to https://phabricator.wikimedia.org/P70202 and previous config saved to /var/cache/conftool/dbconfig/20241016-235950-ladsgroup.json
- 23:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T376905)', diff saved to https://phabricator.wikimedia.org/P70201 and previous config saved to /var/cache/conftool/dbconfig/20241016-235129-ladsgroup.json
- 23:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 23:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 23:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T376905)', diff saved to https://phabricator.wikimedia.org/P70200 and previous config saved to /var/cache/conftool/dbconfig/20241016-235102-ladsgroup.json
- 23:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P70199 and previous config saved to /var/cache/conftool/dbconfig/20241016-233555-ladsgroup.json
- 23:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P70198 and previous config saved to /var/cache/conftool/dbconfig/20241016-232048-ladsgroup.json
- 23:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T376905)', diff saved to https://phabricator.wikimedia.org/P70197 and previous config saved to /var/cache/conftool/dbconfig/20241016-230541-ladsgroup.json
- 22:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T376905)', diff saved to https://phabricator.wikimedia.org/P70196 and previous config saved to /var/cache/conftool/dbconfig/20241016-225716-ladsgroup.json
- 22:57 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 22:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 22:57 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 22:56 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 22:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T376905)', diff saved to https://phabricator.wikimedia.org/P70195 and previous config saved to /var/cache/conftool/dbconfig/20241016-225646-ladsgroup.json
- 22:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P70194 and previous config saved to /var/cache/conftool/dbconfig/20241016-224139-ladsgroup.json
- 22:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P70193 and previous config saved to /var/cache/conftool/dbconfig/20241016-222632-ladsgroup.json
- 22:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T376905)', diff saved to https://phabricator.wikimedia.org/P70192 and previous config saved to /var/cache/conftool/dbconfig/20241016-221125-ladsgroup.json
- 22:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T376905)', diff saved to https://phabricator.wikimedia.org/P70191 and previous config saved to /var/cache/conftool/dbconfig/20241016-220053-ladsgroup.json
- 22:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 22:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 21:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 21:17 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 21:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 21:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 20:44 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
- 20:44 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
- 20:43 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich-next: apply
- 20:43 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich-next: apply
- 20:39 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 20:39 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 20:37 gmodena@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 20:37 gmodena@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
- 20:31 brennen@deploy2002: Finished deploy [phabricator/deployment@40a63c9]: deploy phab2002 for T377374 (duration: 00m 08s)
- 20:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T376905)', diff saved to https://phabricator.wikimedia.org/P70189 and previous config saved to /var/cache/conftool/dbconfig/20241016-203034-ladsgroup.json
- 20:30 brennen@deploy2002: Started deploy [phabricator/deployment@40a63c9]: deploy phab2002 for T377374
- 20:29 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 20:29 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 20:26 jhuneidi@deploy2002: Finished scap sync-world: Backport for Make wikitech a target for CentralNotice banners (T377030) (duration: 10m 02s)
- 20:21 jhuneidi@deploy2002: ejegg, jhuneidi: Continuing with sync
- 20:18 jhuneidi@deploy2002: ejegg, jhuneidi: Backport for Make wikitech a target for CentralNotice banners (T377030) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:18 mutante: phab2002 - ln -s /var/lib/scap/scap/bin/scap /usr/bin/scap
- 20:17 mutante: phab2002 - after manually running bootstrap-scap-target.sh and "Scap from local bullseye wheels successfully installed at /var/lib/scap/scap" still "cannot open `/usr/bin/scap' (No such file or directory)" though. T303559 T310740 T377374
- 20:17 jhuneidi@deploy2002: Started scap sync-world: Backport for Make wikitech a target for CentralNotice banners (T377030)
- 20:16 mutante: phab2002 - manually bootstrapping scap since puppet did not do it due to dependency cycles: sudo -u scap /usr/local/bin/bootstrap-scap-target.sh deploy2002.codfw.wmnet /var/lib/scap T303559 T310740 T377374
- 20:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P70188 and previous config saved to /var/cache/conftool/dbconfig/20241016-201527-ladsgroup.json
- 20:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P70187 and previous config saved to /var/cache/conftool/dbconfig/20241016-200020-ladsgroup.json
- 19:54 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mx-out1001.wikimedia.org
- 19:50 jhathaway@cumin1002: START - Cookbook sre.hosts.reboot-single for host mx-out1001.wikimedia.org
- 19:49 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mx-out2001.wikimedia.org
- 19:47 jhathaway@cumin1002: START - Cookbook sre.hosts.reboot-single for host mx-out2001.wikimedia.org
- 19:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T376905)', diff saved to https://phabricator.wikimedia.org/P70186 and previous config saved to /var/cache/conftool/dbconfig/20241016-194513-ladsgroup.json
- 19:47 jhathaway@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host mx-out2001.wikimedia.org
- 19:47 jhathaway@cumin1002: START - Cookbook sre.hosts.reboot-single for host mx-out2001.wikimedia.org
- 19:46 jhathaway@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host mx-out2001.wikimedia.org
- 19:45 jhathaway@cumin1002: START - Cookbook sre.hosts.reboot-single for host mx-out2001.wikimedia.org
- 19:45 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mx1001.wikimedia.org
- 19:44 jhathaway@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:44 jhathaway@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mx1001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jhathaway@cumin1002"
- 19:43 jhathaway@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mx1001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jhathaway@cumin1002"
- 19:42 jhathaway@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host mx-out2001.wikimedia.org
- 19:42 jhathaway@cumin1002: START - Cookbook sre.hosts.reboot-single for host mx-out2001.wikimedia.org
- 19:40 jhathaway@cumin1002: START - Cookbook sre.dns.netbox
- 19:36 jhathaway@cumin1002: START - Cookbook sre.hosts.decommission for hosts mx1001.wikimedia.org
- 19:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2237 (T376905)', diff saved to https://phabricator.wikimedia.org/P70185 and previous config saved to /var/cache/conftool/dbconfig/20241016-193500-ladsgroup.json
- 19:35 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
- 19:34 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
- 19:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T376905)', diff saved to https://phabricator.wikimedia.org/P70184 and previous config saved to /var/cache/conftool/dbconfig/20241016-193433-ladsgroup.json
- 19:30 brennen@deploy2002: Finished deploy [phabricator/deployment@40a63c9]: deploy phab2002 for T377374 (duration: 10m 42s)
- 19:19 brennen@deploy2002: Started deploy [phabricator/deployment@40a63c9]: deploy phab2002 for T377374
- 19:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P70183 and previous config saved to /var/cache/conftool/dbconfig/20241016-191926-ladsgroup.json
- 19:16 inflatador: bking@stat1011 racadm>>racadm jobqueue create BIOS.Setup.1-1 Commit JID = JID_291241139935 T376813
- 19:14 inflatador: bking@stat1011 racadm>>racadm set BIOS.MemSettings.NodeInterleave Enabled T376813
- 19:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P70182 and previous config saved to /var/cache/conftool/dbconfig/20241016-190419-ladsgroup.json
- 18:54 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1177.eqiad.wmnet with OS bullseye
- 18:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T376905)', diff saved to https://phabricator.wikimedia.org/P70181 and previous config saved to /var/cache/conftool/dbconfig/20241016-184912-ladsgroup.json
- 18:47 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mx2001.wikimedia.org
- 18:47 jhathaway@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:46 jhathaway@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mx2001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jhathaway@cumin1002"
- 18:45 jhathaway@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mx2001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jhathaway@cumin1002"
- 18:43 papaul: maintenance on mr1-ulsfo complete
- 18:41 jhathaway@cumin1002: START - Cookbook sre.dns.netbox
- 18:36 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1176.eqiad.wmnet with OS bullseye
- 18:35 jhathaway@cumin1002: START - Cookbook sre.hosts.decommission for hosts mx2001.wikimedia.org
- 18:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab2002.codfw.wmnet with reason: host reimage
- 18:32 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
- 18:32 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
- 18:31 gmodena@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich-next: apply
- 18:31 gmodena@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich-next: apply
- 18:29 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab2002.codfw.wmnet with reason: host reimage
- 18:27 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 18:27 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 18:21 gmodena@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 18:20 gmodena@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
- 18:17 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.43.0-wmf.27 refs T375658
- 18:13 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host phab2002
- 18:13 dzahn@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host phab2002
- 18:13 dzahn@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host phab2002
- 18:12 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) phab2002.codfw.wmnet 54.32.192.10.in-addr.arpa 4.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 18:12 dzahn@cumin2002: START - Cookbook sre.dns.wipe-cache phab2002.codfw.wmnet 54.32.192.10.in-addr.arpa 4.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 18:12 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:12 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host phab2002 - dzahn@cumin2002"
- 18:11 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 18:11 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host phab2002 - dzahn@cumin2002"
- 18:11 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 18:06 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 18:05 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 18:04 cdanis@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 18:02 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 18:01 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 18:00 papaul: ongoing maintenance on mr1-ulsfo
- 18:00 cdanis@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 17:58 dzahn@cumin2002: START - Cookbook sre.hosts.move-vlan for host phab2002
- 17:58 cdanis@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 17:57 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host phab2002.codfw.wmnet with OS bullseye
- 17:56 cdanis@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 17:55 cdanis@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 17:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2219 (T376905)', diff saved to https://phabricator.wikimedia.org/P70179 and previous config saved to /var/cache/conftool/dbconfig/20241016-174847-ladsgroup.json
- 17:49 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 17:49 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 17:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T376905)', diff saved to https://phabricator.wikimedia.org/P70178 and previous config saved to /var/cache/conftool/dbconfig/20241016-174821-ladsgroup.json
- 17:48 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:48 swfrench@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add newly allocated LVS VIPs for mw-web-next and mw-api-ext-next - swfrench@cumin2002"
- 17:41 swfrench@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add newly allocated LVS VIPs for mw-web-next and mw-api-ext-next - swfrench@cumin2002"
- 17:39 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 17:38 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 17:37 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1177.eqiad.wmnet with OS bullseye
- 17:37 swfrench@cumin2002: START - Cookbook sre.dns.netbox
- 17:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P70177 and previous config saved to /var/cache/conftool/dbconfig/20241016-173314-ladsgroup.json
- 17:20 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1176.eqiad.wmnet with OS bullseye
- 17:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P70176 and previous config saved to /var/cache/conftool/dbconfig/20241016-171807-ladsgroup.json
- 17:16 xcollazo@deploy2002: Finished deploy [analytics/refinery@f186c94] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@f186c94a] (duration: 03m 44s)
- 17:13 xcollazo@deploy2002: Started deploy [analytics/refinery@f186c94] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@f186c94a]
- 17:12 xcollazo@deploy2002: Finished deploy [analytics/refinery@f186c94] (thin): Regular analytics weekly train THIN [analytics/refinery@f186c94a] (duration: 05m 11s)
- 17:06 xcollazo@deploy2002: Started deploy [analytics/refinery@f186c94] (thin): Regular analytics weekly train THIN [analytics/refinery@f186c94a]
- 17:06 xcollazo@deploy2002: Finished deploy [analytics/refinery@f186c94]: Regular analytics weekly train [analytics/refinery@f186c94a] (duration: 08m 54s)
- 17:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T376905)', diff saved to https://phabricator.wikimedia.org/P70175 and previous config saved to /var/cache/conftool/dbconfig/20241016-170300-ladsgroup.json
- 16:57 xcollazo@deploy2002: Started deploy [analytics/refinery@f186c94]: Regular analytics weekly train [analytics/refinery@f186c94a]
- 16:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2210 (T376905)', diff saved to https://phabricator.wikimedia.org/P70174 and previous config saved to /var/cache/conftool/dbconfig/20241016-165343-ladsgroup.json
- 16:54 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
- 16:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
- 16:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T376905)', diff saved to https://phabricator.wikimedia.org/P70173 and previous config saved to /var/cache/conftool/dbconfig/20241016-165317-ladsgroup.json
- 16:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P70172 and previous config saved to /var/cache/conftool/dbconfig/20241016-163810-ladsgroup.json
- 16:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P70171 and previous config saved to /var/cache/conftool/dbconfig/20241016-162303-ladsgroup.json
- 16:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T376905)', diff saved to https://phabricator.wikimedia.org/P70170 and previous config saved to /var/cache/conftool/dbconfig/20241016-160756-ladsgroup.json
- 16:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2206 (T376905)', diff saved to https://phabricator.wikimedia.org/P70169 and previous config saved to /var/cache/conftool/dbconfig/20241016-155948-ladsgroup.json
- 15:59 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
- 15:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
- 15:55 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
- 15:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
- 15:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T376905)', diff saved to https://phabricator.wikimedia.org/P70168 and previous config saved to /var/cache/conftool/dbconfig/20241016-155450-ladsgroup.json
- 15:52 papaul: maintenance on mr1-eqsin complete
- 15:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P70167 and previous config saved to /var/cache/conftool/dbconfig/20241016-153943-ladsgroup.json
- 15:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P70166 and previous config saved to /var/cache/conftool/dbconfig/20241016-152436-ladsgroup.json
- 15:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T376905)', diff saved to https://phabricator.wikimedia.org/P70165 and previous config saved to /var/cache/conftool/dbconfig/20241016-150928-ladsgroup.json
- 15:05 papaul: ongoing maintenance on mr1-eqsin
- 14:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 14:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 14:41 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] beta: Lower batch size for reassignMenteesJob (T376124) (duration: 06m 46s)
- 14:35 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] beta: Lower batch size for reassignMenteesJob (T376124)
- 14:25 Lucas_WMDE: UTC afternoon backport+config window done
- 14:24 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Tests: Skip testViewForExistingGlobalTemporaryAccount (T377197), Hard-code LabelCountField::NAME (T377226), Remove LabelCountField (T377226), Drop label_count field (LabelCountField) (T377226), [[gerrit:1080703|Tests: Skip testViewForExistingGlobalTemporaryAccount (T377197)]], Hard-code LabelCountField::NAME (T377226), Remove LabelCountField (T377226), Drop label_count field (LabelCountField) (T377226) (duration: 11m 36s)
- 14:23 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 14:23 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 14:20 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Continuing with sync
- 14:19 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "[not really into teleological thinking] - oblivian@cumin1002"
- 14:19 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: [not really into teleological thinking] - oblivian@cumin1002
- 14:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 14:18 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 14:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T367856)', diff saved to https://phabricator.wikimedia.org/P70164 and previous config saved to /var/cache/conftool/dbconfig/20241016-141819-ladsgroup.json
- 14:18 oblivian@cumin1002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: [not really into teleological thinking] - oblivian@cumin1002
- 14:18 oblivian@cumin1002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "[not really into teleological thinking] - oblivian@cumin1002"
- 14:17 oblivian@cumin1002: END (FAIL) - Cookbook sre.deploy.hiddenparma (exit_code=99) Hiddenparma deployment to the alerting hosts with reason: "[not really into teleological thinking] - oblivian@cumin1002"
- 14:17 oblivian@cumin1002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "[not really into teleological thinking] - oblivian@cumin1002"
- 14:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 14:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 14:15 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Tests: Skip testViewForExistingGlobalTemporaryAccount (T377197), Hard-code LabelCountField::NAME (T377226), Remove LabelCountField (T377226), Drop label_count field (LabelCountField) (T377226), Tests: Skip testViewForExistingGlobalTemporaryAccount (T377197), Hard-code LabelCountField::NAME (T377226), Remove LabelCountField (T377226), Drop label_count field (LabelCountField) (T377226) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:13 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Tests: Skip testViewForExistingGlobalTemporaryAccount (T377197), Hard-code LabelCountField::NAME (T377226), Remove LabelCountField (T377226), Drop label_count field (LabelCountField) (T377226), [[gerrit:1080703|Tests: Skip testViewForExistingGlobalTemporaryAccount (T377197)]], Hard-code LabelCountField::NAME (T377226), Remove LabelCountField (T377226), Drop label_count field (LabelCountField) (T377226)
- 14:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T376905)', diff saved to https://phabricator.wikimedia.org/P70163 and previous config saved to /var/cache/conftool/dbconfig/20241016-140902-ladsgroup.json
- 14:09 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 14:08 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 14:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T376905)', diff saved to https://phabricator.wikimedia.org/P70162 and previous config saved to /var/cache/conftool/dbconfig/20241016-140835-ladsgroup.json
- 14:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P70161 and previous config saved to /var/cache/conftool/dbconfig/20241016-140312-ladsgroup.json
- 13:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P70160 and previous config saved to /var/cache/conftool/dbconfig/20241016-135328-ladsgroup.json
- 13:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P70159 and previous config saved to /var/cache/conftool/dbconfig/20241016-134805-ladsgroup.json
- 13:43 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1177.eqiad.wmnet with OS bullseye
- 13:41 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1176.eqiad.wmnet with OS bullseye
- 13:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P70158 and previous config saved to /var/cache/conftool/dbconfig/20241016-133821-ladsgroup.json
- 13:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T367856)', diff saved to https://phabricator.wikimedia.org/P70157 and previous config saved to /var/cache/conftool/dbconfig/20241016-133257-ladsgroup.json
- 13:25 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Update Z669x references to Z609x (duration: 08m 23s)
- 13:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T376905)', diff saved to https://phabricator.wikimedia.org/P70156 and previous config saved to /var/cache/conftool/dbconfig/20241016-132314-ladsgroup.json
- 13:20 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, jforrester: Continuing with sync
- 13:19 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, jforrester: Backport for Update Z669x references to Z609x synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:16 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Update Z669x references to Z609x
- 13:16 Dreamy_Jazz: Started time limited scan on enwiki - https://wikitech.wikimedia.org/wiki/MediaModeration
- 13:16 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Remove wgGEUseNewImpactModule config (T350077) (duration: 11m 35s)
- 13:11 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, cyndywikime: Continuing with sync
- 13:07 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, cyndywikime: Backport for Remove wgGEUseNewImpactModule config (T350077) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:04 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Remove wgGEUseNewImpactModule config (T350077)
- 12:58 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2200.codfw.wmnet with reason: Maintenance
- 12:58 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2200.codfw.wmnet with reason: Maintenance
- 12:52 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1177.eqiad.wmnet with OS bullseye
- 12:47 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 12:46 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 12:46 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 12:46 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 12:43 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1176.eqiad.wmnet with OS bullseye
- 12:35 stevemunene@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host an-worker1177
- 12:35 stevemunene@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host an-worker1177
- 12:35 stevemunene@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host an-worker1176
- 12:34 stevemunene@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host an-worker1176
- 12:33 Dreamy_Jazz: Restarting MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
- 12:32 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:32 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add newly reassigned an-worker hosts in analytics eqiad - stevemunene@cumin1002"
- 12:32 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add newly reassigned an-worker hosts in analytics eqiad - stevemunene@cumin1002"
- 12:28 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
- 12:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T376905)', diff saved to https://phabricator.wikimedia.org/P70155 and previous config saved to /var/cache/conftool/dbconfig/20241016-122248-ladsgroup.json
- 12:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 12:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 12:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 12:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 12:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T376905)', diff saved to https://phabricator.wikimedia.org/P70154 and previous config saved to /var/cache/conftool/dbconfig/20241016-122206-ladsgroup.json
- 12:15 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 12:14 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 12:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P70153 and previous config saved to /var/cache/conftool/dbconfig/20241016-120659-ladsgroup.json
- 11:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P70152 and previous config saved to /var/cache/conftool/dbconfig/20241016-115152-ladsgroup.json
- 11:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2198.codfw.wmnet with reason: Maintenance
- 11:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T376905)', diff saved to https://phabricator.wikimedia.org/P70150 and previous config saved to /var/cache/conftool/dbconfig/20241016-113645-ladsgroup.json
- 11:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2198.codfw.wmnet with reason: Maintenance
- 11:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T371742)', diff saved to https://phabricator.wikimedia.org/P70149 and previous config saved to /var/cache/conftool/dbconfig/20241016-113639-ladsgroup.json
- 11:29 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 11:28 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 11:26 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
- 11:25 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
- 11:22 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 11:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P70148 and previous config saved to /var/cache/conftool/dbconfig/20241016-112132-ladsgroup.json
- 11:21 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
- 11:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P70147 and previous config saved to /var/cache/conftool/dbconfig/20241016-110625-ladsgroup.json
- 10:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T371742)', diff saved to https://phabricator.wikimedia.org/P70146 and previous config saved to /var/cache/conftool/dbconfig/20241016-105118-ladsgroup.json
- 10:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T376905)', diff saved to https://phabricator.wikimedia.org/P70145 and previous config saved to /var/cache/conftool/dbconfig/20241016-103620-ladsgroup.json
- 10:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 10:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 10:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T376905)', diff saved to https://phabricator.wikimedia.org/P70144 and previous config saved to /var/cache/conftool/dbconfig/20241016-103553-ladsgroup.json
- 10:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P70143 and previous config saved to /var/cache/conftool/dbconfig/20241016-102046-ladsgroup.json
- 10:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P70142 and previous config saved to /var/cache/conftool/dbconfig/20241016-100539-ladsgroup.json
- 09:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T376905)', diff saved to https://phabricator.wikimedia.org/P70141 and previous config saved to /var/cache/conftool/dbconfig/20241016-095032-ladsgroup.json
- 09:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2140 (T376905)', diff saved to https://phabricator.wikimedia.org/P70140 and previous config saved to /var/cache/conftool/dbconfig/20241016-093852-ladsgroup.json
- 09:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
- 09:38 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
- 09:32 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 09:31 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 09:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T376905)', diff saved to https://phabricator.wikimedia.org/P70139 and previous config saved to /var/cache/conftool/dbconfig/20241016-093147-ladsgroup.json
- 09:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T371742)', diff saved to https://phabricator.wikimedia.org/P70138 and previous config saved to /var/cache/conftool/dbconfig/20241016-092219-ladsgroup.json
- 09:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2195.codfw.wmnet with reason: Maintenance
- 09:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2195.codfw.wmnet with reason: Maintenance
- 09:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T371742)', diff saved to https://phabricator.wikimedia.org/P70137 and previous config saved to /var/cache/conftool/dbconfig/20241016-092157-ladsgroup.json
- 09:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P70136 and previous config saved to /var/cache/conftool/dbconfig/20241016-091640-ladsgroup.json
- 09:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P70134 and previous config saved to /var/cache/conftool/dbconfig/20241016-090650-ladsgroup.json
- 09:04 kevinbazira@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
- 09:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P70133 and previous config saved to /var/cache/conftool/dbconfig/20241016-090133-ladsgroup.json
- 08:57 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
- 08:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P70132 and previous config saved to /var/cache/conftool/dbconfig/20241016-085143-ladsgroup.json
- 08:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T376905)', diff saved to https://phabricator.wikimedia.org/P70131 and previous config saved to /var/cache/conftool/dbconfig/20241016-084626-ladsgroup.json
- 08:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2136 (T376905)', diff saved to https://phabricator.wikimedia.org/P70130 and previous config saved to /var/cache/conftool/dbconfig/20241016-083651-ladsgroup.json
- 08:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
- 08:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T371742)', diff saved to https://phabricator.wikimedia.org/P70129 and previous config saved to /var/cache/conftool/dbconfig/20241016-083636-ladsgroup.json
- 08:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
- 08:07 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 08:07 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 08:05 brouberol@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 08:04 brouberol@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 08:03 brouberol@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 08:02 brouberol@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 08:01 brouberol@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 08:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 07:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 07:41 awight: UTC morning deployments done
- 07:40 awight@deploy2002: Finished scap sync-world: Backport for zhwiki: Revise contact page deprecated usage (duration: 09m 07s)
- 07:35 awight@deploy2002: awight, hamishz: Continuing with sync
- 07:34 awight@deploy2002: awight, hamishz: Backport for zhwiki: Revise contact page deprecated usage synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 07:31 awight@deploy2002: Started scap sync-world: Backport for zhwiki: Revise contact page deprecated usage
- 07:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T376905)', diff saved to https://phabricator.wikimedia.org/P70128 and previous config saved to /var/cache/conftool/dbconfig/20241016-072501-ladsgroup.json
- 07:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P70127 and previous config saved to /var/cache/conftool/dbconfig/20241016-070954-ladsgroup.json
- 07:09 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 07:08 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 07:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T371742)', diff saved to https://phabricator.wikimedia.org/P70126 and previous config saved to /var/cache/conftool/dbconfig/20241016-070246-ladsgroup.json
- 07:02 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
- 07:02 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
- 07:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T371742)', diff saved to https://phabricator.wikimedia.org/P70125 and previous config saved to /var/cache/conftool/dbconfig/20241016-070224-ladsgroup.json
- 06:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P70124 and previous config saved to /var/cache/conftool/dbconfig/20241016-065447-ladsgroup.json
- 06:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P70123 and previous config saved to /var/cache/conftool/dbconfig/20241016-064717-ladsgroup.json
- 06:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T376905)', diff saved to https://phabricator.wikimedia.org/P70122 and previous config saved to /var/cache/conftool/dbconfig/20241016-063940-ladsgroup.json
- 06:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P70121 and previous config saved to /var/cache/conftool/dbconfig/20241016-063210-ladsgroup.json
- 06:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1249 (T376905)', diff saved to https://phabricator.wikimedia.org/P70120 and previous config saved to /var/cache/conftool/dbconfig/20241016-063132-ladsgroup.json
- 06:31 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
- 06:31 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
- 06:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T376905)', diff saved to https://phabricator.wikimedia.org/P70119 and previous config saved to /var/cache/conftool/dbconfig/20241016-063107-ladsgroup.json
- 06:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T371742)', diff saved to https://phabricator.wikimedia.org/P70118 and previous config saved to /var/cache/conftool/dbconfig/20241016-061703-ladsgroup.json
- 06:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P70117 and previous config saved to /var/cache/conftool/dbconfig/20241016-061558-ladsgroup.json
- 06:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P70116 and previous config saved to /var/cache/conftool/dbconfig/20241016-060051-ladsgroup.json
- 05:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T376905)', diff saved to https://phabricator.wikimedia.org/P70115 and previous config saved to /var/cache/conftool/dbconfig/20241016-054544-ladsgroup.json
- 05:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1248 (T376905)', diff saved to https://phabricator.wikimedia.org/P70114 and previous config saved to /var/cache/conftool/dbconfig/20241016-053943-ladsgroup.json
- 05:39 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
- 05:39 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
- 05:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T376905)', diff saved to https://phabricator.wikimedia.org/P70113 and previous config saved to /var/cache/conftool/dbconfig/20241016-053918-ladsgroup.json
- 05:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P70112 and previous config saved to /var/cache/conftool/dbconfig/20241016-052411-ladsgroup.json
- 05:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P70111 and previous config saved to /var/cache/conftool/dbconfig/20241016-050904-ladsgroup.json
- 04:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T376905)', diff saved to https://phabricator.wikimedia.org/P70110 and previous config saved to /var/cache/conftool/dbconfig/20241016-045356-ladsgroup.json
- 04:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1247 (T376905)', diff saved to https://phabricator.wikimedia.org/P70109 and previous config saved to /var/cache/conftool/dbconfig/20241016-044657-ladsgroup.json
- 04:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
- 04:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
- 04:42 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 04:42 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 04:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T376905)', diff saved to https://phabricator.wikimedia.org/P70108 and previous config saved to /var/cache/conftool/dbconfig/20241016-044204-ladsgroup.json
- 04:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2167 (T371742)', diff saved to https://phabricator.wikimedia.org/P70107 and previous config saved to /var/cache/conftool/dbconfig/20241016-043757-ladsgroup.json
- 04:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 04:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 04:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T371742)', diff saved to https://phabricator.wikimedia.org/P70106 and previous config saved to /var/cache/conftool/dbconfig/20241016-043734-ladsgroup.json
- 04:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P70105 and previous config saved to /var/cache/conftool/dbconfig/20241016-042657-ladsgroup.json
- 04:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P70104 and previous config saved to /var/cache/conftool/dbconfig/20241016-042227-ladsgroup.json
- 04:22 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 04:21 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for new frack devices - pt1979@cumin2002"
- 04:21 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for new frack devices - pt1979@cumin2002"
- 04:18 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 04:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P70103 and previous config saved to /var/cache/conftool/dbconfig/20241016-041150-ladsgroup.json
- 04:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P70102 and previous config saved to /var/cache/conftool/dbconfig/20241016-040721-ladsgroup.json
- 04:05 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 04:05 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for new frack devices - pt1979@cumin2002"
- 04:05 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for new frack devices - pt1979@cumin2002"
- 04:01 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 03:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T376905)', diff saved to https://phabricator.wikimedia.org/P70101 and previous config saved to /var/cache/conftool/dbconfig/20241016-035643-ladsgroup.json
- 03:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T371742)', diff saved to https://phabricator.wikimedia.org/P70100 and previous config saved to /var/cache/conftool/dbconfig/20241016-035214-ladsgroup.json
- 03:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1244 (T376905)', diff saved to https://phabricator.wikimedia.org/P70099 and previous config saved to /var/cache/conftool/dbconfig/20241016-034932-ladsgroup.json
- 03:49 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
- 03:49 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
- 03:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T376905)', diff saved to https://phabricator.wikimedia.org/P70098 and previous config saved to /var/cache/conftool/dbconfig/20241016-034907-ladsgroup.json
- 03:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P70097 and previous config saved to /var/cache/conftool/dbconfig/20241016-033400-ladsgroup.json
- 03:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P70096 and previous config saved to /var/cache/conftool/dbconfig/20241016-031852-ladsgroup.json
- 03:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T376905)', diff saved to https://phabricator.wikimedia.org/P70095 and previous config saved to /var/cache/conftool/dbconfig/20241016-030345-ladsgroup.json
- 02:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1243 (T376905)', diff saved to https://phabricator.wikimedia.org/P70094 and previous config saved to /var/cache/conftool/dbconfig/20241016-025633-ladsgroup.json
- 02:56 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 02:56 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 02:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T376905)', diff saved to https://phabricator.wikimedia.org/P70093 and previous config saved to /var/cache/conftool/dbconfig/20241016-025608-ladsgroup.json
- 02:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P70092 and previous config saved to /var/cache/conftool/dbconfig/20241016-024101-ladsgroup.json
- 02:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P70091 and previous config saved to /var/cache/conftool/dbconfig/20241016-022554-ladsgroup.json
- 02:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T371742)', diff saved to https://phabricator.wikimedia.org/P70090 and previous config saved to /var/cache/conftool/dbconfig/20241016-021358-ladsgroup.json
- 02:13 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
- 02:13 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
- 02:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T371742)', diff saved to https://phabricator.wikimedia.org/P70089 and previous config saved to /var/cache/conftool/dbconfig/20241016-021347-ladsgroup.json
- 02:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T376905)', diff saved to https://phabricator.wikimedia.org/P70088 and previous config saved to /var/cache/conftool/dbconfig/20241016-021047-ladsgroup.json
- 02:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1242 (T376905)', diff saved to https://phabricator.wikimedia.org/P70087 and previous config saved to /var/cache/conftool/dbconfig/20241016-020333-ladsgroup.json
- 02:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
- 02:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
- 02:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T376905)', diff saved to https://phabricator.wikimedia.org/P70086 and previous config saved to /var/cache/conftool/dbconfig/20241016-020308-ladsgroup.json
- 01:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P70085 and previous config saved to /var/cache/conftool/dbconfig/20241016-015840-ladsgroup.json
- 01:50 eileen: tools upgraded from 62f2d170 to 68f64e43
- 01:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P70084 and previous config saved to /var/cache/conftool/dbconfig/20241016-014801-ladsgroup.json
- 01:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P70083 and previous config saved to /var/cache/conftool/dbconfig/20241016-014333-ladsgroup.json
- 01:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P70082 and previous config saved to /var/cache/conftool/dbconfig/20241016-013254-ladsgroup.json
- 01:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T371742)', diff saved to https://phabricator.wikimedia.org/P70081 and previous config saved to /var/cache/conftool/dbconfig/20241016-012826-ladsgroup.json
- 01:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T376905)', diff saved to https://phabricator.wikimedia.org/P70080 and previous config saved to /var/cache/conftool/dbconfig/20241016-011747-ladsgroup.json
- 01:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1241 (T376905)', diff saved to https://phabricator.wikimedia.org/P70079 and previous config saved to /var/cache/conftool/dbconfig/20241016-011036-ladsgroup.json
- 01:10 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
- 01:10 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
- 01:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T376905)', diff saved to https://phabricator.wikimedia.org/P70078 and previous config saved to /var/cache/conftool/dbconfig/20241016-011010-ladsgroup.json
- 00:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P70077 and previous config saved to /var/cache/conftool/dbconfig/20241016-005500-ladsgroup.json
- 00:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P70076 and previous config saved to /var/cache/conftool/dbconfig/20241016-003953-ladsgroup.json
- 00:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T376905)', diff saved to https://phabricator.wikimedia.org/P70075 and previous config saved to /var/cache/conftool/dbconfig/20241016-002446-ladsgroup.json
- 00:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1238 (T376905)', diff saved to https://phabricator.wikimedia.org/P70074 and previous config saved to /var/cache/conftool/dbconfig/20241016-001629-ladsgroup.json
- 00:16 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
- 00:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
- 00:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T376905)', diff saved to https://phabricator.wikimedia.org/P70073 and previous config saved to /var/cache/conftool/dbconfig/20241016-001604-ladsgroup.json
- 00:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P70072 and previous config saved to /var/cache/conftool/dbconfig/20241016-000057-ladsgroup.json
2024-10-15
- 23:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T371742)', diff saved to https://phabricator.wikimedia.org/P70071 and previous config saved to /var/cache/conftool/dbconfig/20241015-235055-ladsgroup.json
- 23:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 23:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 23:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
- 23:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
- 23:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T371742)', diff saved to https://phabricator.wikimedia.org/P70070 and previous config saved to /var/cache/conftool/dbconfig/20241015-235017-ladsgroup.json
- 23:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P70069 and previous config saved to /var/cache/conftool/dbconfig/20241015-234550-ladsgroup.json
- 23:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P70068 and previous config saved to /var/cache/conftool/dbconfig/20241015-233510-ladsgroup.json
- 23:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T376905)', diff saved to https://phabricator.wikimedia.org/P70067 and previous config saved to /var/cache/conftool/dbconfig/20241015-233043-ladsgroup.json
- 23:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1221 (T376905)', diff saved to https://phabricator.wikimedia.org/P70066 and previous config saved to /var/cache/conftool/dbconfig/20241015-232456-ladsgroup.json
- 23:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 23:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 23:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 23:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 23:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T376905)', diff saved to https://phabricator.wikimedia.org/P70065 and previous config saved to /var/cache/conftool/dbconfig/20241015-232423-ladsgroup.json
- 23:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P70064 and previous config saved to /var/cache/conftool/dbconfig/20241015-232003-ladsgroup.json
- 23:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P70063 and previous config saved to /var/cache/conftool/dbconfig/20241015-230916-ladsgroup.json
- 23:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T371742)', diff saved to https://phabricator.wikimedia.org/P70062 and previous config saved to /var/cache/conftool/dbconfig/20241015-230456-ladsgroup.json
- 22:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P70061 and previous config saved to /var/cache/conftool/dbconfig/20241015-225409-ladsgroup.json
- 22:48 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 22:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T376905)', diff saved to https://phabricator.wikimedia.org/P70060 and previous config saved to /var/cache/conftool/dbconfig/20241015-223902-ladsgroup.json
- 22:38 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 22:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1199 (T376905)', diff saved to https://phabricator.wikimedia.org/P70059 and previous config saved to /var/cache/conftool/dbconfig/20241015-222936-ladsgroup.json
- 22:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
- 22:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
- 22:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T376905)', diff saved to https://phabricator.wikimedia.org/P70058 and previous config saved to /var/cache/conftool/dbconfig/20241015-222911-ladsgroup.json
- 22:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 22:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 22:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 22:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 22:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P70057 and previous config saved to /var/cache/conftool/dbconfig/20241015-221404-ladsgroup.json
- 22:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T370903)', diff saved to https://phabricator.wikimedia.org/P70056 and previous config saved to /var/cache/conftool/dbconfig/20241015-221356-ladsgroup.json
- 22:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P70055 and previous config saved to /var/cache/conftool/dbconfig/20241015-220316-ladsgroup.json
- 21:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P70054 and previous config saved to /var/cache/conftool/dbconfig/20241015-215857-ladsgroup.json
- 21:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P70053 and previous config saved to /var/cache/conftool/dbconfig/20241015-215849-ladsgroup.json
- 21:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2083.codfw.wmnet with OS bullseye
- 21:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P70052 and previous config saved to /var/cache/conftool/dbconfig/20241015-214811-ladsgroup.json
- 21:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T376905)', diff saved to https://phabricator.wikimedia.org/P70051 and previous config saved to /var/cache/conftool/dbconfig/20241015-214350-ladsgroup.json
- 21:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P70050 and previous config saved to /var/cache/conftool/dbconfig/20241015-214342-ladsgroup.json
- 21:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1190 (T376905)', diff saved to https://phabricator.wikimedia.org/P70049 and previous config saved to /var/cache/conftool/dbconfig/20241015-213423-ladsgroup.json
- 21:34 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
- 21:34 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
- 21:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P70048 and previous config saved to /var/cache/conftool/dbconfig/20241015-213305-ladsgroup.json
- 21:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T371742)', diff saved to https://phabricator.wikimedia.org/P70047 and previous config saved to /var/cache/conftool/dbconfig/20241015-213227-ladsgroup.json
- 21:32 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
- 21:32 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
- 21:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T371742)', diff saved to https://phabricator.wikimedia.org/P70046 and previous config saved to /var/cache/conftool/dbconfig/20241015-213203-ladsgroup.json
- 21:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 21:28 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 21:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T370903)', diff saved to https://phabricator.wikimedia.org/P70045 and previous config saved to /var/cache/conftool/dbconfig/20241015-212835-ladsgroup.json
- 21:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2205.codfw.wmnet with reason: Sad
- 21:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2205.codfw.wmnet with reason: Sad
- 21:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T370903)', diff saved to https://phabricator.wikimedia.org/P70044 and previous config saved to /var/cache/conftool/dbconfig/20241015-212431-ladsgroup.json
- 21:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 21:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 21:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P70043 and previous config saved to /var/cache/conftool/dbconfig/20241015-211800-ladsgroup.json
- 21:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P70042 and previous config saved to /var/cache/conftool/dbconfig/20241015-211656-ladsgroup.json
- 21:04 cjming: end of UTC late backport window
- 21:04 cjming@deploy2002: Finished scap sync-world: Backport for SkinComponentCopyright: Fix message existence check for history-copyright (T45646) (duration: 06m 51s)
- 21:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P70041 and previous config saved to /var/cache/conftool/dbconfig/20241015-210149-ladsgroup.json
- 20:59 cjming@deploy2002: cjming, matmarex: Continuing with sync
- 20:59 cjming@deploy2002: cjming, matmarex: Backport for SkinComponentCopyright: Fix message existence check for history-copyright (T45646) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:57 ladsgroup@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2194.codfw.wmnet onto db2205.codfw.wmnet
- 20:57 cjming@deploy2002: Started scap sync-world: Backport for SkinComponentCopyright: Fix message existence check for history-copyright (T45646)
- 20:56 cjming@deploy2002: Finished scap sync-world: Backport for Redirect all namespace-in-Wikipedia cases to Wikipedia (T376923) (duration: 12m 33s)
- 20:51 cjming@deploy2002: cjming, pppery: Continuing with sync
- 20:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T371742)', diff saved to https://phabricator.wikimedia.org/P70040 and previous config saved to /var/cache/conftool/dbconfig/20241015-204642-ladsgroup.json
- 20:46 cjming@deploy2002: cjming, pppery: Backport for Redirect all namespace-in-Wikipedia cases to Wikipedia (T376923) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:43 cjming@deploy2002: Started scap sync-world: Backport for Redirect all namespace-in-Wikipedia cases to Wikipedia (T376923)
- 20:42 cjming@deploy2002: Finished scap sync-world: Backport for Missing.php: Improve detection of interwikis in certain cases (T363538) (duration: 08m 50s)
- 20:37 cjming@deploy2002: cjming, pppery: Continuing with sync
- 20:35 cjming@deploy2002: cjming, pppery: Backport for Missing.php: Improve detection of interwikis in certain cases (T363538) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:33 cjming@deploy2002: Started scap sync-world: Backport for Missing.php: Improve detection of interwikis in certain cases (T363538)
- 20:31 cjming@deploy2002: Finished scap sync-world: Backport for contactpages: Move stewards contactpage to MetaContactPages.php (duration: 10m 56s)
- 20:31 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2083.codfw.wmnet with OS bullseye
- 20:27 cjming@deploy2002: ammarpad, cjming: Continuing with sync
- 20:23 cjming@deploy2002: ammarpad, cjming: Backport for contactpages: Move stewards contactpage to MetaContactPages.php synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:20 cjming@deploy2002: Started scap sync-world: Backport for contactpages: Move stewards contactpage to MetaContactPages.php
- 20:16 cjming@deploy2002: Finished scap sync-world: Backport for Remove legacy UI actions tracking (T376065) (duration: 12m 28s)
- 20:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2083.codfw.wmnet with OS bullseye
- 20:12 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 20:12 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 20:11 cjming@deploy2002: ksarabia, cjming: Continuing with sync
- 20:11 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 20:10 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 20:10 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 20:09 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 20:09 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2082.codfw.wmnet with OS bullseye
- 20:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2081.codfw.wmnet with OS bullseye
- 20:07 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 20:07 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 20:06 cjming@deploy2002: ksarabia, cjming: Backport for Remove legacy UI actions tracking (T376065) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:05 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 20:04 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 20:04 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 20:03 cjming@deploy2002: Started scap sync-world: Backport for Remove legacy UI actions tracking (T376065)
- 20:03 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 20:02 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 20:01 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 20:00 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 19:59 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 19:56 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 19:56 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 19:16 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.27 refs T375658
- 19:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T371742)', diff saved to https://phabricator.wikimedia.org/P70039 and previous config saved to /var/cache/conftool/dbconfig/20241015-191345-ladsgroup.json
- 19:14 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2162.codfw.wmnet with reason: Maintenance
- 19:13 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2162.codfw.wmnet with reason: Maintenance
- 19:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T371742)', diff saved to https://phabricator.wikimedia.org/P70038 and previous config saved to /var/cache/conftool/dbconfig/20241015-191322-ladsgroup.json
- 19:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T367781)', diff saved to https://phabricator.wikimedia.org/P70037 and previous config saved to /var/cache/conftool/dbconfig/20241015-190231-arnaudb.json
- 18:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P70036 and previous config saved to /var/cache/conftool/dbconfig/20241015-185814-ladsgroup.json
- 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2083.codfw.wmnet with OS bullseye
- 18:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2082.codfw.wmnet with OS bullseye
- 18:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2081.codfw.wmnet with OS bullseye
- 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:50 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:49 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:48 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P70035 and previous config saved to /var/cache/conftool/dbconfig/20241015-184724-arnaudb.json
- 18:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P70034 and previous config saved to /var/cache/conftool/dbconfig/20241015-184307-ladsgroup.json
- 18:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2083.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:40 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2082.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:40 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2081.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2082
- 18:38 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2081
- 18:38 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2083
- 18:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2083
- 18:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2082
- 18:36 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2081
- 18:36 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:35 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2081-3 to codfw - jhancock@cumin2002"
- 18:34 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2081-3 to codfw - jhancock@cumin2002"
- 18:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P70033 and previous config saved to /var/cache/conftool/dbconfig/20241015-183218-arnaudb.json
- 18:31 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 18:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T371742)', diff saved to https://phabricator.wikimedia.org/P70032 and previous config saved to /var/cache/conftool/dbconfig/20241015-182800-ladsgroup.json
- 18:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T376905)', diff saved to https://phabricator.wikimedia.org/P70031 and previous config saved to /var/cache/conftool/dbconfig/20241015-181930-ladsgroup.json
- 18:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T367781)', diff saved to https://phabricator.wikimedia.org/P70030 and previous config saved to /var/cache/conftool/dbconfig/20241015-181711-arnaudb.json
- 18:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2216 (T367781)', diff saved to https://phabricator.wikimedia.org/P70029 and previous config saved to /var/cache/conftool/dbconfig/20241015-181455-arnaudb.json
- 18:15 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2216.codfw.wmnet with reason: Maintenance
- 18:15 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2216.codfw.wmnet with reason: Maintenance
- 18:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T367781)', diff saved to https://phabricator.wikimedia.org/P70028 and previous config saved to /var/cache/conftool/dbconfig/20241015-181433-arnaudb.json
- 18:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P70027 and previous config saved to /var/cache/conftool/dbconfig/20241015-180423-ladsgroup.json
- 17:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P70026 and previous config saved to /var/cache/conftool/dbconfig/20241015-175926-arnaudb.json
- 17:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P70025 and previous config saved to /var/cache/conftool/dbconfig/20241015-174916-ladsgroup.json
- 17:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P70024 and previous config saved to /var/cache/conftool/dbconfig/20241015-174419-arnaudb.json
- 17:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T376905)', diff saved to https://phabricator.wikimedia.org/P70023 and previous config saved to /var/cache/conftool/dbconfig/20241015-173409-ladsgroup.json
- 17:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T367781)', diff saved to https://phabricator.wikimedia.org/P70022 and previous config saved to /var/cache/conftool/dbconfig/20241015-172912-arnaudb.json
- 17:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T376905)', diff saved to https://phabricator.wikimedia.org/P70021 and previous config saved to /var/cache/conftool/dbconfig/20241015-172714-ladsgroup.json
- 17:28 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1236.eqiad.wmnet with reason: Maintenance
- 17:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2212 (T367781)', diff saved to https://phabricator.wikimedia.org/P70020 and previous config saved to /var/cache/conftool/dbconfig/20241015-172657-arnaudb.json
- 17:28 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1236.eqiad.wmnet with reason: Maintenance
- 17:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2212.codfw.wmnet with reason: Maintenance
- 17:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T376905)', diff saved to https://phabricator.wikimedia.org/P70019 and previous config saved to /var/cache/conftool/dbconfig/20241015-172648-ladsgroup.json
- 17:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2212.codfw.wmnet with reason: Maintenance
- 17:26 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2202.codfw.wmnet with reason: Maintenance
- 17:26 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2202.codfw.wmnet with reason: Maintenance
- 17:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T367781)', diff saved to https://phabricator.wikimedia.org/P70018 and previous config saved to /var/cache/conftool/dbconfig/20241015-172610-arnaudb.json
- 17:13 swfrench@deploy2002: Finished scap sync-world: Testing scap after mediawiki-deployments.yaml format change - T370934 (duration: 02m 47s)
- 17:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P70017 and previous config saved to /var/cache/conftool/dbconfig/20241015-171141-ladsgroup.json
- 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P70016 and previous config saved to /var/cache/conftool/dbconfig/20241015-171103-arnaudb.json
- 17:10 swfrench@deploy2002: Started scap sync-world: Testing scap after mediawiki-deployments.yaml format change - T370934
- 16:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P70015 and previous config saved to /var/cache/conftool/dbconfig/20241015-165634-ladsgroup.json
- 16:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T371742)', diff saved to https://phabricator.wikimedia.org/P70014 and previous config saved to /var/cache/conftool/dbconfig/20241015-165608-ladsgroup.json
- 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P70013 and previous config saved to /var/cache/conftool/dbconfig/20241015-165556-arnaudb.json
- 16:56 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 16:56 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 16:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T371742)', diff saved to https://phabricator.wikimedia.org/P70012 and previous config saved to /var/cache/conftool/dbconfig/20241015-165539-ladsgroup.json
- 16:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T376905)', diff saved to https://phabricator.wikimedia.org/P70011 and previous config saved to /var/cache/conftool/dbconfig/20241015-164127-ladsgroup.json
- 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T367781)', diff saved to https://phabricator.wikimedia.org/P70010 and previous config saved to /var/cache/conftool/dbconfig/20241015-164050-arnaudb.json
- 16:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P70009 and previous config saved to /var/cache/conftool/dbconfig/20241015-164032-ladsgroup.json
- 16:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T367781)', diff saved to https://phabricator.wikimedia.org/P70008 and previous config saved to /var/cache/conftool/dbconfig/20241015-163834-arnaudb.json
- 16:39 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2188.codfw.wmnet with reason: Maintenance
- 16:38 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2188.codfw.wmnet with reason: Maintenance
- 16:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T367781)', diff saved to https://phabricator.wikimedia.org/P70007 and previous config saved to /var/cache/conftool/dbconfig/20241015-163812-arnaudb.json
- 16:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T376905)', diff saved to https://phabricator.wikimedia.org/P70006 and previous config saved to /var/cache/conftool/dbconfig/20241015-163419-ladsgroup.json
- 16:35 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 16:34 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 16:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T376905)', diff saved to https://phabricator.wikimedia.org/P70005 and previous config saved to /var/cache/conftool/dbconfig/20241015-163404-ladsgroup.json
- 16:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P70004 and previous config saved to /var/cache/conftool/dbconfig/20241015-162525-ladsgroup.json
- 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P70003 and previous config saved to /var/cache/conftool/dbconfig/20241015-162305-arnaudb.json
- 16:21 ladsgroup@cumin1002: START - Cookbook sre.mysql.clone of db2194.codfw.wmnet onto db2205.codfw.wmnet
- 16:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depool for reclone (T375652)', diff saved to https://phabricator.wikimedia.org/P70002 and previous config saved to /var/cache/conftool/dbconfig/20241015-161934-ladsgroup.json
- 16:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P70001 and previous config saved to /var/cache/conftool/dbconfig/20241015-161858-ladsgroup.json
- 16:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T371742)', diff saved to https://phabricator.wikimedia.org/P70000 and previous config saved to /var/cache/conftool/dbconfig/20241015-161018-ladsgroup.json
- 16:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P69999 and previous config saved to /var/cache/conftool/dbconfig/20241015-160758-arnaudb.json
- 16:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P69998 and previous config saved to /var/cache/conftool/dbconfig/20241015-160351-ladsgroup.json
- 16:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depool db2205 T377164', diff saved to https://phabricator.wikimedia.org/P69997 and previous config saved to /var/cache/conftool/dbconfig/20241015-160106-ladsgroup.json
- 15:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T367781)', diff saved to https://phabricator.wikimedia.org/P69996 and previous config saved to /var/cache/conftool/dbconfig/20241015-155251-arnaudb.json
- 15:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Promote db2209 to s3 primary and set section read-write T377164', diff saved to https://phabricator.wikimedia.org/P69995 and previous config saved to /var/cache/conftool/dbconfig/20241015-155240-ladsgroup.json
- 15:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T376905)', diff saved to https://phabricator.wikimedia.org/P69994 and previous config saved to /var/cache/conftool/dbconfig/20241015-154844-ladsgroup.json
- 15:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Set s3 codfw as read-only for maintenance - T377164', diff saved to https://phabricator.wikimedia.org/P69993 and previous config saved to /var/cache/conftool/dbconfig/20241015-154834-ladsgroup.json
- 15:48 Amir1: Starting s3 codfw failover from db2205 to db2209 - T377164
- 15:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T367781)', diff saved to https://phabricator.wikimedia.org/P69992 and previous config saved to /var/cache/conftool/dbconfig/20241015-154318-arnaudb.json
- 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 15:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 15:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T367781)', diff saved to https://phabricator.wikimedia.org/P69991 and previous config saved to /var/cache/conftool/dbconfig/20241015-154256-arnaudb.json
- 15:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Set db2209 with weight 0 T377164', diff saved to https://phabricator.wikimedia.org/P69990 and previous config saved to /var/cache/conftool/dbconfig/20241015-154228-ladsgroup.json
- 15:43 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Primary switchover s3 T377164
- 15:42 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Primary switchover s3 T377164
- 15:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T376905)', diff saved to https://phabricator.wikimedia.org/P69989 and previous config saved to /var/cache/conftool/dbconfig/20241015-154027-ladsgroup.json
- 15:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
- 15:40 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
- 15:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T376905)', diff saved to https://phabricator.wikimedia.org/P69988 and previous config saved to /var/cache/conftool/dbconfig/20241015-154002-ladsgroup.json
- 15:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P69987 and previous config saved to /var/cache/conftool/dbconfig/20241015-152749-arnaudb.json
- 15:26 akosiaris: run gnt-cluster verify-disks after ganeti1034 forceful reboot
- 15:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P69986 and previous config saved to /var/cache/conftool/dbconfig/20241015-152456-ladsgroup.json
- 15:22 volans: force-rebooting ganeti1034 stuck due to drbd traces via mgmt
- 15:19 akosiaris@cumin1002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1034.eqiad.wmnet
- 15:17 akosiaris: drain ganeti1034 of VMs, hardware might be misbehaving
- 15:16 akosiaris@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
- 15:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P69985 and previous config saved to /var/cache/conftool/dbconfig/20241015-151243-arnaudb.json
- 15:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P69984 and previous config saved to /var/cache/conftool/dbconfig/20241015-150948-ladsgroup.json
- 14:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T367781)', diff saved to https://phabricator.wikimedia.org/P69983 and previous config saved to /var/cache/conftool/dbconfig/20241015-145734-arnaudb.json
- 14:56 herron@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet
- 14:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T367781)', diff saved to https://phabricator.wikimedia.org/P69982 and previous config saved to /var/cache/conftool/dbconfig/20241015-145517-arnaudb.json
- 14:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 14:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 14:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T367781)', diff saved to https://phabricator.wikimedia.org/P69981 and previous config saved to /var/cache/conftool/dbconfig/20241015-145453-arnaudb.json
- 14:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T376905)', diff saved to https://phabricator.wikimedia.org/P69980 and previous config saved to /var/cache/conftool/dbconfig/20241015-145441-ladsgroup.json
- 14:48 herron@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet
- 14:47 herron@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet
- 14:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T376905)', diff saved to https://phabricator.wikimedia.org/P69979 and previous config saved to /var/cache/conftool/dbconfig/20241015-144631-ladsgroup.json
- 14:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
- 14:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
- 14:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T376905)', diff saved to https://phabricator.wikimedia.org/P69978 and previous config saved to /var/cache/conftool/dbconfig/20241015-144606-ladsgroup.json
- 14:45 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 02m 24s)
- 14:43 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 06m 46s)
- 14:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P69977 and previous config saved to /var/cache/conftool/dbconfig/20241015-143946-arnaudb.json
- 14:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T371742)', diff saved to https://phabricator.wikimedia.org/P69976 and previous config saved to /var/cache/conftool/dbconfig/20241015-143803-ladsgroup.json
- 14:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
- 14:38 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
- 14:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T371742)', diff saved to https://phabricator.wikimedia.org/P69975 and previous config saved to /var/cache/conftool/dbconfig/20241015-143740-ladsgroup.json
- 14:36 herron@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet
- 14:35 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
- 14:33 herron@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet
- 14:31 brouberol@cumin1002: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
- 14:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P69974 and previous config saved to /var/cache/conftool/dbconfig/20241015-143059-ladsgroup.json
- 14:29 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 14:28 herron@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet
- 14:28 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 14:27 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 14:27 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 14:26 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 14:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P69973 and previous config saved to /var/cache/conftool/dbconfig/20241015-142439-arnaudb.json
- 14:24 herron@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet
- 14:24 urbanecm@deploy2002: Finished scap sync-world: Backport for SkinComponentCopyright: Fix message existence check for history-copyright (T45646) (duration: 33m 23s)
- 14:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P69972 and previous config saved to /var/cache/conftool/dbconfig/20241015-142233-ladsgroup.json
- 14:21 btullis@cumin1002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
- 14:19 urbanecm@deploy2002: urbanecm, matmarex: Continuing with sync
- 14:17 herron@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet
- 14:16 herron@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host centrallog2002.codfw.wmnet
- 14:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P69971 and previous config saved to /var/cache/conftool/dbconfig/20241015-141552-ladsgroup.json
- 14:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T367781)', diff saved to https://phabricator.wikimedia.org/P69970 and previous config saved to /var/cache/conftool/dbconfig/20241015-140932-arnaudb.json
- 14:09 herron@cumin1002: START - Cookbook sre.hosts.reboot-single for host centrallog2002.codfw.wmnet
- 14:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P69969 and previous config saved to /var/cache/conftool/dbconfig/20241015-140726-ladsgroup.json
- 14:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2173 (T367781)', diff saved to https://phabricator.wikimedia.org/P69968 and previous config saved to /var/cache/conftool/dbconfig/20241015-140716-arnaudb.json
- 14:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 14:08 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1020.eqiad.wmnet
- 14:07 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 14:07 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 14:07 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 14:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T367781)', diff saved to https://phabricator.wikimedia.org/P69967 and previous config saved to /var/cache/conftool/dbconfig/20241015-140638-arnaudb.json
- 14:05 btullis@cumin1002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
- 14:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T376905)', diff saved to https://phabricator.wikimedia.org/P69966 and previous config saved to /var/cache/conftool/dbconfig/20241015-140045-ladsgroup.json
- 14:00 brouberol@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-presto1020.eqiad.wmnet
- 13:57 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1019.eqiad.wmnet
- 13:55 herron@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host centrallog1002.eqiad.wmnet
- 13:54 urbanecm@deploy2002: urbanecm, matmarex: Backport for SkinComponentCopyright: Fix message existence check for history-copyright (T45646) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T376905)', diff saved to https://phabricator.wikimedia.org/P69965 and previous config saved to /var/cache/conftool/dbconfig/20241015-135234-ladsgroup.json
- 13:53 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
- 13:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
- 13:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T371742)', diff saved to https://phabricator.wikimedia.org/P69964 and previous config saved to /var/cache/conftool/dbconfig/20241015-135213-ladsgroup.json
- 13:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T376905)', diff saved to https://phabricator.wikimedia.org/P69963 and previous config saved to /var/cache/conftool/dbconfig/20241015-135208-ladsgroup.json
- 13:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P69962 and previous config saved to /var/cache/conftool/dbconfig/20241015-135131-arnaudb.json
- 13:51 brouberol@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-presto1019.eqiad.wmnet
- 13:50 urbanecm@deploy2002: Started scap sync-world: Backport for SkinComponentCopyright: Fix message existence check for history-copyright (T45646)
- 13:48 herron@cumin1002: START - Cookbook sre.hosts.reboot-single for host centrallog1002.eqiad.wmnet
- 13:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P69961 and previous config saved to /var/cache/conftool/dbconfig/20241015-133701-ladsgroup.json
- 13:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P69960 and previous config saved to /var/cache/conftool/dbconfig/20241015-133624-arnaudb.json
- 13:32 urbanecm@deploy2002: Finished scap sync-world: Backport for eswiki: switch clearing link recommendations to PageSaveComplete hook (T372337), s7: Reduce revision-slots cache expiry to 60 seconds (T183490) (duration: 07m 44s)
- 13:27 urbanecm@deploy2002: migr, urbanecm, zabe: Continuing with sync
- 13:26 urbanecm@deploy2002: migr, urbanecm, zabe: Backport for eswiki: switch clearing link recommendations to PageSaveComplete hook (T372337), s7: Reduce revision-slots cache expiry to 60 seconds (T183490) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:24 urbanecm@deploy2002: Started scap sync-world: Backport for eswiki: switch clearing link recommendations to PageSaveComplete hook (T372337), s7: Reduce revision-slots cache expiry to 60 seconds (T183490)
- 13:23 urbanecm@deploy2002: Finished scap sync-world: Backport for [wikidatawiki] Enable the CampaignEvents extension (T375411), GrowthExperiments: update stream configuration to capture user id (T376833) (duration: 19m 25s)
- 13:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P69959 and previous config saved to /var/cache/conftool/dbconfig/20241015-132154-ladsgroup.json
- 13:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T367781)', diff saved to https://phabricator.wikimedia.org/P69958 and previous config saved to /var/cache/conftool/dbconfig/20241015-132117-arnaudb.json
- 13:19 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1018.eqiad.wmnet
- 13:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2170 (T367781)', diff saved to https://phabricator.wikimedia.org/P69957 and previous config saved to /var/cache/conftool/dbconfig/20241015-131901-arnaudb.json
- 13:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 13:18 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 13:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T367781)', diff saved to https://phabricator.wikimedia.org/P69956 and previous config saved to /var/cache/conftool/dbconfig/20241015-131839-arnaudb.json
- 13:16 urbanecm@deploy2002: cyndywikime, daimona, urbanecm: Continuing with sync
- 13:12 brouberol@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-presto1018.eqiad.wmnet
- 13:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T370903)', diff saved to https://phabricator.wikimedia.org/P69955 and previous config saved to /var/cache/conftool/dbconfig/20241015-131122-ladsgroup.json
- 13:11 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1017.eqiad.wmnet
- 13:11 urbanecm@deploy2002: cyndywikime, daimona, urbanecm: Backport for [wikidatawiki] Enable the CampaignEvents extension (T375411), GrowthExperiments: update stream configuration to capture user id (T376833) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T376905)', diff saved to https://phabricator.wikimedia.org/P69954 and previous config saved to /var/cache/conftool/dbconfig/20241015-130647-ladsgroup.json
- 13:04 brouberol@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-presto1017.eqiad.wmnet
- 13:04 urbanecm@deploy2002: Started scap sync-world: Backport for [wikidatawiki] Enable the CampaignEvents extension (T375411), GrowthExperiments: update stream configuration to capture user id (T376833)
- 13:03 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1016.eqiad.wmnet
- 13:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P69953 and previous config saved to /var/cache/conftool/dbconfig/20241015-130332-arnaudb.json
- 12:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T376905)', diff saved to https://phabricator.wikimedia.org/P69952 and previous config saved to /var/cache/conftool/dbconfig/20241015-125748-ladsgroup.json
- 12:57 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 12:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 12:57 brouberol@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-presto1016.eqiad.wmnet
- 12:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P69951 and previous config saved to /var/cache/conftool/dbconfig/20241015-125615-ladsgroup.json
- 12:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 12:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 12:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T376905)', diff saved to https://phabricator.wikimedia.org/P69950 and previous config saved to /var/cache/conftool/dbconfig/20241015-125203-ladsgroup.json
- 12:50 brouberol@cumin1002: END (FAIL) - Cookbook sre.presto.reboot-workers (exit_code=99) for Presto an-presto cluster: Reboot Presto nodes
- 12:50 elukey: destroy old certs from puppetmaster1001's CA (parsoid.svc.{eqiad,codfw}.wmnet, debmonitor.discovery.wmnet)
- 12:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P69949 and previous config saved to /var/cache/conftool/dbconfig/20241015-124825-arnaudb.json
- 12:46 brouberol@cumin1002: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
- 12:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P69948 and previous config saved to /var/cache/conftool/dbconfig/20241015-124108-ladsgroup.json
- 12:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P69947 and previous config saved to /var/cache/conftool/dbconfig/20241015-123656-ladsgroup.json
- 12:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T367781)', diff saved to https://phabricator.wikimedia.org/P69946 and previous config saved to /var/cache/conftool/dbconfig/20241015-123318-arnaudb.json
- 12:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T367781)', diff saved to https://phabricator.wikimedia.org/P69945 and previous config saved to /var/cache/conftool/dbconfig/20241015-123101-arnaudb.json
- 12:30 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 12:30 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 12:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T367781)', diff saved to https://phabricator.wikimedia.org/P69944 and previous config saved to /var/cache/conftool/dbconfig/20241015-123039-arnaudb.json
- 12:30 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 12:29 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 12:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T370903)', diff saved to https://phabricator.wikimedia.org/P69943 and previous config saved to /var/cache/conftool/dbconfig/20241015-122601-ladsgroup.json
- 12:24 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 12:24 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 12:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1230 (T370903)', diff saved to https://phabricator.wikimedia.org/P69942 and previous config saved to /var/cache/conftool/dbconfig/20241015-122251-ladsgroup.json
- 12:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 12:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 12:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P69941 and previous config saved to /var/cache/conftool/dbconfig/20241015-122149-ladsgroup.json
- 12:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T371742)', diff saved to https://phabricator.wikimedia.org/P69940 and previous config saved to /var/cache/conftool/dbconfig/20241015-121706-ladsgroup.json
- 12:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 12:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 12:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P69939 and previous config saved to /var/cache/conftool/dbconfig/20241015-121532-arnaudb.json
- 12:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T371742)', diff saved to https://phabricator.wikimedia.org/P69938 and previous config saved to /var/cache/conftool/dbconfig/20241015-121349-ladsgroup.json
- 12:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T376905)', diff saved to https://phabricator.wikimedia.org/P69937 and previous config saved to /var/cache/conftool/dbconfig/20241015-120642-ladsgroup.json
- 12:03 brouberol@cumin1002: END (FAIL) - Cookbook sre.presto.reboot-workers (exit_code=99) for Presto an-presto cluster: Reboot Presto nodes
- 12:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P69936 and previous config saved to /var/cache/conftool/dbconfig/20241015-120025-arnaudb.json
- 11:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P69935 and previous config saved to /var/cache/conftool/dbconfig/20241015-115842-ladsgroup.json
- 11:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T376905)', diff saved to https://phabricator.wikimedia.org/P69934 and previous config saved to /var/cache/conftool/dbconfig/20241015-115630-ladsgroup.json
- 11:56 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 11:56 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 11:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T376905)', diff saved to https://phabricator.wikimedia.org/P69933 and previous config saved to /var/cache/conftool/dbconfig/20241015-115606-ladsgroup.json
- 11:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T367781)', diff saved to https://phabricator.wikimedia.org/P69932 and previous config saved to /var/cache/conftool/dbconfig/20241015-114518-arnaudb.json
- 11:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P69931 and previous config saved to /var/cache/conftool/dbconfig/20241015-114336-ladsgroup.json
- 11:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2146 (T367781)', diff saved to https://phabricator.wikimedia.org/P69930 and previous config saved to /var/cache/conftool/dbconfig/20241015-114302-arnaudb.json
- 11:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 11:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 11:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T367781)', diff saved to https://phabricator.wikimedia.org/P69929 and previous config saved to /var/cache/conftool/dbconfig/20241015-114240-arnaudb.json
- 11:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P69927 and previous config saved to /var/cache/conftool/dbconfig/20241015-114059-ladsgroup.json
- 11:34 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
- 11:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T371742)', diff saved to https://phabricator.wikimedia.org/P69926 and previous config saved to /var/cache/conftool/dbconfig/20241015-112829-ladsgroup.json
- 11:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P69925 and previous config saved to /var/cache/conftool/dbconfig/20241015-112733-arnaudb.json
- 11:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P69924 and previous config saved to /var/cache/conftool/dbconfig/20241015-112551-ladsgroup.json
- 11:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P69923 and previous config saved to /var/cache/conftool/dbconfig/20241015-111226-arnaudb.json
- 11:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T376905)', diff saved to https://phabricator.wikimedia.org/P69922 and previous config saved to /var/cache/conftool/dbconfig/20241015-111045-ladsgroup.json
- 11:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1230 (T371742)', diff saved to https://phabricator.wikimedia.org/P69921 and previous config saved to /var/cache/conftool/dbconfig/20241015-110741-ladsgroup.json
- 11:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 11:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 11:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T376905)', diff saved to https://phabricator.wikimedia.org/P69920 and previous config saved to /var/cache/conftool/dbconfig/20241015-110132-ladsgroup.json
- 11:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 11:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 11:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 11:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 10:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T367781)', diff saved to https://phabricator.wikimedia.org/P69919 and previous config saved to /var/cache/conftool/dbconfig/20241015-105719-arnaudb.json
- 10:53 tappof: expand LVs on prometheus instances (k8s-mlserve and k8s-staging) T377196
- 10:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T367781)', diff saved to https://phabricator.wikimedia.org/P69918 and previous config saved to /var/cache/conftool/dbconfig/20241015-105301-arnaudb.json
- 10:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 10:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 10:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 10:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 10:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T367781)', diff saved to https://phabricator.wikimedia.org/P69917 and previous config saved to /var/cache/conftool/dbconfig/20241015-105213-arnaudb.json
- 10:38 brouberol@cumin1002: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
- 10:38 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2002.codfw.wmnet
- 10:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P69915 and previous config saved to /var/cache/conftool/dbconfig/20241015-103706-arnaudb.json
- 10:34 brouberol@cumin1002: START - Cookbook sre.hosts.reboot-single for host flink-zk2002.codfw.wmnet
- 10:30 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2003.codfw.wmnet
- 10:26 brouberol@cumin1002: START - Cookbook sre.hosts.reboot-single for host flink-zk2003.codfw.wmnet
- 10:25 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2001.codfw.wmnet
- 10:22 brouberol@cumin1002: START - Cookbook sre.hosts.reboot-single for host flink-zk2001.codfw.wmnet
- 10:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P69914 and previous config saved to /var/cache/conftool/dbconfig/20241015-102159-arnaudb.json
- 10:21 brouberol@cumin1002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
- 10:14 brouberol@cumin1002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
- 10:11 brouberol@cumin1002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker
- 10:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T367781)', diff saved to https://phabricator.wikimedia.org/P69913 and previous config saved to /var/cache/conftool/dbconfig/20241015-100652-arnaudb.json
- 10:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2130 (T367781)', diff saved to https://phabricator.wikimedia.org/P69912 and previous config saved to /var/cache/conftool/dbconfig/20241015-100435-arnaudb.json
- 10:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2130.codfw.wmnet with reason: Maintenance
- 10:04 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2130.codfw.wmnet with reason: Maintenance
- 10:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T367781)', diff saved to https://phabricator.wikimedia.org/P69911 and previous config saved to /var/cache/conftool/dbconfig/20241015-100413-arnaudb.json
- 09:57 brouberol@cumin1002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker
- 09:55 brouberol@cumin1002: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:dse-k8s-worker
- 09:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 09:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P69910 and previous config saved to /var/cache/conftool/dbconfig/20241015-094906-arnaudb.json
- 09:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P69909 and previous config saved to /var/cache/conftool/dbconfig/20241015-093359-arnaudb.json
- 09:26 brouberol@cumin1002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker
- 09:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T367781)', diff saved to https://phabricator.wikimedia.org/P69908 and previous config saved to /var/cache/conftool/dbconfig/20241015-091852-arnaudb.json
- 09:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T367781)', diff saved to https://phabricator.wikimedia.org/P69907 and previous config saved to /var/cache/conftool/dbconfig/20241015-091635-arnaudb.json
- 09:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2116.codfw.wmnet with reason: Maintenance
- 09:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2116.codfw.wmnet with reason: Maintenance
- 09:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 09:15 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 09:15 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 09:15 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 09:15 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 09:15 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 09:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T367781)', diff saved to https://phabricator.wikimedia.org/P69906 and previous config saved to /var/cache/conftool/dbconfig/20241015-091502-arnaudb.json
- 09:07 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 08:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P69905 and previous config saved to /var/cache/conftool/dbconfig/20241015-085955-arnaudb.json
- 08:47 oblivian@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: init - oblivian@cumin2002
- 08:46 oblivian@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: init - oblivian@cumin2002
- 08:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P69903 and previous config saved to /var/cache/conftool/dbconfig/20241015-084448-arnaudb.json
- 08:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T367781)', diff saved to https://phabricator.wikimedia.org/P69902 and previous config saved to /var/cache/conftool/dbconfig/20241015-082941-arnaudb.json
- 08:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: maintenance
- 08:27 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 08:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1235 (T367781)', diff saved to https://phabricator.wikimedia.org/P69901 and previous config saved to /var/cache/conftool/dbconfig/20241015-082727-arnaudb.json
- 08:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: maintenance
- 08:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
- 08:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
- 08:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T367781)', diff saved to https://phabricator.wikimedia.org/P69900 and previous config saved to /var/cache/conftool/dbconfig/20241015-082704-arnaudb.json
- 08:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P69899 and previous config saved to /var/cache/conftool/dbconfig/20241015-081157-arnaudb.json
- 07:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P69898 and previous config saved to /var/cache/conftool/dbconfig/20241015-075650-arnaudb.json
- 07:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 100%: post sunday p.age T368098', diff saved to https://phabricator.wikimedia.org/P69897 and previous config saved to /var/cache/conftool/dbconfig/20241015-074843-arnaudb.json
- 07:47 hashar: Restarted Gerrit - T373897
- 07:46 hashar@deploy2002: Finished deploy [gerrit/gerrit@2f0c927]: Gerrit to 3.10.2 on gerrit1003 - T373897 (duration: 00m 09s)
- 07:46 hashar@deploy2002: Started deploy [gerrit/gerrit@2f0c927]: Gerrit to 3.10.2 on gerrit1003 - T373897
- 07:42 hashar@deploy2002: Finished deploy [gerrit/gerrit@2f0c927]: Gerrit to 3.10.2 on gerrit2002 - T373897 (duration: 00m 07s)
- 07:42 hashar@deploy2002: Started deploy [gerrit/gerrit@2f0c927]: Gerrit to 3.10.2 on gerrit2002 - T373897
- 07:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T367781)', diff saved to https://phabricator.wikimedia.org/P69896 and previous config saved to /var/cache/conftool/dbconfig/20241015-074143-arnaudb.json
- 07:40 hashar@deploy2002: Finished deploy [gerrit/gerrit@2f0c927]: Gerrit to 3.10.2 on gerrit2003 - T373897 (duration: 00m 07s)
- 07:40 hashar@deploy2002: Started deploy [gerrit/gerrit@2f0c927]: Gerrit to 3.10.2 on gerrit2003 - T373897
- 07:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T367781)', diff saved to https://phabricator.wikimedia.org/P69895 and previous config saved to /var/cache/conftool/dbconfig/20241015-073928-arnaudb.json
- 07:39 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1234.eqiad.wmnet with reason: Maintenance
- 07:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1234.eqiad.wmnet with reason: Maintenance
- 07:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T367781)', diff saved to https://phabricator.wikimedia.org/P69894 and previous config saved to /var/cache/conftool/dbconfig/20241015-073906-arnaudb.json
- 07:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit[1003,2002-2003].wikimedia.org with reason: Gerrit 3.10.2 update
- 07:38 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit[1003,2002-2003].wikimedia.org with reason: Gerrit 3.10.2 update
- 07:35 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 07:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 75%: post sunday p.age T368098', diff saved to https://phabricator.wikimedia.org/P69893 and previous config saved to /var/cache/conftool/dbconfig/20241015-073338-arnaudb.json
- 07:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P69892 and previous config saved to /var/cache/conftool/dbconfig/20241015-072359-arnaudb.json
- 07:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 50%: post sunday p.age T368098', diff saved to https://phabricator.wikimedia.org/P69891 and previous config saved to /var/cache/conftool/dbconfig/20241015-071833-arnaudb.json
- 07:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P69890 and previous config saved to /var/cache/conftool/dbconfig/20241015-070852-arnaudb.json
- 07:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 25%: post sunday p.age T368098', diff saved to https://phabricator.wikimedia.org/P69889 and previous config saved to /var/cache/conftool/dbconfig/20241015-070327-arnaudb.json
- 06:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T367781)', diff saved to https://phabricator.wikimedia.org/P69888 and previous config saved to /var/cache/conftool/dbconfig/20241015-065345-arnaudb.json
- 06:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T367781)', diff saved to https://phabricator.wikimedia.org/P69887 and previous config saved to /var/cache/conftool/dbconfig/20241015-065130-arnaudb.json
- 06:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1232.eqiad.wmnet with reason: Maintenance
- 06:51 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1232.eqiad.wmnet with reason: Maintenance
- 06:30 kart_: Updated MinT to 2024-10-11-113932-production
- 06:27 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
- 06:18 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
- 06:16 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
- 06:08 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
- 05:38 _joe_: restart tomcat on idp1004
- 05:35 _joe_: restart tomcat on idp2004
- 05:15 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
- 05:10 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
- 04:00 mwpresync@deploy2002: Pruned MediaWiki: 1.43.0-wmf.24 (duration: 00m 56s)
- 03:51 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.43.0-wmf.27 refs T375658 (duration: 48m 30s)
- 03:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.43.0-wmf.27 refs T375658
- 02:40 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 02:40 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 02:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T376905)', diff saved to https://phabricator.wikimedia.org/P69885 and previous config saved to /var/cache/conftool/dbconfig/20241015-024037-ladsgroup.json
- 02:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P69884 and previous config saved to /var/cache/conftool/dbconfig/20241015-022530-ladsgroup.json
- 02:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P69883 and previous config saved to /var/cache/conftool/dbconfig/20241015-021023-ladsgroup.json
- 01:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T376905)', diff saved to https://phabricator.wikimedia.org/P69882 and previous config saved to /var/cache/conftool/dbconfig/20241015-015516-ladsgroup.json
- 01:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1246 (T376905)', diff saved to https://phabricator.wikimedia.org/P69881 and previous config saved to /var/cache/conftool/dbconfig/20241015-014831-ladsgroup.json
- 01:48 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1246.eqiad.wmnet with reason: Maintenance
- 01:48 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1246.eqiad.wmnet with reason: Maintenance
- 01:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T376905)', diff saved to https://phabricator.wikimedia.org/P69880 and previous config saved to /var/cache/conftool/dbconfig/20241015-014803-ladsgroup.json
- 01:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P69879 and previous config saved to /var/cache/conftool/dbconfig/20241015-013257-ladsgroup.json
- 01:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P69878 and previous config saved to /var/cache/conftool/dbconfig/20241015-011749-ladsgroup.json
- 01:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T376905)', diff saved to https://phabricator.wikimedia.org/P69877 and previous config saved to /var/cache/conftool/dbconfig/20241015-010242-ladsgroup.json
- 00:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T376905)', diff saved to https://phabricator.wikimedia.org/P69876 and previous config saved to /var/cache/conftool/dbconfig/20241015-005551-ladsgroup.json
- 00:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T370903)', diff saved to https://phabricator.wikimedia.org/P69875 and previous config saved to /var/cache/conftool/dbconfig/20241015-005546-ladsgroup.json
- 00:55 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 00:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 00:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T376905)', diff saved to https://phabricator.wikimedia.org/P69874 and previous config saved to /var/cache/conftool/dbconfig/20241015-005525-ladsgroup.json
- 00:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P69873 and previous config saved to /var/cache/conftool/dbconfig/20241015-004039-ladsgroup.json
- 00:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P69872 and previous config saved to /var/cache/conftool/dbconfig/20241015-004018-ladsgroup.json
- 00:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P69871 and previous config saved to /var/cache/conftool/dbconfig/20241015-002531-ladsgroup.json
- 00:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P69870 and previous config saved to /var/cache/conftool/dbconfig/20241015-002511-ladsgroup.json
- 00:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T370903)', diff saved to https://phabricator.wikimedia.org/P69869 and previous config saved to /var/cache/conftool/dbconfig/20241015-001024-ladsgroup.json
- 00:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T376905)', diff saved to https://phabricator.wikimedia.org/P69868 and previous config saved to /var/cache/conftool/dbconfig/20241015-001004-ladsgroup.json
- 00:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T376905)', diff saved to https://phabricator.wikimedia.org/P69867 and previous config saved to /var/cache/conftool/dbconfig/20241015-000304-ladsgroup.json
- 00:02 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 00:02 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 00:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T376905)', diff saved to https://phabricator.wikimedia.org/P69866 and previous config saved to /var/cache/conftool/dbconfig/20241015-000236-ladsgroup.json
2024-10-14
- 23:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P69865 and previous config saved to /var/cache/conftool/dbconfig/20241014-234729-ladsgroup.json
- 23:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P69864 and previous config saved to /var/cache/conftool/dbconfig/20241014-233222-ladsgroup.json
- 23:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2140 (T370903)', diff saved to https://phabricator.wikimedia.org/P69863 and previous config saved to /var/cache/conftool/dbconfig/20241014-232857-ladsgroup.json
- 23:28 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2140.codfw.wmnet with reason: Maintenance
- 23:28 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2140.codfw.wmnet with reason: Maintenance
- 23:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T370903)', diff saved to https://phabricator.wikimedia.org/P69862 and previous config saved to /var/cache/conftool/dbconfig/20241014-232835-ladsgroup.json
- 23:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T376905)', diff saved to https://phabricator.wikimedia.org/P69861 and previous config saved to /var/cache/conftool/dbconfig/20241014-231715-ladsgroup.json
- 23:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P69860 and previous config saved to /var/cache/conftool/dbconfig/20241014-231328-ladsgroup.json
- 23:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T376905)', diff saved to https://phabricator.wikimedia.org/P69859 and previous config saved to /var/cache/conftool/dbconfig/20241014-230903-ladsgroup.json
- 23:08 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 23:08 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 23:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T376905)', diff saved to https://phabricator.wikimedia.org/P69858 and previous config saved to /var/cache/conftool/dbconfig/20241014-230838-ladsgroup.json
- 22:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P69857 and previous config saved to /var/cache/conftool/dbconfig/20241014-225818-ladsgroup.json
- 22:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T371742)', diff saved to https://phabricator.wikimedia.org/P69856 and previous config saved to /var/cache/conftool/dbconfig/20241014-225528-ladsgroup.json
- 22:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P69855 and previous config saved to /var/cache/conftool/dbconfig/20241014-225331-ladsgroup.json
- 22:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T370903)', diff saved to https://phabricator.wikimedia.org/P69854 and previous config saved to /var/cache/conftool/dbconfig/20241014-224311-ladsgroup.json
- 22:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P69853 and previous config saved to /var/cache/conftool/dbconfig/20241014-224022-ladsgroup.json
- 22:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P69852 and previous config saved to /var/cache/conftool/dbconfig/20241014-223824-ladsgroup.json
- 22:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P69851 and previous config saved to /var/cache/conftool/dbconfig/20241014-222515-ladsgroup.json
- 22:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T376905)', diff saved to https://phabricator.wikimedia.org/P69850 and previous config saved to /var/cache/conftool/dbconfig/20241014-222317-ladsgroup.json
- 22:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P69849 and previous config saved to /var/cache/conftool/dbconfig/20241014-222009-ladsgroup.json
- 22:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T376905)', diff saved to https://phabricator.wikimedia.org/P69848 and previous config saved to /var/cache/conftool/dbconfig/20241014-221508-ladsgroup.json
- 22:15 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 22:14 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 22:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T376905)', diff saved to https://phabricator.wikimedia.org/P69847 and previous config saved to /var/cache/conftool/dbconfig/20241014-221443-ladsgroup.json
- 22:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T371742)', diff saved to https://phabricator.wikimedia.org/P69846 and previous config saved to /var/cache/conftool/dbconfig/20241014-221008-ladsgroup.json
- 22:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P69845 and previous config saved to /var/cache/conftool/dbconfig/20241014-220504-ladsgroup.json
- 22:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1243 (T370903)', diff saved to https://phabricator.wikimedia.org/P69844 and previous config saved to /var/cache/conftool/dbconfig/20241014-220134-ladsgroup.json
- 22:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 22:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 21:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P69843 and previous config saved to /var/cache/conftool/dbconfig/20241014-215936-ladsgroup.json
- 21:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P69842 and previous config saved to /var/cache/conftool/dbconfig/20241014-214958-ladsgroup.json
- 21:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T371742)', diff saved to https://phabricator.wikimedia.org/P69841 and previous config saved to /var/cache/conftool/dbconfig/20241014-214515-ladsgroup.json
- 21:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1236.eqiad.wmnet with reason: Maintenance
- 21:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1236.eqiad.wmnet with reason: Maintenance
- 21:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P69840 and previous config saved to /var/cache/conftool/dbconfig/20241014-214429-ladsgroup.json
- 21:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T367856)', diff saved to https://phabricator.wikimedia.org/P69839 and previous config saved to /var/cache/conftool/dbconfig/20241014-213902-ladsgroup.json
- 21:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on db1226.eqiad.wmnet with reason: Maintenance
- 21:38 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on db1226.eqiad.wmnet with reason: Maintenance
- 21:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P69838 and previous config saved to /var/cache/conftool/dbconfig/20241014-213453-ladsgroup.json
- 21:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T376905)', diff saved to https://phabricator.wikimedia.org/P69837 and previous config saved to /var/cache/conftool/dbconfig/20241014-212922-ladsgroup.json
- 21:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T376905)', diff saved to https://phabricator.wikimedia.org/P69836 and previous config saved to /var/cache/conftool/dbconfig/20241014-212001-ladsgroup.json
- 21:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 21:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 21:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T376905)', diff saved to https://phabricator.wikimedia.org/P69835 and previous config saved to /var/cache/conftool/dbconfig/20241014-211937-ladsgroup.json
- 21:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P69834 and previous config saved to /var/cache/conftool/dbconfig/20241014-210430-ladsgroup.json
- 20:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P69833 and previous config saved to /var/cache/conftool/dbconfig/20241014-204923-ladsgroup.json
- 20:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T376905)', diff saved to https://phabricator.wikimedia.org/P69832 and previous config saved to /var/cache/conftool/dbconfig/20241014-203416-ladsgroup.json
- 20:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1162 (T376905)', diff saved to https://phabricator.wikimedia.org/P69831 and previous config saved to /var/cache/conftool/dbconfig/20241014-202504-ladsgroup.json
- 20:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 20:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 20:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T376905)', diff saved to https://phabricator.wikimedia.org/P69830 and previous config saved to /var/cache/conftool/dbconfig/20241014-202439-ladsgroup.json
- 20:21 TheresNoTime: UTC late backport window done
- 20:18 samtar@deploy2002: Finished scap sync-world: Backport for Missing.php: Redirect Scots Wiktionary to Scots Wikipedia (T249648) (duration: 08m 14s)
- 20:14 samtar@deploy2002: samtar, pppery: Continuing with sync
- 20:12 samtar@deploy2002: samtar, pppery: Backport for Missing.php: Redirect Scots Wiktionary to Scots Wikipedia (T249648) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:10 samtar@deploy2002: Started scap sync-world: Backport for Missing.php: Redirect Scots Wiktionary to Scots Wikipedia (T249648)
- 20:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P69829 and previous config saved to /var/cache/conftool/dbconfig/20241014-200932-ladsgroup.json
- 19:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P69828 and previous config saved to /var/cache/conftool/dbconfig/20241014-195425-ladsgroup.json
- 19:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T376905)', diff saved to https://phabricator.wikimedia.org/P69827 and previous config saved to /var/cache/conftool/dbconfig/20241014-193918-ladsgroup.json
- 19:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T376905)', diff saved to https://phabricator.wikimedia.org/P69826 and previous config saved to /var/cache/conftool/dbconfig/20241014-192956-ladsgroup.json
- 19:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 19:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 19:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 19:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 18:57 aqu@deploy2002: Finished deploy [airflow-dags/analytics@a1a70ce]: Deploy last version for Refine staging [airflow-dags@a1a70ce8] (duration: 00m 29s)
- 18:57 aqu@deploy2002: Started deploy [airflow-dags/analytics@a1a70ce]: Deploy last version for Refine staging [airflow-dags@a1a70ce8]
- 18:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 18:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 18:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T376905)', diff saved to https://phabricator.wikimedia.org/P69825 and previous config saved to /var/cache/conftool/dbconfig/20241014-185225-ladsgroup.json
- 18:47 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@a1a70ce]: Deploy last fixes on Refine staging [airflow-dags@a1a70ce8] (duration: 00m 13s)
- 18:47 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@a1a70ce]: Deploy last fixes on Refine staging [airflow-dags@a1a70ce8]
- 18:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P69824 and previous config saved to /var/cache/conftool/dbconfig/20241014-183718-ladsgroup.json
- 18:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P69823 and previous config saved to /var/cache/conftool/dbconfig/20241014-182211-ladsgroup.json
- 18:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T376905)', diff saved to https://phabricator.wikimedia.org/P69822 and previous config saved to /var/cache/conftool/dbconfig/20241014-180704-ladsgroup.json
- 17:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T376905)', diff saved to https://phabricator.wikimedia.org/P69821 and previous config saved to /var/cache/conftool/dbconfig/20241014-170647-ladsgroup.json
- 17:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
- 17:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
- 17:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 17:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 17:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T376905)', diff saved to https://phabricator.wikimedia.org/P69820 and previous config saved to /var/cache/conftool/dbconfig/20241014-170123-ladsgroup.json
- 16:51 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on cloudvirt1063.eqiad.wmnet with reason: cloudvirt1063 needs maintenance T375223
- 16:50 fnegri@cumin1002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on cloudvirt1063.eqiad.wmnet with reason: cloudvirt1063 needs maintenance T375223
- 16:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P69819 and previous config saved to /var/cache/conftool/dbconfig/20241014-164616-ladsgroup.json
- 16:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P69818 and previous config saved to /var/cache/conftool/dbconfig/20241014-163109-ladsgroup.json
- 16:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T376905)', diff saved to https://phabricator.wikimedia.org/P69817 and previous config saved to /var/cache/conftool/dbconfig/20241014-161602-ladsgroup.json
- 16:03 sergi0: Running `sgimeno@mwmaint2002:~$ foreachwiki userOptions.php --delete --old=1 growthexperiments-tour-newimpact-discovery` (T376461)
- 15:52 aikochou@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
- 15:46 aikochou@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
- 15:16 isaranto@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
- 15:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T376905)', diff saved to https://phabricator.wikimedia.org/P69816 and previous config saved to /var/cache/conftool/dbconfig/20241014-151546-ladsgroup.json
- 15:15 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 15:15 isaranto@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 15:15 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 15:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T376905)', diff saved to https://phabricator.wikimedia.org/P69815 and previous config saved to /var/cache/conftool/dbconfig/20241014-151521-ladsgroup.json
- 15:07 elukey@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
- 15:06 elukey@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
- 15:05 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
- 15:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P69814 and previous config saved to /var/cache/conftool/dbconfig/20241014-150014-ladsgroup.json
- 14:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P69813 and previous config saved to /var/cache/conftool/dbconfig/20241014-144507-ladsgroup.json
- 14:43 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
- 14:43 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 14:41 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 14:41 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 14:39 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 14:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T376905)', diff saved to https://phabricator.wikimedia.org/P69812 and previous config saved to /var/cache/conftool/dbconfig/20241014-143000-ladsgroup.json
- 14:16 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts an-worker1177.eqiad.wmnet
- 14:16 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:16 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1177.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
- 14:16 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1177.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
- 14:12 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
- 14:12 Lucas_WMDE: UTC afternoon backport+config window done
- 14:10 Lucas_WMDE: [untruncated duration: 06m 48s]
- 14:09 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for refactor(tests): don't use per-method coverage annotation, refactor(HomepageHooks): extract method for simpler modifyability, Clear LinkRecommendation suggestions on page save (T364341 T372337), Run fixLinkRecommendationData even when disabled in CC (T373176) (duration: 0
- 14:07 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-worker1177.eqiad.wmnet
- 14:07 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts an-worker1176.eqiad.wmnet
- 14:07 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:07 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1176.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
- 14:06 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1176.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
- 14:04 lucaswerkmeister-wmde@deploy2002: migr, lucaswerkmeister-wmde: Continuing with sync
- 14:04 lucaswerkmeister-wmde@deploy2002: migr, lucaswerkmeister-wmde: Backport for refactor(tests): don't use per-method coverage annotation, refactor(HomepageHooks): extract method for simpler modifyability, Clear LinkRecommendation suggestions on page save (T364341 T372337), Run fixLinkRecommendationData even when disabled in CC (T373176) synced to
- 14:03 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
- 14:02 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for refactor(tests): don't use per-method coverage annotation, refactor(HomepageHooks): extract method for simpler modifyability, Clear LinkRecommendation suggestions on page save (T364341 T372337), Run fixLinkRecommendationData even when disabled in CC (T373176)
- 13:58 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-worker1176.eqiad.wmnet
- 13:46 ladsgroup@deploy2002: Finished scap sync-world: Backport for Update interwiki.php (duration: 07m 00s)
- 13:45 kcvelaga@deploy2002: Finished deploy [airflow-dags/analytics_product@fbcf880]: T375480 (duration: 01m 07s)
- 13:44 kcvelaga@deploy2002: Started deploy [airflow-dags/analytics_product@fbcf880]: T375480
- 13:41 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 13:41 ladsgroup@deploy2002: ladsgroup: Backport for Update interwiki.php synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:39 ladsgroup@deploy2002: Started scap sync-world: Backport for Update interwiki.php
- 13:35 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aux-k8s-etcd1002.eqiad.wmnet
- 13:35 elukey@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:35 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aux-k8s-etcd1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1002"
- 13:34 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aux-k8s-etcd1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1002"
- 13:31 elukey@cumin1002: START - Cookbook sre.dns.netbox
- 13:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T376905)', diff saved to https://phabricator.wikimedia.org/P69811 and previous config saved to /var/cache/conftool/dbconfig/20241014-132944-ladsgroup.json
- 13:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 13:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 13:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T376905)', diff saved to https://phabricator.wikimedia.org/P69810 and previous config saved to /var/cache/conftool/dbconfig/20241014-132918-ladsgroup.json
- 13:26 elukey@cumin1002: START - Cookbook sre.hosts.decommission for hosts aux-k8s-etcd1002.eqiad.wmnet
- 13:26 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aux-k8s-etcd1001.eqiad.wmnet
- 13:26 elukey@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:26 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aux-k8s-etcd1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1002"
- 13:26 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aux-k8s-etcd1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1002"
- 13:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P69809 and previous config saved to /var/cache/conftool/dbconfig/20241014-132409-ladsgroup.json
- 13:22 elukey@cumin1002: START - Cookbook sre.dns.netbox
- 13:18 elukey@cumin1002: START - Cookbook sre.hosts.decommission for hosts aux-k8s-etcd1001.eqiad.wmnet
- 13:16 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on aux-k8s-etcd1002.eqiad.wmnet with reason: about to decom
- 13:16 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on aux-k8s-etcd1002.eqiad.wmnet with reason: about to decom
- 13:15 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on aux-k8s-etcd1001.eqiad.wmnet with reason: about to decom
- 13:15 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on aux-k8s-etcd1001.eqiad.wmnet with reason: about to decom
- 13:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P69808 and previous config saved to /var/cache/conftool/dbconfig/20241014-131411-ladsgroup.json
- 13:13 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [uawikimedia] Enable the CampaignEvents extension (T376695) (duration: 10m 19s)
- 13:09 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, daimona: Continuing with sync
- 13:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P69807 and previous config saved to /var/cache/conftool/dbconfig/20241014-130904-ladsgroup.json
- 13:05 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, daimona: Backport for [uawikimedia] Enable the CampaignEvents extension (T376695) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:03 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [uawikimedia] Enable the CampaignEvents extension (T376695)
- 12:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P69806 and previous config saved to /var/cache/conftool/dbconfig/20241014-125904-ladsgroup.json
- 12:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P69805 and previous config saved to /var/cache/conftool/dbconfig/20241014-125358-ladsgroup.json
- 12:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T367781)', diff saved to https://phabricator.wikimedia.org/P69804 and previous config saved to /var/cache/conftool/dbconfig/20241014-124554-arnaudb.json
- 12:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 12:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 12:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T367781)', diff saved to https://phabricator.wikimedia.org/P69803 and previous config saved to /var/cache/conftool/dbconfig/20241014-124532-arnaudb.json
- 12:44 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@4b69f50]: Stage Refine fixes on test cluster [airflow-dags@4b69f503] (duration: 00m 12s)
- 12:44 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@4b69f50]: Stage Refine fixes on test cluster [airflow-dags@4b69f503]
- 12:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T376905)', diff saved to https://phabricator.wikimedia.org/P69802 and previous config saved to /var/cache/conftool/dbconfig/20241014-124357-ladsgroup.json
- 12:43 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aux-k8s-worker1001.eqiad.wmnet
- 12:43 elukey@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:43 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aux-k8s-worker1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1002"
- 12:41 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aux-k8s-worker1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1002"
- 12:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P69801 and previous config saved to /var/cache/conftool/dbconfig/20241014-123853-ladsgroup.json
- 12:37 elukey@cumin1002: START - Cookbook sre.dns.netbox
- 12:32 elukey@cumin1002: START - Cookbook sre.hosts.decommission for hosts aux-k8s-worker1001.eqiad.wmnet
- 12:32 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aux-k8s-ctrl1001.eqiad.wmnet
- 12:32 elukey@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:32 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aux-k8s-ctrl1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1002"
- 12:32 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aux-k8s-ctrl1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1002"
- 12:30 hnowlan: removed all aqsv1 service components from aqs* hosts
- 12:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P69800 and previous config saved to /var/cache/conftool/dbconfig/20241014-123025-arnaudb.json
- 12:28 elukey@cumin1002: START - Cookbook sre.dns.netbox
- 12:23 elukey@cumin1002: START - Cookbook sre.hosts.decommission for hosts aux-k8s-ctrl1001.eqiad.wmnet
- 12:22 elukey@puppetserver1001: conftool action : set/pooled=inactive; selector: name=aux-k8s-worker1001.eqiad.wmnet
- 12:22 elukey@puppetserver1001: conftool action : set/pooled=inactive; selector: name=aux-k8s-ctrl1001.eqiad.wmnet
- 12:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P69799 and previous config saved to /var/cache/conftool/dbconfig/20241014-121518-arnaudb.json
- 12:09 elukey: increase etcd k8s aux cluster from 3 -> 5 - T344230
- 12:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T367781)', diff saved to https://phabricator.wikimedia.org/P69798 and previous config saved to /var/cache/conftool/dbconfig/20241014-120011-arnaudb.json
- 11:59 aborrero@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:59 aborrero@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudlb2004-dev cloud-private adddress - aborrero@cumin1002"
- 11:59 aborrero@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudlb2004-dev cloud-private adddress - aborrero@cumin1002"
- 11:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T367781)', diff saved to https://phabricator.wikimedia.org/P69797 and previous config saved to /var/cache/conftool/dbconfig/20241014-115755-arnaudb.json
- 11:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 11:57 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 11:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T367781)', diff saved to https://phabricator.wikimedia.org/P69796 and previous config saved to /var/cache/conftool/dbconfig/20241014-115732-arnaudb.json
- 11:56 Dreamy_Jazz: Started time limited scan on enwiki for MediaModeration - https://wikitech.wikimedia.org/wiki/MediaModeration
- 11:56 aborrero@cumin1002: START - Cookbook sre.dns.netbox
- 11:52 btullis@cumin1002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wcqs-public
- 11:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2194.codfw.wmnet onto db2227.codfw.wmnet
- 11:50 btullis@cumin1002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wcqs-public
- 11:50 hnowlan@deploy2002: Finished deploy [restbase/deploy@26112d4]: Remove unused AQS components. Add bdrwiki (T371761) (duration: 15m 38s)
- 11:45 Dreamy_Jazz: Restarting MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
- 11:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T376905)', diff saved to https://phabricator.wikimedia.org/P69794 and previous config saved to /var/cache/conftool/dbconfig/20241014-114341-ladsgroup.json
- 11:43 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
- 11:43 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
- 11:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T376905)', diff saved to https://phabricator.wikimedia.org/P69793 and previous config saved to /var/cache/conftool/dbconfig/20241014-114316-ladsgroup.json
- 11:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P69792 and previous config saved to /var/cache/conftool/dbconfig/20241014-114225-arnaudb.json
- 11:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P69791 and previous config saved to /var/cache/conftool/dbconfig/20241014-113941-arnaudb.json
- 11:34 hnowlan@deploy2002: Started deploy [restbase/deploy@26112d4]: Remove unused AQS components. Add bdrwiki (T371761)
- 11:31 andrewtavis-wmde@deploy2002: Finished deploy [airflow-dags/wmde@c9a2532]: (no justification provided) (duration: 00m 08s)
- 11:30 andrewtavis-wmde@deploy2002: Started deploy [airflow-dags/wmde@c9a2532]: (no justification provided)
- 11:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P69790 and previous config saved to /var/cache/conftool/dbconfig/20241014-112809-ladsgroup.json
- 11:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P69789 and previous config saved to /var/cache/conftool/dbconfig/20241014-112719-arnaudb.json
- 11:26 claime: Running ./redis-check-aof --fix on rdb1014 tcp_6379 instance - T376961
- 11:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P69788 and previous config saved to /var/cache/conftool/dbconfig/20241014-112434-arnaudb.json
- 11:16 ladsgroup@deploy2002: Finished scap sync-world: Creating bclwikisource (T377084) (duration: 06m 49s)
- 11:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P69787 and previous config saved to /var/cache/conftool/dbconfig/20241014-111302-ladsgroup.json
- 11:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T367781)', diff saved to https://phabricator.wikimedia.org/P69786 and previous config saved to /var/cache/conftool/dbconfig/20241014-111211-arnaudb.json
- 11:10 ladsgroup@deploy2002: Started scap sync-world: Creating bclwikisource (T377084)
- 11:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T367781)', diff saved to https://phabricator.wikimedia.org/P69785 and previous config saved to /var/cache/conftool/dbconfig/20241014-110956-arnaudb.json
- 11:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 11:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 11:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T367781)', diff saved to https://phabricator.wikimedia.org/P69784 and previous config saved to /var/cache/conftool/dbconfig/20241014-110933-arnaudb.json
- 11:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P69783 and previous config saved to /var/cache/conftool/dbconfig/20241014-110927-arnaudb.json
- 11:07 ladsgroup@deploy2002: Finished scap sync-world: Creating ibawiki (T376568) (duration: 06m 45s)
- 11:05 eoghan@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
- 11:01 ladsgroup@deploy2002: Started scap sync-world: Creating ibawiki (T376568)
- 11:00 eoghan@cumin2002: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
- 10:58 ladsgroup@deploy2002: Finished scap sync-world: Creating annwiki (T376332) (duration: 06m 45s)
- 10:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T376905)', diff saved to https://phabricator.wikimedia.org/P69782 and previous config saved to /var/cache/conftool/dbconfig/20241014-105755-ladsgroup.json
- 10:55 mvernon@cumin1002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe
- 10:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P69781 and previous config saved to /var/cache/conftool/dbconfig/20241014-105426-arnaudb.json
- 10:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P69780 and previous config saved to /var/cache/conftool/dbconfig/20241014-105421-arnaudb.json
- 10:52 ladsgroup@deploy2002: Started scap sync-world: Creating annwiki (T376332)
- 10:51 mvernon@cumin1002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe
- 10:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T376905)', diff saved to https://phabricator.wikimedia.org/P69779 and previous config saved to /var/cache/conftool/dbconfig/20241014-104941-ladsgroup.json
- 10:49 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 10:49 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 10:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T376905)', diff saved to https://phabricator.wikimedia.org/P69778 and previous config saved to /var/cache/conftool/dbconfig/20241014-104916-ladsgroup.json
- 10:48 ladsgroup@deploy2002: Finished scap sync-world: Creating tddwiki (T375422) (duration: 06m 46s)
- 10:44 oblivian@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert1002.wikimedia.org with reason: init - oblivian@cumin2002
- 10:44 oblivian@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert1002.wikimedia.org with reason: init - oblivian@cumin2002
- 10:42 ladsgroup@deploy2002: Started scap sync-world: Creating tddwiki (T375422)
- 10:40 ladsgroup@deploy2002: Finished scap sync-world: Creating nrwiki (T375087) (duration: 06m 54s)
- 10:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P69777 and previous config saved to /var/cache/conftool/dbconfig/20241014-103919-arnaudb.json
- 10:35 oblivian@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert2002.wikimedia.org with reason: init - oblivian@cumin2002
- 10:35 oblivian@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert2002.wikimedia.org with reason: init - oblivian@cumin2002
- 10:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P69776 and previous config saved to /var/cache/conftool/dbconfig/20241014-103410-ladsgroup.json
- 10:33 ladsgroup@deploy2002: Started scap sync-world: Creating nrwiki (T375087)
- 10:31 ladsgroup@deploy2002: Finished scap sync-world: Backport for Add namespace translations for Tai Nüa (tdd) (T375421) (duration: 06m 45s)
- 10:27 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 10:27 ladsgroup@deploy2002: ladsgroup: Backport for Add namespace translations for Tai Nüa (tdd) (T375421) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 10:25 ladsgroup@deploy2002: Started scap sync-world: Backport for Add namespace translations for Tai Nüa (tdd) (T375421)
- 10:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T367781)', diff saved to https://phabricator.wikimedia.org/P69775 and previous config saved to /var/cache/conftool/dbconfig/20241014-102412-arnaudb.json
- 10:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T367781)', diff saved to https://phabricator.wikimedia.org/P69774 and previous config saved to /var/cache/conftool/dbconfig/20241014-102256-arnaudb.json
- 10:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 10:22 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 10:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T367781)', diff saved to https://phabricator.wikimedia.org/P69773 and previous config saved to /var/cache/conftool/dbconfig/20241014-102234-arnaudb.json
- 10:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P69772 and previous config saved to /var/cache/conftool/dbconfig/20241014-101903-ladsgroup.json
- 10:17 ladsgroup@cumin1002: START - Cookbook sre.mysql.clone of db2194.codfw.wmnet onto db2227.codfw.wmnet
- 10:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling for reclone (T375652)', diff saved to https://phabricator.wikimedia.org/P69771 and previous config saved to /var/cache/conftool/dbconfig/20241014-101354-ladsgroup.json
- 10:13 eoghan@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
- 10:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling for reclone (T375652)', diff saved to https://phabricator.wikimedia.org/P69770 and previous config saved to /var/cache/conftool/dbconfig/20241014-101246-ladsgroup.json
- 10:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P69769 and previous config saved to /var/cache/conftool/dbconfig/20241014-100727-arnaudb.json
- 10:06 eoghan@cumin2002: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
- 10:06 eoghan@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists2001.wikimedia.org
- 10:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T376905)', diff saved to https://phabricator.wikimedia.org/P69768 and previous config saved to /var/cache/conftool/dbconfig/20241014-100356-ladsgroup.json
- 10:00 akosiaris: powercycle rdb1014 T376961
- 10:00 eoghan@cumin2002: START - Cookbook sre.hosts.reboot-single for host lists2001.wikimedia.org
- 10:00 oblivian@cumin2002: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) hiddenparma to alert2002.wikimedia.org with reason: init - oblivian@cumin2002
- 10:00 oblivian@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert2002.wikimedia.org with reason: init - oblivian@cumin2002
- 10:00 ladsgroup@deploy2002: Finished scap sync-world: Creating rskwiki (T374963) (duration: 18m 38s)
- 09:59 oblivian@cumin2002: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) hiddenparma to alert2002.wikimedia.org with reason: init - oblivian@cumin2002
- 09:59 oblivian@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert2002.wikimedia.org with reason: init - oblivian@cumin2002
- 09:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P69767 and previous config saved to /var/cache/conftool/dbconfig/20241014-095354-arnaudb.json
- 09:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 09:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 09:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T367781)', diff saved to https://phabricator.wikimedia.org/P69766 and previous config saved to /var/cache/conftool/dbconfig/20241014-095331-arnaudb.json
- 09:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P69765 and previous config saved to /var/cache/conftool/dbconfig/20241014-095220-arnaudb.json
- 09:41 ladsgroup@deploy2002: Started scap sync-world: Creating rskwiki (T374963)
- 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P69764 and previous config saved to /var/cache/conftool/dbconfig/20241014-093824-arnaudb.json
- 09:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T367781)', diff saved to https://phabricator.wikimedia.org/P69763 and previous config saved to /var/cache/conftool/dbconfig/20241014-093713-arnaudb.json
- 09:36 isaranto@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
- 09:35 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T367781)', diff saved to https://phabricator.wikimedia.org/P69762 and previous config saved to /var/cache/conftool/dbconfig/20241014-093459-arnaudb.json
- 09:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 09:34 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 09:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 09:34 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 09:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T367781)', diff saved to https://phabricator.wikimedia.org/P69761 and previous config saved to /var/cache/conftool/dbconfig/20241014-093418-arnaudb.json
- 09:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P69760 and previous config saved to /var/cache/conftool/dbconfig/20241014-092317-arnaudb.json
- 09:21 isaranto@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
- 09:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P69759 and previous config saved to /var/cache/conftool/dbconfig/20241014-091911-arnaudb.json
- 09:09 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 09:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T367781)', diff saved to https://phabricator.wikimedia.org/P69758 and previous config saved to /var/cache/conftool/dbconfig/20241014-090810-arnaudb.json
- 09:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P69757 and previous config saved to /var/cache/conftool/dbconfig/20241014-090403-arnaudb.json
- 09:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T376905)', diff saved to https://phabricator.wikimedia.org/P69756 and previous config saved to /var/cache/conftool/dbconfig/20241014-090340-ladsgroup.json
- 09:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 09:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 09:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 09:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 09:01 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubestagemaster2005.codfw.wmnet
- 08:58 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 08:55 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubestagemaster2005.codfw.wmnet
- 08:55 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubestagemaster2004.codfw.wmnet
- 08:49 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubestagemaster2004.codfw.wmnet
- 08:49 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubestagemaster2003.codfw.wmnet
- 08:49 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 08:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T367781)', diff saved to https://phabricator.wikimedia.org/P69755 and previous config saved to /var/cache/conftool/dbconfig/20241014-084856-arnaudb.json
- 08:48 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 08:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1195 (T367781)', diff saved to https://phabricator.wikimedia.org/P69754 and previous config saved to /var/cache/conftool/dbconfig/20241014-084643-arnaudb.json
- 08:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1195.eqiad.wmnet with reason: Maintenance
- 08:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1195.eqiad.wmnet with reason: Maintenance
- 08:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T367781)', diff saved to https://phabricator.wikimedia.org/P69753 and previous config saved to /var/cache/conftool/dbconfig/20241014-084620-arnaudb.json
- 08:43 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubestagemaster2003.codfw.wmnet
- 08:43 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 08:40 elukey@cumin1002: START - Cookbook sre.hosts.provision for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 08:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P69752 and previous config saved to /var/cache/conftool/dbconfig/20241014-083113-arnaudb.json
- 08:16 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 08:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P69751 and previous config saved to /var/cache/conftool/dbconfig/20241014-081606-arnaudb.json
- 08:13 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubestagemaster2003.codfw.wmnet
- 08:12 elukey@cumin1002: START - Cookbook sre.hosts.provision for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 08:12 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 08:11 elukey@cumin1002: START - Cookbook sre.hosts.provision for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 08:11 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 08:10 elukey@cumin1002: START - Cookbook sre.hosts.provision for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 08:10 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubestagemaster2004.codfw.wmnet
- 08:08 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubestagemaster2003.codfw.wmnet
- 08:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T367781)', diff saved to https://phabricator.wikimedia.org/P69750 and previous config saved to /var/cache/conftool/dbconfig/20241014-080744-arnaudb.json
- 08:07 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 08:07 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 08:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P69749 and previous config saved to /var/cache/conftool/dbconfig/20241014-080721-arnaudb.json
- 08:07 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubestagemaster2005.codfw.wmnet
- 08:02 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubestagemaster2004.codfw.wmnet
- 08:01 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubestagemaster2005.codfw.wmnet
- 08:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T367781)', diff saved to https://phabricator.wikimedia.org/P69748 and previous config saved to /var/cache/conftool/dbconfig/20241014-080059-arnaudb.json
- 08:00 jayme@cumin1002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM kubestagemaster2005.codfw.wmnet
- 08:00 jayme@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM kubestagemaster2005.codfw.wmnet
- 07:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T367781)', diff saved to https://phabricator.wikimedia.org/P69747 and previous config saved to /var/cache/conftool/dbconfig/20241014-075845-arnaudb.json
- 07:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 07:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 07:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T367781)', diff saved to https://phabricator.wikimedia.org/P69746 and previous config saved to /var/cache/conftool/dbconfig/20241014-075823-arnaudb.json
- 07:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 07:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 07:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P69745 and previous config saved to /var/cache/conftool/dbconfig/20241014-075214-arnaudb.json
- 07:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P69744 and previous config saved to /var/cache/conftool/dbconfig/20241014-074317-arnaudb.json
- 07:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P69743 and previous config saved to /var/cache/conftool/dbconfig/20241014-073707-arnaudb.json
- 07:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P69742 and previous config saved to /var/cache/conftool/dbconfig/20241014-072810-arnaudb.json
- 07:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P69741 and previous config saved to /var/cache/conftool/dbconfig/20241014-072201-arnaudb.json
- 07:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T367781)', diff saved to https://phabricator.wikimedia.org/P69740 and previous config saved to /var/cache/conftool/dbconfig/20241014-071302-arnaudb.json
- 07:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1184 (T367781)', diff saved to https://phabricator.wikimedia.org/P69739 and previous config saved to /var/cache/conftool/dbconfig/20241014-071048-arnaudb.json
- 07:10 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 07:10 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 07:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T367781)', diff saved to https://phabricator.wikimedia.org/P69738 and previous config saved to /var/cache/conftool/dbconfig/20241014-071026-arnaudb.json
- 06:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P69737 and previous config saved to /var/cache/conftool/dbconfig/20241014-065519-arnaudb.json
- 06:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P69736 and previous config saved to /var/cache/conftool/dbconfig/20241014-064012-arnaudb.json
- 06:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T367781)', diff saved to https://phabricator.wikimedia.org/P69735 and previous config saved to /var/cache/conftool/dbconfig/20241014-062505-arnaudb.json
- 06:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T367781)', diff saved to https://phabricator.wikimedia.org/P69734 and previous config saved to /var/cache/conftool/dbconfig/20241014-062249-arnaudb.json
- 06:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 06:22 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 06:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P69733 and previous config saved to /var/cache/conftool/dbconfig/20241014-062135-arnaudb.json
- 06:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 06:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 06:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 06:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 06:20 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 06:20 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 04:30 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 04:30 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 04:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 04:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 04:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T376905)', diff saved to https://phabricator.wikimedia.org/P69732 and previous config saved to /var/cache/conftool/dbconfig/20241014-042443-ladsgroup.json
- 04:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P69731 and previous config saved to /var/cache/conftool/dbconfig/20241014-040936-ladsgroup.json
- 03:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P69730 and previous config saved to /var/cache/conftool/dbconfig/20241014-035429-ladsgroup.json
- 03:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T376905)', diff saved to https://phabricator.wikimedia.org/P69729 and previous config saved to /var/cache/conftool/dbconfig/20241014-033922-ladsgroup.json
- 03:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1230 (T376905)', diff saved to https://phabricator.wikimedia.org/P69728 and previous config saved to /var/cache/conftool/dbconfig/20241014-033237-ladsgroup.json
- 03:32 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 03:32 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 03:27 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 03:27 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 03:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T376905)', diff saved to https://phabricator.wikimedia.org/P69727 and previous config saved to /var/cache/conftool/dbconfig/20241014-032710-ladsgroup.json
- 03:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P69726 and previous config saved to /var/cache/conftool/dbconfig/20241014-031203-ladsgroup.json
- 02:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P69725 and previous config saved to /var/cache/conftool/dbconfig/20241014-025656-ladsgroup.json
- 02:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T376905)', diff saved to https://phabricator.wikimedia.org/P69724 and previous config saved to /var/cache/conftool/dbconfig/20241014-024149-ladsgroup.json
- 02:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1213 (T376905)', diff saved to https://phabricator.wikimedia.org/P69723 and previous config saved to /var/cache/conftool/dbconfig/20241014-023616-ladsgroup.json
- 02:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
- 02:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
- 02:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T376905)', diff saved to https://phabricator.wikimedia.org/P69722 and previous config saved to /var/cache/conftool/dbconfig/20241014-023551-ladsgroup.json
- 02:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P69721 and previous config saved to /var/cache/conftool/dbconfig/20241014-022044-ladsgroup.json
- 02:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P69720 and previous config saved to /var/cache/conftool/dbconfig/20241014-020537-ladsgroup.json
- 01:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T376905)', diff saved to https://phabricator.wikimedia.org/P69719 and previous config saved to /var/cache/conftool/dbconfig/20241014-015030-ladsgroup.json
- 01:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T376905)', diff saved to https://phabricator.wikimedia.org/P69718 and previous config saved to /var/cache/conftool/dbconfig/20241014-014435-ladsgroup.json
- 01:44 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
- 01:44 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
- 01:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T376905)', diff saved to https://phabricator.wikimedia.org/P69717 and previous config saved to /var/cache/conftool/dbconfig/20241014-014410-ladsgroup.json
- 01:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P69716 and previous config saved to /var/cache/conftool/dbconfig/20241014-012903-ladsgroup.json
- 01:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P69715 and previous config saved to /var/cache/conftool/dbconfig/20241014-011356-ladsgroup.json
- 00:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T376905)', diff saved to https://phabricator.wikimedia.org/P69714 and previous config saved to /var/cache/conftool/dbconfig/20241014-005849-ladsgroup.json
- 00:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T376905)', diff saved to https://phabricator.wikimedia.org/P69713 and previous config saved to /var/cache/conftool/dbconfig/20241014-005056-ladsgroup.json
- 00:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 00:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 00:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T376905)', diff saved to https://phabricator.wikimedia.org/P69712 and previous config saved to /var/cache/conftool/dbconfig/20241014-005042-ladsgroup.json
- 00:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P69711 and previous config saved to /var/cache/conftool/dbconfig/20241014-003534-ladsgroup.json
- 00:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P69710 and previous config saved to /var/cache/conftool/dbconfig/20241014-002027-ladsgroup.json
- 00:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T376905)', diff saved to https://phabricator.wikimedia.org/P69709 and previous config saved to /var/cache/conftool/dbconfig/20241014-000520-ladsgroup.json
2024-10-13
- 23:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T376905)', diff saved to https://phabricator.wikimedia.org/P69708 and previous config saved to /var/cache/conftool/dbconfig/20241013-235726-ladsgroup.json
- 23:57 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 23:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 23:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T376905)', diff saved to https://phabricator.wikimedia.org/P69707 and previous config saved to /var/cache/conftool/dbconfig/20241013-235701-ladsgroup.json
- 23:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P69706 and previous config saved to /var/cache/conftool/dbconfig/20241013-234154-ladsgroup.json
- 23:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P69705 and previous config saved to /var/cache/conftool/dbconfig/20241013-232647-ladsgroup.json
- 23:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T376905)', diff saved to https://phabricator.wikimedia.org/P69704 and previous config saved to /var/cache/conftool/dbconfig/20241013-231140-ladsgroup.json
- 23:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T376905)', diff saved to https://phabricator.wikimedia.org/P69703 and previous config saved to /var/cache/conftool/dbconfig/20241013-230403-ladsgroup.json
- 23:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 23:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 23:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 23:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 12:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: maintenance
- 12:12 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: maintenance
- 12:11 arnaudb@cumin1002: dbctl commit (dc=all): 'depool db2147', diff saved to https://phabricator.wikimedia.org/P69702 and previous config saved to /var/cache/conftool/dbconfig/20241013-121154-arnaudb.json
- 10:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 10:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 10:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T367856)', diff saved to https://phabricator.wikimedia.org/P69701 and previous config saved to /var/cache/conftool/dbconfig/20241013-102205-ladsgroup.json
- 10:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P69700 and previous config saved to /var/cache/conftool/dbconfig/20241013-100658-ladsgroup.json
- 09:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P69699 and previous config saved to /var/cache/conftool/dbconfig/20241013-095151-ladsgroup.json
- 09:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T367856)', diff saved to https://phabricator.wikimedia.org/P69698 and previous config saved to /var/cache/conftool/dbconfig/20241013-093644-ladsgroup.json
2024-10-11
- 22:18 btullis@cumin1002: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P{cephosd100[3-5]*} and (A:cephosd)
- 21:38 btullis@cumin1002: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P{cephosd100[3-5]*} and (A:cephosd)
- 21:36 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cephosd1002.eqiad.wmnet
- 21:26 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host cephosd1002.eqiad.wmnet
- 21:24 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cephosd1001.eqiad.wmnet
- 21:14 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host cephosd1001.eqiad.wmnet
- 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudlb2004-dev.codfw.wmnet with OS bookworm
- 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 16:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 16:49 btullis@cumin1002: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd
- 16:40 mfossati@deploy2002: Finished deploy [airflow-dags/platform_eng@c1d2914]: bump section topics to v0.16.0 (duration: 00m 42s)
- 16:39 mfossati@deploy2002: Started deploy [airflow-dags/platform_eng@c1d2914]: bump section topics to v0.16.0
- 16:38 mfossati@deploy2002: Finished deploy [airflow-dags/platform_eng@c1d2914]: bump section topics to v0.16.0 (duration: 01m 06s)
- 16:38 mfossati@deploy2002: Started deploy [airflow-dags/platform_eng@c1d2914]: bump section topics to v0.16.0
- 16:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudlb2004-dev.codfw.wmnet with reason: host reimage
- 16:34 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudlb2004-dev.codfw.wmnet with reason: host reimage
- 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudlb2004-dev.codfw.wmnet with OS bookworm
- 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudlb2004-dev to codfw - jhancock@cumin2002"
- 16:14 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudlb2004-dev to codfw - jhancock@cumin2002"
- 16:11 kcvelaga@deploy2002: Finished deploy [airflow-dags/analytics_product@1fb69c4]: T376456 (duration: 01m 15s)
- 16:10 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 16:10 kcvelaga@deploy2002: Started deploy [airflow-dags/analytics_product@1fb69c4]: T376456
- 15:41 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudlb2004-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:40 btullis@cumin1002: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd
- 15:37 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:37 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for codfw cloudgw - cmooney@cumin1002"
- 15:37 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for codfw cloudgw - cmooney@cumin1002"
- 15:36 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudlb2004-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:34 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudlb2004-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:34 cmooney@cumin1002: START - Cookbook sre.dns.netbox
- 15:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudlb2004-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:48 eevans@deploy2002: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
- 14:48 eevans@deploy2002: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
- 14:47 urandom: upgrading data-gateway to v1.0.10
- 14:46 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
- 14:46 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
- 14:39 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
- 14:38 eevans@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
- 14:31 andrewtavis-wmde@deploy2002: Finished deploy [airflow-dags/wmde@c9a2532]: (no justification provided) (duration: 00m 25s)
- 14:30 andrewtavis-wmde@deploy2002: Started deploy [airflow-dags/wmde@c9a2532]: (no justification provided)
- 13:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: T376988', diff saved to https://phabricator.wikimedia.org/P69695 and previous config saved to /var/cache/conftool/dbconfig/20241011-135903-arnaudb.json
- 13:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudlb2004-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 75%: T376988', diff saved to https://phabricator.wikimedia.org/P69694 and previous config saved to /var/cache/conftool/dbconfig/20241011-134357-arnaudb.json
- 13:28 arnaudb@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 50%: T376988', diff saved to https://phabricator.wikimedia.org/P69693 and previous config saved to /var/cache/conftool/dbconfig/20241011-132852-arnaudb.json
- 13:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 25%: T376988', diff saved to https://phabricator.wikimedia.org/P69692 and previous config saved to /var/cache/conftool/dbconfig/20241011-131347-arnaudb.json
- 13:13 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "renamed k8s prefixes descriptions in Netbox - ayounsi@cumin1002"
- 13:12 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "renamed k8s prefixes descriptions in Netbox - ayounsi@cumin1002"
- 13:08 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudlb2004-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 12:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 10%: T376988', diff saved to https://phabricator.wikimedia.org/P69691 and previous config saved to /var/cache/conftool/dbconfig/20241011-125841-arnaudb.json
- 12:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 5%: T376988', diff saved to https://phabricator.wikimedia.org/P69690 and previous config saved to /var/cache/conftool/dbconfig/20241011-124336-arnaudb.json
- 12:37 hashar: Restarting Gerrit
- 12:34 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts scandium.eqiad.wmnet
- 12:34 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:34 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: scandium.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - akosiaris@cumin1002"
- 12:34 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: scandium.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - akosiaris@cumin1002"
- 12:28 arnaudb@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 2%: T376988', diff saved to https://phabricator.wikimedia.org/P69688 and previous config saved to /var/cache/conftool/dbconfig/20241011-122830-arnaudb.json
- 12:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 1%: T376988', diff saved to https://phabricator.wikimedia.org/P69687 and previous config saved to /var/cache/conftool/dbconfig/20241011-121325-arnaudb.json
- 11:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T367856)', diff saved to https://phabricator.wikimedia.org/P69686 and previous config saved to /var/cache/conftool/dbconfig/20241011-114446-ladsgroup.json
- 11:44 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
- 11:44 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
- 11:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T367856)', diff saved to https://phabricator.wikimedia.org/P69685 and previous config saved to /var/cache/conftool/dbconfig/20241011-114424-ladsgroup.json
- 11:36 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
- 11:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P69684 and previous config saved to /var/cache/conftool/dbconfig/20241011-112917-ladsgroup.json
- 11:27 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker2092.codfw.wmnet
- 11:27 cgoubert@cumin1002: START - Cookbook sre.hosts.remove-downtime for wikikube-worker2092.codfw.wmnet
- 11:26 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2092.codfw.wmnet
- 11:26 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2092.codfw.wmnet
- 11:20 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2092.codfw.wmnet with OS bullseye
- 11:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P69683 and previous config saved to /var/cache/conftool/dbconfig/20241011-111410-ladsgroup.json
- 11:02 fabfur@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet
- 10:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T367856)', diff saved to https://phabricator.wikimedia.org/P69682 and previous config saved to /var/cache/conftool/dbconfig/20241011-105903-ladsgroup.json
- 10:58 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2092.codfw.wmnet with reason: host reimage
- 10:57 fabfur@cumin1002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet
- 10:56 fabfur@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet
- 10:56 cgoubert@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 10:55 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2092.codfw.wmnet with reason: host reimage
- 10:53 fabfur@cumin1002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet
- 10:50 brouberol@cumin1002: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd
- 10:50 fabfur: enabled puppet on R:acme_chief::cert for T376800
- 10:50 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
- 10:47 fabfur@cumin1002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host acmechief2002.codfw.wmnet
- 10:44 fabfur@cumin1002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet
- 10:44 fabfur: rebooting acmechief1002|2002 (sequentially) (T376800)
- 10:37 fabfur@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet
- 10:37 fabfur@cumin1002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet
- 10:35 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2092.codfw.wmnet with OS bullseye
- 10:34 fabfur: disabled puppet on acmechief1002 (T376800)
- 10:33 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2175.codfw.wmnet with reason: index corruption
- 10:33 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2175.codfw.wmnet with reason: index corruption
- 10:31 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2092.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 10:27 jynus@cumin1002: dbctl commit (dc=all): 'depool db2175', diff saved to https://phabricator.wikimedia.org/P69680 and previous config saved to /var/cache/conftool/dbconfig/20241011-102706-jynus.json
- 10:26 fabfur: disabling puppet on R:acme_chief::cert for T376800
- 10:23 cgoubert@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-worker2092.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 09:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1211 (T367856)', diff saved to https://phabricator.wikimedia.org/P69678 and previous config saved to /var/cache/conftool/dbconfig/20241011-095847-ladsgroup.json
- 09:58 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on db1211.eqiad.wmnet with reason: Maintenance
- 09:58 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on db1211.eqiad.wmnet with reason: Maintenance
- 09:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T367856)', diff saved to https://phabricator.wikimedia.org/P69677 and previous config saved to /var/cache/conftool/dbconfig/20241011-095826-ladsgroup.json
- 09:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P69676 and previous config saved to /var/cache/conftool/dbconfig/20241011-094319-ladsgroup.json
- 09:41 brouberol@cumin1002: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd
- 09:38 akosiaris@cumin1002: START - Cookbook sre.hosts.decommission for hosts scandium.eqiad.wmnet
- 09:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P69675 and previous config saved to /var/cache/conftool/dbconfig/20241011-092812-ladsgroup.json
- 09:27 Dreamy_Jazz: Restarted MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
- 09:18 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1176.eqiad.wmnet with OS bullseye
- 09:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T367856)', diff saved to https://phabricator.wikimedia.org/P69674 and previous config saved to /var/cache/conftool/dbconfig/20241011-091305-ladsgroup.json
- 08:19 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 08:17 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 08:12 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 08:10 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1177.eqiad.wmnet with OS bullseye
- 08:10 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 08:02 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1176.eqiad.wmnet with OS bullseye
- 08:00 moritzm: upload ircstream 0.13.0+wmf12u2 to apt.wikimedia.org (sync to latest git and the async_broadcast feature branch) T376014
- 07:59 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1177.eqiad.wmnet with OS bullseye
- 07:56 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1177.eqiad.wmnet with OS bullseye
- 02:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T367781)', diff saved to https://phabricator.wikimedia.org/P69673 and previous config saved to /var/cache/conftool/dbconfig/20241011-021156-arnaudb.json
- 01:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P69672 and previous config saved to /var/cache/conftool/dbconfig/20241011-015649-arnaudb.json
- 01:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P69671 and previous config saved to /var/cache/conftool/dbconfig/20241011-014142-arnaudb.json
- 01:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T367781)', diff saved to https://phabricator.wikimedia.org/P69670 and previous config saved to /var/cache/conftool/dbconfig/20241011-012635-arnaudb.json
- 01:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2237 (T367781)', diff saved to https://phabricator.wikimedia.org/P69669 and previous config saved to /var/cache/conftool/dbconfig/20241011-012424-arnaudb.json
- 01:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2237.codfw.wmnet with reason: Maintenance
- 01:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2237.codfw.wmnet with reason: Maintenance
- 01:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P69668 and previous config saved to /var/cache/conftool/dbconfig/20241011-012401-arnaudb.json
- 01:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P69667 and previous config saved to /var/cache/conftool/dbconfig/20241011-010854-arnaudb.json
- 00:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P69666 and previous config saved to /var/cache/conftool/dbconfig/20241011-005347-arnaudb.json
- 00:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P69665 and previous config saved to /var/cache/conftool/dbconfig/20241011-003840-arnaudb.json
2024-10-10
- 23:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2219 (T367781)', diff saved to https://phabricator.wikimedia.org/P69664 and previous config saved to /var/cache/conftool/dbconfig/20241010-233814-arnaudb.json
- 23:38 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 23:37 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 23:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T367781)', diff saved to https://phabricator.wikimedia.org/P69663 and previous config saved to /var/cache/conftool/dbconfig/20241010-233752-arnaudb.json
- 23:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P69662 and previous config saved to /var/cache/conftool/dbconfig/20241010-232245-arnaudb.json
- 23:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P69661 and previous config saved to /var/cache/conftool/dbconfig/20241010-230738-arnaudb.json
- 22:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T367781)', diff saved to https://phabricator.wikimedia.org/P69660 and previous config saved to /var/cache/conftool/dbconfig/20241010-225231-arnaudb.json
- 22:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2210 (T367781)', diff saved to https://phabricator.wikimedia.org/P69659 and previous config saved to /var/cache/conftool/dbconfig/20241010-225019-arnaudb.json
- 22:50 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance
- 22:50 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance
- 22:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T367781)', diff saved to https://phabricator.wikimedia.org/P69658 and previous config saved to /var/cache/conftool/dbconfig/20241010-224957-arnaudb.json
- 22:37 cstone: payments-wiki upgraded from ebb42c67 to 40e4a592
- 22:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P69657 and previous config saved to /var/cache/conftool/dbconfig/20241010-223450-arnaudb.json
- 22:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P69656 and previous config saved to /var/cache/conftool/dbconfig/20241010-221943-arnaudb.json
- 22:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T367781)', diff saved to https://phabricator.wikimedia.org/P69655 and previous config saved to /var/cache/conftool/dbconfig/20241010-220437-arnaudb.json
- 22:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2206 (T367781)', diff saved to https://phabricator.wikimedia.org/P69654 and previous config saved to /var/cache/conftool/dbconfig/20241010-220125-arnaudb.json
- 22:01 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2206.codfw.wmnet with reason: Maintenance
- 22:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2206.codfw.wmnet with reason: Maintenance
- 22:01 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2199.codfw.wmnet with reason: Maintenance
- 22:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2199.codfw.wmnet with reason: Maintenance
- 22:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T367781)', diff saved to https://phabricator.wikimedia.org/P69653 and previous config saved to /var/cache/conftool/dbconfig/20241010-220043-arnaudb.json
- 21:52 jforrester@deploy2002: Finished deploy [integration/docroot@ff9e25a]: Add Codex PHP doc and source code link, for T375939 (duration: 00m 08s)
- 21:52 jforrester@deploy2002: Started deploy [integration/docroot@ff9e25a]: Add Codex PHP doc and source code link, for T375939
- 21:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P69652 and previous config saved to /var/cache/conftool/dbconfig/20241010-214536-arnaudb.json
- 21:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P69651 and previous config saved to /var/cache/conftool/dbconfig/20241010-213029-arnaudb.json
- 21:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T367781)', diff saved to https://phabricator.wikimedia.org/P69650 and previous config saved to /var/cache/conftool/dbconfig/20241010-211522-arnaudb.json
- 21:05 aqu@deploy2002: Finished deploy [airflow-dags/analytics@c9a2532]: Webrequest-Refine fix [airflow-dags@c9a2532e] (duration: 00m 51s)
- 21:04 aqu@deploy2002: Started deploy [airflow-dags/analytics@c9a2532]: Webrequest-Refine fix [airflow-dags@c9a2532e]
- 21:04 thcipriani@deploy2002: Finished scap sync-world: Backport for Update VE core submodule to master (c98f3a542) (T376901) (duration: 08m 56s)
- 20:59 thcipriani@deploy2002: jforrester, thcipriani: Continuing with sync
- 20:57 thcipriani@deploy2002: jforrester, thcipriani: Backport for Update VE core submodule to master (c98f3a542) (T376901) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:55 thcipriani@deploy2002: Started scap sync-world: Backport for Update VE core submodule to master (c98f3a542) (T376901)
- 20:27 eileen: config revision changed from 150b02a9 to 3c6d2054
- 20:23 thcipriani@deploy2002: Finished scap sync-world: Backport for REST: Make experimental endpoints available on beta and testwiki (T375512) (duration: 08m 34s)
- 20:18 thcipriani@deploy2002: bpirkle, thcipriani: Continuing with sync
- 20:16 thcipriani@deploy2002: bpirkle, thcipriani: Backport for REST: Make experimental endpoints available on beta and testwiki (T375512) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T367781)', diff saved to https://phabricator.wikimedia.org/P69649 and previous config saved to /var/cache/conftool/dbconfig/20241010-201456-arnaudb.json
- 20:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 20:14 thcipriani@deploy2002: Started scap sync-world: Backport for REST: Make experimental endpoints available on beta and testwiki (T375512)
- 20:14 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 20:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P69648 and previous config saved to /var/cache/conftool/dbconfig/20241010-201433-arnaudb.json
- 20:05 eileen: civicrm upgraded from 07dee21c to ff3144dd
- 19:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P69647 and previous config saved to /var/cache/conftool/dbconfig/20241010-195926-arnaudb.json
- 19:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P69646 and previous config saved to /var/cache/conftool/dbconfig/20241010-194419-arnaudb.json
- 19:43 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@4b69f50]: Stage Webrequest-Refine fix on test cluster [airflow-dags@4b69f503] (duration: 00m 13s)
- 19:43 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@4b69f50]: Stage Webrequest-Refine fix on test cluster [airflow-dags@4b69f503]
- 19:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P69645 and previous config saved to /var/cache/conftool/dbconfig/20241010-192912-arnaudb.json
- 19:23 rzl@deploy2002: Finished scap sync-world: chart version bump for 1078720 (duration: 02m 09s)
- 19:21 rzl@deploy2002: Started scap sync-world: chart version bump for 1078720
- 19:06 eileen: config revision changed from ae4a5be9 to 150b02a9
- 18:50 papaul: maintenance on mr1-eqiad complete
- 18:44 eileen: tools upgraded from 632bf430 to 62f2d170
- 18:29 eileen: tools upgraded from e9c05e30 to 632bf430
- 18:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T367781)', diff saved to https://phabricator.wikimedia.org/P69644 and previous config saved to /var/cache/conftool/dbconfig/20241010-182846-arnaudb.json
- 18:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 18:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 18:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 18:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 18:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T367781)', diff saved to https://phabricator.wikimedia.org/P69643 and previous config saved to /var/cache/conftool/dbconfig/20241010-182808-arnaudb.json
- 18:14 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet
- 18:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P69642 and previous config saved to /var/cache/conftool/dbconfig/20241010-181301-arnaudb.json
- 18:08 jhathaway@cumin1002: START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet
- 18:00 papaul: ongoing maintenance on mr1-eqiad
- 17:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P69641 and previous config saved to /var/cache/conftool/dbconfig/20241010-175754-arnaudb.json
- 17:57 root@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for dbprov1001.eqiad.wmnet: Renew puppet certificate - root@cumin1002
- 17:54 root@cumin1002: START - Cookbook sre.puppet.renew-cert for dbprov1001.eqiad.wmnet: Renew puppet certificate - root@cumin1002
- 17:47 swfrench@cumin2002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool echostore in eqiad: Repooling echostore after migration to service mesh - T376766
- 17:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T367781)', diff saved to https://phabricator.wikimedia.org/P69640 and previous config saved to /var/cache/conftool/dbconfig/20241010-174247-arnaudb.json
- 17:42 swfrench@cumin2002: START - Cookbook sre.discovery.service-route pool echostore in eqiad: Repooling echostore after migration to service mesh - T376766
- 17:39 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
- 17:39 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/echostore: apply
- 17:38 swfrench-wmf: removing echostore eqiad deployment (depooled) to unblock breaking change - T376766
- 17:34 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 17:34 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 17:34 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 17:33 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 17:33 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 17:32 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 17:25 swfrench@cumin2002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool echostore in eqiad: Depooling echostore for migration to service mesh - T376766
- 17:20 swfrench@cumin2002: START - Cookbook sre.discovery.service-route depool echostore in eqiad: Depooling echostore for migration to service mesh - T376766
- 17:04 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 17:04 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 17:04 swfrench@cumin2002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool echostore in codfw: Repooling echostore after migration to service mesh - T376766
- 16:59 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cephosd1001.eqiad.wmnet
- 16:58 swfrench@cumin2002: START - Cookbook sre.discovery.service-route pool echostore in codfw: Repooling echostore after migration to service mesh - T376766
- 16:53 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 16:53 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 16:53 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 16:51 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 16:51 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
- 16:51 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
- 16:50 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/echostore: apply
- 16:50 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/echostore: apply
- 16:49 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host cephosd1001.eqiad.wmnet
- 16:47 swfrench-wmf: removing echostore codfw deployment (depooled) to unblock breaking change - T376766
- 16:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T367781)', diff saved to https://phabricator.wikimedia.org/P69639 and previous config saved to /var/cache/conftool/dbconfig/20241010-164221-arnaudb.json
- 16:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 16:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 16:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T367781)', diff saved to https://phabricator.wikimedia.org/P69638 and previous config saved to /var/cache/conftool/dbconfig/20241010-164159-arnaudb.json
- 16:40 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage1003.eqiad.wmnet with OS bookworm
- 16:30 jhathaway@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
- 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P69637 and previous config saved to /var/cache/conftool/dbconfig/20241010-162652-arnaudb.json
- 16:23 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1003.eqiad.wmnet with reason: host reimage
- 16:23 jhathaway@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
- 16:21 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1003.eqiad.wmnet with reason: host reimage
- 16:18 swfrench@cumin2002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool echostore in codfw: Depooling echostore for migration to service mesh - T376766
- 16:13 swfrench@cumin2002: START - Cookbook sre.discovery.service-route depool echostore in codfw: Depooling echostore for migration to service mesh - T376766
- 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P69636 and previous config saved to /var/cache/conftool/dbconfig/20241010-161145-arnaudb.json
- 16:04 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestage1003.eqiad.wmnet with OS bookworm
- 16:03 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
- 16:02 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
- 15:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T367781)', diff saved to https://phabricator.wikimedia.org/P69635 and previous config saved to /var/cache/conftool/dbconfig/20241010-155638-arnaudb.json
- 15:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2140 (T367781)', diff saved to https://phabricator.wikimedia.org/P69634 and previous config saved to /var/cache/conftool/dbconfig/20241010-155426-arnaudb.json
- 15:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance
- 15:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance
- 15:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 15:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 15:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T367781)', diff saved to https://phabricator.wikimedia.org/P69633 and previous config saved to /var/cache/conftool/dbconfig/20241010-155345-arnaudb.json
- 15:53 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:47 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1002.eqiad.wmnet with OS bookworm
- 15:40 papaul: mr1-drmrs maintenance complete
- 15:39 dancy@deploy2002: Installation of scap version "4.110.0" completed for 211 hosts
- 15:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P69632 and previous config saved to /var/cache/conftool/dbconfig/20241010-153838-arnaudb.json
- 15:35 dancy@deploy2002: Installing scap version "4.110.0" for 211 hosts
- 15:33 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:28 jhathaway@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
- 15:25 jhathaway@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
- 15:23 sukhe@cumin1002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir
- 15:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P69631 and previous config saved to /var/cache/conftool/dbconfig/20241010-152331-arnaudb.json
- 15:15 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:13 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 15:13 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 15:13 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 15:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T367781)', diff saved to https://phabricator.wikimedia.org/P69630 and previous config saved to /var/cache/conftool/dbconfig/20241010-150824-arnaudb.json
- 15:08 jhathaway@cumin1002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
- 15:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2136 (T367781)', diff saved to https://phabricator.wikimedia.org/P69629 and previous config saved to /var/cache/conftool/dbconfig/20241010-150512-arnaudb.json
- 15:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2136.codfw.wmnet with reason: Maintenance
- 15:04 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2136.codfw.wmnet with reason: Maintenance
- 15:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 15:04 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 15:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T367781)', diff saved to https://phabricator.wikimedia.org/P69628 and previous config saved to /var/cache/conftool/dbconfig/20241010-150433-arnaudb.json
- 15:02 papaul: ongoing maintenance on mr1-drmrs
- 14:56 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@4b69f50]: Revert previous staging of Refine fixes on test cluster [airflow-dags@4b69f503] (duration: 00m 13s)
- 14:56 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@4b69f50]: Revert previous staging of Refine fixes on test cluster [airflow-dags@4b69f503]
- 14:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P69626 and previous config saved to /var/cache/conftool/dbconfig/20241010-144926-arnaudb.json
- 14:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T367781)', diff saved to https://phabricator.wikimedia.org/P69625 and previous config saved to /var/cache/conftool/dbconfig/20241010-143713-arnaudb.json
- 14:34 jhathaway@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest1002.eqiad.wmnet']
- 14:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P69624 and previous config saved to /var/cache/conftool/dbconfig/20241010-143419-arnaudb.json
- 14:28 jhathaway@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1002.eqiad.wmnet']
- 14:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P69623 and previous config saved to /var/cache/conftool/dbconfig/20241010-142206-arnaudb.json
- 14:19 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@4b69f50]: Stage Refine fixes on test cluster [airflow-dags@4b69f503] (duration: 00m 13s)
- 14:19 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@4b69f50]: Stage Refine fixes on test cluster [airflow-dags@4b69f503]
- 14:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T367781)', diff saved to https://phabricator.wikimedia.org/P69622