Server Admin Log

From Wikitech

2023-12-05

  • 21:12 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P54207 and previous config saved to /var/cache/conftool/dbconfig/20231205-211200-arnaudb.json
  • 21:11 jforrester@deploy2002: Finished scap: Backport for Revert "Do not try to use Thumbor on beta" (T344605), nlwikivoyage: Drop Listings extension (T352696), Drop Listings extension from Wikivoyages where unused (T352719) (duration: 08m 45s)
  • 21:04 jforrester@deploy2002: tgr and jforrester: Continuing with sync
  • 21:04 jforrester@deploy2002: tgr and jforrester: Backport for Revert "Do not try to use Thumbor on beta" (T344605), nlwikivoyage: Drop Listings extension (T352696), Drop Listings extension from Wikivoyages where unused (T352719) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:02 jforrester@deploy2002: Started scap: Backport for Revert "Do not try to use Thumbor on beta" (T344605), nlwikivoyage: Drop Listings extension (T352696), Drop Listings extension from Wikivoyages where unused (T352719)
  • 20:58 inflatador: bking@prometheus1006 disable puppet for troubleshooting T347355
  • 20:56 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P54206 and previous config saved to /var/cache/conftool/dbconfig/20231205-205654-arnaudb.json
  • 20:53 inflatador: bking@prometheus1006 reload prometheus-blackbox service T347355
  • 20:41 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T348183)', diff saved to https://phabricator.wikimedia.org/P54205 and previous config saved to /var/cache/conftool/dbconfig/20231205-204147-arnaudb.json
  • 20:32 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1219 (T348183)', diff saved to https://phabricator.wikimedia.org/P54204 and previous config saved to /var/cache/conftool/dbconfig/20231205-203158-arnaudb.json
  • 20:31 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 20:31 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 20:31 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T348183)', diff saved to https://phabricator.wikimedia.org/P54203 and previous config saved to /var/cache/conftool/dbconfig/20231205-203136-arnaudb.json
  • 20:16 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P54202 and previous config saved to /var/cache/conftool/dbconfig/20231205-201629-arnaudb.json
  • 20:01 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P54201 and previous config saved to /var/cache/conftool/dbconfig/20231205-200123-arnaudb.json
  • 19:46 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T348183)', diff saved to https://phabricator.wikimedia.org/P54200 and previous config saved to /var/cache/conftool/dbconfig/20231205-194616-arnaudb.json
  • 19:36 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1218 (T348183)', diff saved to https://phabricator.wikimedia.org/P54199 and previous config saved to /var/cache/conftool/dbconfig/20231205-193627-arnaudb.json
  • 19:36 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 19:36 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 19:36 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T348183)', diff saved to https://phabricator.wikimedia.org/P54198 and previous config saved to /var/cache/conftool/dbconfig/20231205-193604-arnaudb.json
  • 19:20 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P54197 and previous config saved to /var/cache/conftool/dbconfig/20231205-192057-arnaudb.json
  • 19:05 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P54196 and previous config saved to /var/cache/conftool/dbconfig/20231205-190551-arnaudb.json
  • 18:50 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T348183)', diff saved to https://phabricator.wikimedia.org/P54195 and previous config saved to /var/cache/conftool/dbconfig/20231205-185044-arnaudb.json
  • 18:41 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1207 (T348183)', diff saved to https://phabricator.wikimedia.org/P54194 and previous config saved to /var/cache/conftool/dbconfig/20231205-184108-arnaudb.json
  • 18:41 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 18:40 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 18:40 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T348183)', diff saved to https://phabricator.wikimedia.org/P54193 and previous config saved to /var/cache/conftool/dbconfig/20231205-184045-arnaudb.json
  • 18:25 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P54192 and previous config saved to /var/cache/conftool/dbconfig/20231205-182539-arnaudb.json
  • 18:13 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS bullseye
  • 18:10 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P54191 and previous config saved to /var/cache/conftool/dbconfig/20231205-181032-arnaudb.json
  • 17:55 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T348183)', diff saved to https://phabricator.wikimedia.org/P54190 and previous config saved to /var/cache/conftool/dbconfig/20231205-175526-arnaudb.json
  • 17:52 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
  • 17:49 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
  • 17:41 vgutierrez: rolling restart of text|secondary LVS on drmrs effectively enabling IPIP encapsulation for ncredir@drmrs - T351069
  • 17:29 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 17:29 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 17:29 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 17:29 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS bullseye
  • 17:28 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 17:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['testhost2001']
  • 17:15 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['testhost2001']
  • 17:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['testhost2001']
  • 17:11 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4052.ulsfo.wmnet with OS bullseye
  • 17:00 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host moss-be1002.eqiad.wmnet with OS bookworm
  • 16:55 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1206 (T348183)', diff saved to https://phabricator.wikimedia.org/P54189 and previous config saved to /var/cache/conftool/dbconfig/20231205-165503-arnaudb.json
  • 16:54 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 16:54 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 16:54 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T348183)', diff saved to https://phabricator.wikimedia.org/P54188 and previous config saved to /var/cache/conftool/dbconfig/20231205-165439-arnaudb.json
  • 16:52 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS bullseye
  • 16:52 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4052.ulsfo.wmnet with OS bullseye
  • 16:47 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS bullseye
  • 16:42 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['testhost2001']
  • 16:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host testhost2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:39 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P54187 and previous config saved to /var/cache/conftool/dbconfig/20231205-163933-arnaudb.json
  • 16:37 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on moss-be1002.eqiad.wmnet with reason: host reimage
  • 16:34 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on moss-be1002.eqiad.wmnet with reason: host reimage
  • 16:24 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P54186 and previous config saved to /var/cache/conftool/dbconfig/20231205-162426-arnaudb.json
  • 16:24 claime: Rolling back k8s-ingress-dse - restarting pybal on lvs1019 - T352639
  • 16:18 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:18 claime: Rolling back k8s-ingress-dse - restarting pybal on lvs1020 - T352639
  • 16:18 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:18 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:17 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:14 samtar@deploy2002: Finished scap: Backport for .well-known: Add F-Droid signature to assetlinks.json (T346951) (duration: 07m 53s)
  • 16:11 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: sync
  • 16:09 elukey@deploy2002: helmfile [codfw] START helmfile.d/services/recommendation-api: sync
  • 16:09 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: sync
  • 16:09 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T348183)', diff saved to https://phabricator.wikimedia.org/P54185 and previous config saved to /var/cache/conftool/dbconfig/20231205-160920-arnaudb.json
  • 16:09 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/recommendation-api: sync
  • 16:08 samtar@deploy2002: samtar: Continuing with sync
  • 16:08 samtar@deploy2002: samtar: Backport for .well-known: Add F-Droid signature to assetlinks.json (T346951) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:07 samtar@deploy2002: Started scap: Backport for .well-known: Add F-Droid signature to assetlinks.json (T346951)
  • 16:01 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host testhost2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:00 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:00 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding testhost2001 to codfw - jhancock@cumin2002"
  • 15:59 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding testhost2001 to codfw - jhancock@cumin2002"
  • 15:59 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1196 (T348183)', diff saved to https://phabricator.wikimedia.org/P54184 and previous config saved to /var/cache/conftool/dbconfig/20231205-155858-arnaudb.json
  • 15:58 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 15:58 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 15:58 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 15:58 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 15:58 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T348183)', diff saved to https://phabricator.wikimedia.org/P54183 and previous config saved to /var/cache/conftool/dbconfig/20231205-155814-arnaudb.json
  • 15:57 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 15:56 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 15:56 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
  • 15:56 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 15:56 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 15:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cp4040.ulsfo.wmnet
  • 15:49 claime: sudo confctl select "service=kubesvc,cluster=dse-k8s" set/pooled=inactive - T352639
  • 15:45 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cp4040.ulsfo.wmnet
  • 15:43 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P54182 and previous config saved to /var/cache/conftool/dbconfig/20231205-154308-arnaudb.json
  • 15:42 moritzm: installing monitoring-plugins bugfix updates from Bookworm point release
  • 15:42 claime: Manually restarting pybal on lvs1020 - T352639
  • 15:39 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host moss-be1002.eqiad.wmnet with OS bookworm
  • 15:31 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1471.eqiad.wmnet with OS bullseye
  • 15:29 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 15:29 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['sessionstore2005']
  • 15:29 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sessionstore2005']
  • 15:29 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sessionstore2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:29 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['sessionstore2006']
  • 15:28 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sessionstore2006']
  • 15:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sessionstore2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:28 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P54181 and previous config saved to /var/cache/conftool/dbconfig/20231205-152801-arnaudb.json
  • 15:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host aqs2001.codfw.wmnet
  • 15:22 claime: Manually restarting pybal on lvs1019 - T352639
  • 15:21 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 15:20 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 15:18 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 15:17 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 15:16 claime: Manually restarting pybal on lvs1020 - T352639
  • 15:15 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
  • 15:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host aqs2001.codfw.wmnet
  • 15:15 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
  • 15:13 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1471.eqiad.wmnet with reason: host reimage
  • 15:12 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T348183)', diff saved to https://phabricator.wikimedia.org/P54180 and previous config saved to /var/cache/conftool/dbconfig/20231205-151255-arnaudb.json
  • 15:12 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 15:11 cgoubert@cumin1001: END (FAIL) - Cookbook sre.loadbalancer.restart-pybal (exit_code=1) rolling-restart of pybal on P{lvs[1018,1020].eqiad.wmnet} and A:lvs (T352639)
  • 15:11 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 15:10 kamila@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1471.eqiad.wmnet with reason: host reimage
  • 15:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sessionstore2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sessionstore2004.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:06 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 15:06 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 15:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sessionstore2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cp4043.ulsfo.wmnet
  • 15:02 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1186 (T348183)', diff saved to https://phabricator.wikimedia.org/P54179 and previous config saved to /var/cache/conftool/dbconfig/20231205-150243-arnaudb.json
  • 15:02 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 15:02 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 15:02 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T348183)', diff saved to https://phabricator.wikimedia.org/P54178 and previous config saved to /var/cache/conftool/dbconfig/20231205-150220-arnaudb.json
  • 15:01 cgoubert@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs[1018,1020].eqiad.wmnet} and A:lvs (T352639)
  • 14:58 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: sync
  • 14:58 kamila@cumin1001: START - Cookbook sre.hosts.reimage for host mw1471.eqiad.wmnet with OS bullseye
  • 14:57 elukey@deploy2002: helmfile [codfw] START helmfile.d/services/recommendation-api: sync
  • 14:57 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: sync
  • 14:57 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/recommendation-api: sync
  • 14:55 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sessionstore2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:55 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sessionstore2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:55 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sessionstore2004.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:54 brouberol: adding k8s-ingress-dse backend to LVS - T352639
  • 14:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cp4043.ulsfo.wmnet
  • 14:47 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P54177 and previous config saved to /var/cache/conftool/dbconfig/20231205-144714-arnaudb.json
  • 14:45 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/recommendation-api: sync
  • 14:45 elukey@deploy2002: helmfile [staging] START helmfile.d/services/recommendation-api: sync
  • 14:44 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:44 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sessionstore2004-6 to codfw - jhancock@cumin2002"
  • 14:43 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sessionstore2004-6 to codfw - jhancock@cumin2002"
  • 14:41 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: redis::misc::master
  • 14:38 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ceph2002']
  • 14:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:32 urbanecm@deploy2002: Finished scap: Backport for User impact: update quantizeViews to process small series of view data (T352349), Add maintenance script to import existing files to scan table (T350863), Only allow drawing and bitmap media types to be scanned (T352234) (duration: 08m 55s)
  • 14:32 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P54176 and previous config saved to /var/cache/conftool/dbconfig/20231205-143207-arnaudb.json
  • 14:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: redis::misc::master
  • 14:29 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ceph2002']
  • 14:27 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ceph2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:26 urbanecm@deploy2002: kharlan and urbanecm: Continuing with sync
  • 14:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ceph2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:25 urbanecm@deploy2002: kharlan and urbanecm: Backport for User impact: update quantizeViews to process small series of view data (T352349), Add maintenance script to import existing files to scan table (T350863), Only allow drawing and bitmap media types to be scanned (T352234) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:24 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ceph2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:23 urbanecm@deploy2002: Started scap: Backport for User impact: update quantizeViews to process small series of view data (T352349), Add maintenance script to import existing files to scan table (T350863), Only allow drawing and bitmap media types to be scanned (T352234)
  • 14:20 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:19 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:17 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T348183)', diff saved to https://phabricator.wikimedia.org/P54175 and previous config saved to /var/cache/conftool/dbconfig/20231205-141701-arnaudb.json
  • 14:13 urbanecm@deploy2002: Finished scap: Backport for Growth: Enable Welcome survey user research for ar/en/es (T351266) (duration: 09m 33s)
  • 14:07 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T348183)', diff saved to https://phabricator.wikimedia.org/P54174 and previous config saved to /var/cache/conftool/dbconfig/20231205-140742-arnaudb.json
  • 14:07 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:07 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 14:07 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:07 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T348183)', diff saved to https://phabricator.wikimedia.org/P54173 and previous config saved to /var/cache/conftool/dbconfig/20231205-140720-arnaudb.json
  • 14:06 urbanecm@deploy2002: urbanecm: Backport for Growth: Enable Welcome survey user research for ar/en/es (T351266) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:06 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/recommendation-api: sync
  • 14:05 elukey@deploy2002: helmfile [staging] START helmfile.d/services/recommendation-api: sync
  • 14:04 urbanecm@deploy2002: Started scap: Backport for Growth: Enable Welcome survey user research for ar/en/es (T351266)
  • 14:03 moritzm: installing cups security updates
  • 13:52 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P54172 and previous config saved to /var/cache/conftool/dbconfig/20231205-135213-arnaudb.json
  • 13:51 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cp4048.ulsfo.wmnet
  • 13:50 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1078.eqiad.wmnet with OS bullseye
  • 13:50 jclark@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
  • 13:48 jclark@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
  • 13:48 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1079.eqiad.wmnet with OS bullseye
  • 13:48 jclark@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
  • 13:48 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1470.eqiad.wmnet with OS bullseye
  • 13:44 jclark@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
  • 13:43 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1465.eqiad.wmnet with OS bullseye
  • 13:41 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cp4048.ulsfo.wmnet
  • 13:38 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1464.eqiad.wmnet with OS bullseye
  • 13:37 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P54171 and previous config saved to /var/cache/conftool/dbconfig/20231205-133706-arnaudb.json
  • 13:30 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1470.eqiad.wmnet with reason: host reimage
  • 13:27 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1078.eqiad.wmnet with reason: host reimage
  • 13:27 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1076.eqiad.wmnet with OS bullseye
  • 13:27 jclark@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
  • 13:26 kamila@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1470.eqiad.wmnet with reason: host reimage
  • 13:26 jclark@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
  • 13:24 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1465.eqiad.wmnet with reason: host reimage
  • 13:24 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1079.eqiad.wmnet with reason: host reimage
  • 13:24 jclark@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1078.eqiad.wmnet with reason: host reimage
  • 13:23 jclark@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1079.eqiad.wmnet with reason: host reimage
  • 13:22 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T348183)', diff saved to https://phabricator.wikimedia.org/P54169 and previous config saved to /var/cache/conftool/dbconfig/20231205-132200-arnaudb.json
  • 13:21 kamila@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1465.eqiad.wmnet with reason: host reimage
  • 13:21 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1464.eqiad.wmnet with reason: host reimage
  • 13:18 kamila@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1464.eqiad.wmnet with reason: host reimage
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: redis::misc::slave
  • 13:14 kamila@cumin1001: START - Cookbook sre.hosts.reimage for host mw1470.eqiad.wmnet with OS bullseye
  • 13:12 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T348183)', diff saved to https://phabricator.wikimedia.org/P54168 and previous config saved to /var/cache/conftool/dbconfig/20231205-131240-arnaudb.json
  • 13:12 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 13:12 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 13:10 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1078.eqiad.wmnet with OS bullseye
  • 13:09 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1079.eqiad.wmnet with OS bullseye
  • 13:08 kamila@cumin1001: START - Cookbook sre.hosts.reimage for host mw1465.eqiad.wmnet with OS bullseye
  • 13:07 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1076.eqiad.wmnet with reason: host reimage
  • 13:06 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2435.codfw.wmnet with OS bullseye
  • 13:06 kamila@cumin1001: START - Cookbook sre.hosts.reimage for host mw1464.eqiad.wmnet with OS bullseye
  • 13:04 cmooney@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:04 cmooney@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update entry for sretest2003. - cmooney@cumin2002"
  • 13:04 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 13:04 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 13:04 jclark@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1076.eqiad.wmnet with reason: host reimage
  • 13:04 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 13:04 cmooney@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update entry for sretest2003. - cmooney@cumin2002"
  • 13:03 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 13:02 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1463.eqiad.wmnet with OS bullseye
  • 12:59 cmooney@cumin2002: START - Cookbook sre.dns.netbox
  • 12:58 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2434.codfw.wmnet with OS bullseye
  • 12:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: redis::misc::slave
  • 12:56 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 12:56 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 12:56 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T348183)', diff saved to https://phabricator.wikimedia.org/P54167 and previous config saved to /var/cache/conftool/dbconfig/20231205-125641-arnaudb.json
  • 12:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cp4042.ulsfo.wmnet
  • 12:50 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2424.codfw.wmnet with OS bullseye
  • 12:50 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1076.eqiad.wmnet with OS bullseye
  • 12:47 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2435.codfw.wmnet with reason: host reimage
  • 12:47 ladsgroup@deploy2002: Finished scap: Backport for Set migration of pagelinks on large wikis of s5 to read new (T351237) (duration: 12m 30s)
  • 12:45 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 12:45 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2423.codfw.wmnet with OS bullseye
  • 12:45 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 12:44 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1463.eqiad.wmnet with reason: host reimage
  • 12:42 kamila@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2435.codfw.wmnet with reason: host reimage
  • 12:41 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P54165 and previous config saved to /var/cache/conftool/dbconfig/20231205-124134-arnaudb.json
  • 12:41 kamila@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1463.eqiad.wmnet with reason: host reimage
  • 12:40 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 12:39 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2434.codfw.wmnet with reason: host reimage
  • 12:37 ladsgroup@deploy2002: ladsgroup: Backport for Set migration of pagelinks on large wikis of s5 to read new (T351237) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:36 kamila@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2434.codfw.wmnet with reason: host reimage
  • 12:34 ladsgroup@deploy2002: Started scap: Backport for Set migration of pagelinks on large wikis of s5 to read new (T351237)
  • 12:32 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cp4042.ulsfo.wmnet
  • 12:31 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2424.codfw.wmnet with reason: host reimage
  • 12:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cp4051.ulsfo.wmnet
  • 12:28 kamila@cumin1001: START - Cookbook sre.hosts.reimage for host mw1463.eqiad.wmnet with OS bullseye
  • 12:28 kamila@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2424.codfw.wmnet with reason: host reimage
  • 12:27 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 12:26 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2423.codfw.wmnet with reason: host reimage
  • 12:26 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P54164 and previous config saved to /var/cache/conftool/dbconfig/20231205-122628-arnaudb.json
  • 12:26 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 12:25 kamila@cumin1001: START - Cookbook sre.hosts.reimage for host mw2435.codfw.wmnet with OS bullseye
  • 12:24 moritzm: installing unbound bugfix updates from Bookworm point release
  • 12:23 kamila@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2423.codfw.wmnet with reason: host reimage
  • 12:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cp4051.ulsfo.wmnet
  • 12:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cp4039.ulsfo.wmnet
  • 12:18 kamila@cumin1001: START - Cookbook sre.hosts.reimage for host mw2434.codfw.wmnet with OS bullseye
  • 12:11 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T348183)', diff saved to https://phabricator.wikimedia.org/P54163 and previous config saved to /var/cache/conftool/dbconfig/20231205-121121-arnaudb.json
  • 12:10 kamila@cumin1001: START - Cookbook sre.hosts.reimage for host mw2424.codfw.wmnet with OS bullseye
  • 12:07 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:06 kamila@cumin1001: START - Cookbook sre.hosts.reimage for host mw2423.codfw.wmnet with OS bullseye
  • 12:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cp4039.ulsfo.wmnet
  • 12:02 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T348183)', diff saved to https://phabricator.wikimedia.org/P54162 and previous config saved to /var/cache/conftool/dbconfig/20231205-120206-arnaudb.json
  • 12:02 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 12:01 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 12:01 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T348183)', diff saved to https://phabricator.wikimedia.org/P54161 and previous config saved to /var/cache/conftool/dbconfig/20231205-120145-arnaudb.json
  • 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cp4049.ulsfo.wmnet
  • 11:53 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 11:52 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 11:51 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 11:51 kamila@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 11:50 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cp4049.ulsfo.wmnet
  • 11:46 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P54160 and previous config saved to /var/cache/conftool/dbconfig/20231205-114638-arnaudb.json
  • 11:40 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 11:40 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 11:40 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 11:40 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 11:38 ladsgroup@deploy2002: Finished scap: Backport for Bump ParserCache TTL back to 30 days (T280604) (duration: 07m 47s)
  • 11:33 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:32 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 11:32 ladsgroup@deploy2002: ladsgroup: Backport for Bump ParserCache TTL back to 30 days (T280604) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:31 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P54159 and previous config saved to /var/cache/conftool/dbconfig/20231205-113132-arnaudb.json
  • 11:30 ladsgroup@deploy2002: Started scap: Backport for Bump ParserCache TTL back to 30 days (T280604)
  • 11:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1023.eqiad.wmnet with OS bookworm
  • 11:17 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:16 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:16 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T348183)', diff saved to https://phabricator.wikimedia.org/P54158 and previous config saved to /var/cache/conftool/dbconfig/20231205-111625-arnaudb.json
  • 11:16 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:15 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:15 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbproxy1023.eqiad.wmnet with reason: host reimage
  • 11:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbproxy1023.eqiad.wmnet with reason: host reimage
  • 11:08 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner: sync
  • 11:08 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner: sync
  • 11:07 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner: sync
  • 11:07 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner: sync
  • 11:04 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T348183)', diff saved to https://phabricator.wikimedia.org/P54157 and previous config saved to /var/cache/conftool/dbconfig/20231205-110448-arnaudb.json
  • 11:04 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 11:04 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 11:04 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 (T348183)', diff saved to https://phabricator.wikimedia.org/P54156 and previous config saved to /var/cache/conftool/dbconfig/20231205-110426-arnaudb.json
  • 11:02 mvernon@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host moss-be1002.eqiad.wmnet with OS bookworm
  • 10:54 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1023.eqiad.wmnet with OS bookworm
  • 10:49 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P54155 and previous config saved to /var/cache/conftool/dbconfig/20231205-104919-arnaudb.json
  • 10:45 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
  • 10:34 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P54154 and previous config saved to /var/cache/conftool/dbconfig/20231205-103413-arnaudb.json
  • 10:21 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host moss-be1002.eqiad.wmnet with OS bookworm
  • 10:20 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host moss-be1003.eqiad.wmnet with OS bookworm
  • 10:19 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 (T348183)', diff saved to https://phabricator.wikimedia.org/P54153 and previous config saved to /var/cache/conftool/dbconfig/20231205-101906-arnaudb.json
  • 10:07 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1132 (T348183)', diff saved to https://phabricator.wikimedia.org/P54152 and previous config saved to /var/cache/conftool/dbconfig/20231205-100744-arnaudb.json
  • 10:07 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
  • 10:07 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
  • 10:07 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T348183)', diff saved to https://phabricator.wikimedia.org/P54151 and previous config saved to /var/cache/conftool/dbconfig/20231205-100722-arnaudb.json
  • 10:05 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 15305
  • 10:02 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on moss-be1003.eqiad.wmnet with reason: host reimage
  • 10:02 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 15305
  • 09:57 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on moss-be1003.eqiad.wmnet with reason: host reimage
  • 09:54 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 63927
  • 09:52 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P54150 and previous config saved to /var/cache/conftool/dbconfig/20231205-095215-arnaudb.json
  • 09:51 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 63927
  • 09:42 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host moss-be1003.eqiad.wmnet with OS bookworm
  • 09:37 brouberol: running authdns-update on dns1004.wikimedia.org - T352639
  • 09:37 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P54149 and previous config saved to /var/cache/conftool/dbconfig/20231205-093709-arnaudb.json
  • 09:22 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T348183)', diff saved to https://phabricator.wikimedia.org/P54148 and previous config saved to /var/cache/conftool/dbconfig/20231205-092202-arnaudb.json
  • 09:12 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1128 (T348183)', diff saved to https://phabricator.wikimedia.org/P54147 and previous config saved to /var/cache/conftool/dbconfig/20231205-091232-arnaudb.json
  • 09:12 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1128.eqiad.wmnet with reason: Maintenance
  • 09:12 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1128.eqiad.wmnet with reason: Maintenance
  • 09:06 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 58952
  • 09:05 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 58952
  • 09:04 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 09:03 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 08:59 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
  • 08:26 marostegui: Failover m2-master dbproxy1023.eqiad.wmnet -> dbproxy1025.eqiad.wmnet T351864
  • 06:55 vgutierrez: rolling restart of text|secondary LVS on eqsin effectively enabling IPIP encapsulation for ncredir@eqsin - T351069
  • 06:23 marostegui: Failover m5 from db1119 to db1176 - T352631
  • 06:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2135,2160].codfw.wmnet,db[1119,1176,1217].eqiad.wmnet with reason: m5 master switch T352631
  • 06:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2135,2160].codfw.wmnet,db[1119,1176,1217].eqiad.wmnet with reason: m5 master switch T352631
  • 01:18 mutante: LDAP - added user xqt to group nda (T348520)
  • 01:12 ejegg: payments-wiki upgraded from 5284fc99 to 1d24dc90
  • 00:06 eevans@cumin1001: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host restbase2028.codfw.wmnet

2023-12-04

  • 23:53 eevans@cumin1001: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host restbase2028.codfw.wmnet
  • 23:52 eevans@cumin1001: START - Cookbook sre.puppet.migrate-host for host restbase2028.codfw.wmnet
  • 22:53 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T348183)', diff saved to https://phabricator.wikimedia.org/P54146 and previous config saved to /var/cache/conftool/dbconfig/20231204-225336-arnaudb.json
  • 22:53 eileen: civicrm upgraded from 83816165 to 297a091d
  • 22:38 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P54145 and previous config saved to /var/cache/conftool/dbconfig/20231204-223830-arnaudb.json
  • 22:23 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P54144 and previous config saved to /var/cache/conftool/dbconfig/20231204-222323-arnaudb.json
  • 22:08 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T348183)', diff saved to https://phabricator.wikimedia.org/P54142 and previous config saved to /var/cache/conftool/dbconfig/20231204-220817-arnaudb.json
  • 22:03 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db2189 (T348183)', diff saved to https://phabricator.wikimedia.org/P54141 and previous config saved to /var/cache/conftool/dbconfig/20231204-220345-arnaudb.json
  • 22:03 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 22:03 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 22:03 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T348183)', diff saved to https://phabricator.wikimedia.org/P54140 and previous config saved to /var/cache/conftool/dbconfig/20231204-220322-arnaudb.json
  • 21:58 ebernhardson@deploy2002: Finished scap: Backport for Always load transcode state from db when opting in to primary db (duration: 08m 37s)
  • 21:52 ebernhardson@deploy2002: ebernhardson and brion: Continuing with sync
  • 21:51 ebernhardson@deploy2002: ebernhardson and brion: Backport for Always load transcode state from db when opting in to primary db synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:50 ebernhardson@deploy2002: Started scap: Backport for Always load transcode state from db when opting in to primary db
  • 21:49 ebernhardson@deploy2002: Finished scap: Backport for cirrus: Enable event bus bridge on more wikis (T352335) (duration: 09m 23s)
  • 21:48 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P54138 and previous config saved to /var/cache/conftool/dbconfig/20231204-214816-arnaudb.json
  • 21:47 ryankemper: T351503 Setting partition count to 5: `ryankemper@kafka-main2001:~$ kafka topics --alter --topic codfw.mediawiki.cirrussearch.page_rerender.v1 --partitions 5`
  • 21:47 ryankemper: T351503 Setting partition count to 5: `ryankemper@kafka-main2001:~$ kafka topics --alter --topic eqiad.mediawiki.cirrussearch.page_rerender.v1 --partitions 5`
  • 21:42 ebernhardson@deploy2002: ebernhardson: Continuing with sync
  • 21:41 ebernhardson@deploy2002: ebernhardson: Backport for cirrus: Enable event bus bridge on more wikis (T352335) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:39 ebernhardson@deploy2002: Started scap: Backport for cirrus: Enable event bus bridge on more wikis (T352335)
  • 21:33 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P54137 and previous config saved to /var/cache/conftool/dbconfig/20231204-213309-arnaudb.json
  • 21:27 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:27 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:19 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1077.eqiad.wmnet with OS bullseye
  • 21:19 pt1979@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1001"
  • 21:18 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T348183)', diff saved to https://phabricator.wikimedia.org/P54136 and previous config saved to /var/cache/conftool/dbconfig/20231204-211803-arnaudb.json
  • 21:14 pt1979@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1001"
  • 21:13 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db2175 (T348183)', diff saved to https://phabricator.wikimedia.org/P54135 and previous config saved to /var/cache/conftool/dbconfig/20231204-211305-arnaudb.json
  • 21:12 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 21:12 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 21:12 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T348183)', diff saved to https://phabricator.wikimedia.org/P54134 and previous config saved to /var/cache/conftool/dbconfig/20231204-211241-arnaudb.json
  • 21:09 ryankemper: T351503 Setting partition count to 5: `ryankemper@kafka-main1001:~$ kafka topics --alter --topic codfw.mediawiki.cirrussearch.page_rerender.v1 --partitions 5`
  • 21:06 ryankemper: T351503 Setting partition count to 5: `ryankemper@kafka-main1001:~$ kafka topics --alter --topic eqiad.mediawiki.cirrussearch.page_rerender.v1 --partitions 5`
  • 20:57 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P54133 and previous config saved to /var/cache/conftool/dbconfig/20231204-205735-arnaudb.json
  • 20:53 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1077.eqiad.wmnet with reason: host reimage
  • 20:50 pt1979@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1077.eqiad.wmnet with reason: host reimage
  • 20:42 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P54132 and previous config saved to /var/cache/conftool/dbconfig/20231204-204228-arnaudb.json
  • 20:36 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1077.eqiad.wmnet with OS bullseye
  • 20:27 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T348183)', diff saved to https://phabricator.wikimedia.org/P54131 and previous config saved to /var/cache/conftool/dbconfig/20231204-202722-arnaudb.json
  • 19:43 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be1079.eqiad.wmnet with OS bullseye
  • 19:42 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be1076.eqiad.wmnet with OS bullseye
  • 19:42 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be1078.eqiad.wmnet with OS bullseye
  • 19:42 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be1077.eqiad.wmnet with OS bullseye
  • 19:41 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 (T348183)', diff saved to https://phabricator.wikimedia.org/P54130 and previous config saved to /var/cache/conftool/dbconfig/20231204-194103-arnaudb.json
  • 19:40 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 19:40 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 19:40 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T348183)', diff saved to https://phabricator.wikimedia.org/P54129 and previous config saved to /var/cache/conftool/dbconfig/20231204-194039-arnaudb.json
  • 19:37 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:37 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:25 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P54128 and previous config saved to /var/cache/conftool/dbconfig/20231204-192532-arnaudb.json
  • 19:21 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1076.eqiad.wmnet with OS bullseye
  • 19:21 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1078.eqiad.wmnet with OS bullseye
  • 19:21 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1079.eqiad.wmnet with OS bullseye
  • 19:20 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1077.eqiad.wmnet with OS bullseye
  • 19:10 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P54126 and previous config saved to /var/cache/conftool/dbconfig/20231204-191026-arnaudb.json
  • 19:10 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be1079.eqiad.wmnet with OS bullseye
  • 19:09 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be1078.eqiad.wmnet with OS bullseye
  • 19:08 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be1077.eqiad.wmnet with OS bullseye
  • 18:55 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T348183)', diff saved to https://phabricator.wikimedia.org/P54125 and previous config saved to /var/cache/conftool/dbconfig/20231204-185519-arnaudb.json
  • 18:52 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be1076.eqiad.wmnet with OS bullseye
  • 18:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1077.eqiad.wmnet with OS bullseye
  • 18:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1078.eqiad.wmnet with OS bullseye
  • 18:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1079.eqiad.wmnet with OS bullseye
  • 18:46 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db2148 (T348183)', diff saved to https://phabricator.wikimedia.org/P54124 and previous config saved to /var/cache/conftool/dbconfig/20231204-184630-arnaudb.json
  • 18:46 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 18:46 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 18:46 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T348183)', diff saved to https://phabricator.wikimedia.org/P54123 and previous config saved to /var/cache/conftool/dbconfig/20231204-184607-arnaudb.json
  • 18:31 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P54122 and previous config saved to /var/cache/conftool/dbconfig/20231204-183100-arnaudb.json
  • 18:15 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P54121 and previous config saved to /var/cache/conftool/dbconfig/20231204-181554-arnaudb.json
  • 18:02 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be1077.eqiad.wmnet with OS bullseye
  • 18:00 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T348183)', diff saved to https://phabricator.wikimedia.org/P54120 and previous config saved to /var/cache/conftool/dbconfig/20231204-180047-arnaudb.json
  • 17:59 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1076.eqiad.wmnet with OS bullseye
  • 17:55 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be1078.eqiad.wmnet with OS bullseye
  • 17:54 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 (T348183)', diff saved to https://phabricator.wikimedia.org/P54119 and previous config saved to /var/cache/conftool/dbconfig/20231204-175448-arnaudb.json
  • 17:54 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 17:54 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 17:54 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T348183)', diff saved to https://phabricator.wikimedia.org/P54118 and previous config saved to /var/cache/conftool/dbconfig/20231204-175426-arnaudb.json
  • 17:41 ladsgroup@deploy2002: Finished scap: Backport for Category: Stop locking thousands of rows (T352628) (duration: 08m 07s)
  • 17:39 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P54117 and previous config saved to /var/cache/conftool/dbconfig/20231204-173919-arnaudb.json
  • 17:35 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 17:34 ladsgroup@deploy2002: ladsgroup: Backport for Category: Stop locking thousands of rows (T352628) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:33 ladsgroup@deploy2002: Started scap: Backport for Category: Stop locking thousands of rows (T352628)
  • 17:24 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P54116 and previous config saved to /var/cache/conftool/dbconfig/20231204-172413-arnaudb.json
  • 17:19 jclark@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ms-be1076']
  • 17:18 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1076']
  • 17:18 jclark@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ms-be1079']
  • 17:18 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1079']
  • 17:16 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1079']
  • 17:16 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1079']
  • 17:15 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1079']
  • 17:15 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1079']
  • 17:15 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1079']
  • 17:15 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1079']
  • 17:14 jclark@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ms-be1079']
  • 17:12 jclark@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ms-be1076']
  • 17:12 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1076']
  • 17:09 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1076']
  • 17:09 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1076']
  • 17:09 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1076']
  • 17:09 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T348183)', diff saved to https://phabricator.wikimedia.org/P54115 and previous config saved to /var/cache/conftool/dbconfig/20231204-170906-arnaudb.json
  • 17:09 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1078.eqiad.wmnet with OS bullseye
  • 17:08 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1077.eqiad.wmnet with OS bullseye
  • 17:06 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T348183)', diff saved to https://phabricator.wikimedia.org/P54114 and previous config saved to /var/cache/conftool/dbconfig/20231204-170604-arnaudb.json
  • 17:05 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 17:05 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 17:05 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 17:05 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 17:05 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T348183)', diff saved to https://phabricator.wikimedia.org/P54113 and previous config saved to /var/cache/conftool/dbconfig/20231204-170525-arnaudb.json
  • 16:52 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 07m 45s)
  • 16:50 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P54112 and previous config saved to /var/cache/conftool/dbconfig/20231204-165018-arnaudb.json
  • 16:47 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 33604
  • 16:46 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 33604
  • 16:44 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 06m 40s)
  • 16:35 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P54111 and previous config saved to /var/cache/conftool/dbconfig/20231204-163511-arnaudb.json
  • 16:20 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T348183)', diff saved to https://phabricator.wikimedia.org/P54110 and previous config saved to /var/cache/conftool/dbconfig/20231204-162005-arnaudb.json
  • 16:14 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db2125 (T348183)', diff saved to https://phabricator.wikimedia.org/P54109 and previous config saved to /var/cache/conftool/dbconfig/20231204-161408-arnaudb.json
  • 16:14 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 16:13 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 16:13 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T348183)', diff saved to https://phabricator.wikimedia.org/P54108 and previous config saved to /var/cache/conftool/dbconfig/20231204-161346-arnaudb.json
  • 15:58 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P54107 and previous config saved to /var/cache/conftool/dbconfig/20231204-155840-arnaudb.json
  • 15:56 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be1076.eqiad.wmnet with OS bullseye
  • 15:48 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 15:48 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:47 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 15:47 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:46 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 15:45 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:43 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P54105 and previous config saved to /var/cache/conftool/dbconfig/20231204-154333-arnaudb.json
  • 15:28 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T348183)', diff saved to https://phabricator.wikimedia.org/P54104 and previous config saved to /var/cache/conftool/dbconfig/20231204-152826-arnaudb.json
  • 15:08 jclark@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ms-be1077']
  • 15:08 jclark@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ms-be1078']
  • 15:03 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1079']
  • 15:02 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1077']
  • 15:02 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1077']
  • 15:02 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1078']
  • 15:02 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1077']
  • 15:01 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1076']
  • 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cp4046.ulsfo.wmnet
  • 14:51 vgutierrez: upload tcp-mss-clamper 0.4 to apt.wm.o (bookworm)
  • 14:50 jclark@cumin1001: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1077
  • 14:50 jclark@cumin1001: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1077
  • 14:47 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1076.eqiad.wmnet with OS bullseye
  • 14:46 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cp4046.ulsfo.wmnet
  • 14:46 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:46 lucaswerkmeister-wmde@deploy2002: Finished scap: Backport for Create new namespaces and namespace aliases for bd.wikimedia.org (T351903) (duration: 11m 48s)
  • 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cp4038.ulsfo.wmnet
  • 14:43 sukhe: running authdns-update for CR 979976 [revert of T349665]
  • 14:40 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde and mdsshakil: Continuing with sync
  • 14:37 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cp4038.ulsfo.wmnet
  • 14:36 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde and mdsshakil: Backport for Create new namespaces and namespace aliases for bd.wikimedia.org (T351903) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:34 lucaswerkmeister-wmde@deploy2002: Started scap: Backport for Create new namespaces and namespace aliases for bd.wikimedia.org (T351903)
  • 14:33 sukhe: running authdns-update for T352579
  • 14:32 lucaswerkmeister-wmde@deploy2002: Finished scap: Backport for Enable read new for event tables migration on testwiki (T341829) (duration: 10m 42s)
  • 14:32 btullis@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons.
  • 14:27 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db2104 (T348183)', diff saved to https://phabricator.wikimedia.org/P54103 and previous config saved to /var/cache/conftool/dbconfig/20231204-142754-arnaudb.json
  • 14:27 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 14:27 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 14:25 lucaswerkmeister-wmde@deploy2002: dreamyjazz and lucaswerkmeister-wmde: Continuing with sync
  • 14:24 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 14:24 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 14:22 lucaswerkmeister-wmde@deploy2002: dreamyjazz and lucaswerkmeister-wmde: Backport for Enable read new for event tables migration on testwiki (T341829) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:21 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 14:21 lucaswerkmeister-wmde@deploy2002: Started scap: Backport for Enable read new for event tables migration on testwiki (T341829)
  • 14:21 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 14:19 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 14:18 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 14:18 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T348183)', diff saved to https://phabricator.wikimedia.org/P54102 and previous config saved to /var/cache/conftool/dbconfig/20231204-141848-arnaudb.json
  • 14:15 jforrester@deploy2002: Finished scap: Backport for wikifunctionswiki: Disable thumbnail in Vector search (T352532), wikifunctionswiki: Add ability for sysops to manage Functioneer (T352495) (duration: 07m 41s)
  • 14:10 jforrester@deploy2002: jforrester and terasail: Continuing with sync
  • 14:09 jforrester@deploy2002: jforrester and terasail: Backport for wikifunctionswiki: Disable thumbnail in Vector search (T352532), wikifunctionswiki: Add ability for sysops to manage Functioneer (T352495) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:08 jforrester@deploy2002: Started scap: Backport for wikifunctionswiki: Disable thumbnail in Vector search (T352532), wikifunctionswiki: Add ability for sysops to manage Functioneer (T352495)
  • 14:03 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P54101 and previous config saved to /var/cache/conftool/dbconfig/20231204-140341-arnaudb.json
  • 13:59 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 13:59 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 13:58 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 13:57 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 13:56 moritzm: installing postgresql-13 security updates
  • 13:52 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:52 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 13:48 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P54100 and previous config saved to /var/cache/conftool/dbconfig/20231204-134835-arnaudb.json
  • 13:43 btullis@cumin1001: START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons.
  • 13:33 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T348183)', diff saved to https://phabricator.wikimedia.org/P54099 and previous config saved to /var/cache/conftool/dbconfig/20231204-133328-arnaudb.json
  • 13:30 moritzm: installing dbus security updates on buster
  • 13:29 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1222 (T348183)', diff saved to https://phabricator.wikimedia.org/P54098 and previous config saved to /var/cache/conftool/dbconfig/20231204-132859-arnaudb.json
  • 13:28 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 13:28 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 13:28 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T348183)', diff saved to https://phabricator.wikimedia.org/P54097 and previous config saved to /var/cache/conftool/dbconfig/20231204-132836-arnaudb.json
  • 13:22 moritzm: installing libde265 security updates
  • 13:22 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:22 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:13 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P54096 and previous config saved to /var/cache/conftool/dbconfig/20231204-131329-arnaudb.json
  • 13:06 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:05 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:05 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:04 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:58 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P54095 and previous config saved to /var/cache/conftool/dbconfig/20231204-125823-arnaudb.json
  • 12:43 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T348183)', diff saved to https://phabricator.wikimedia.org/P54094 and previous config saved to /var/cache/conftool/dbconfig/20231204-124316-arnaudb.json
  • 12:40 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1197 (T348183)', diff saved to https://phabricator.wikimedia.org/P54093 and previous config saved to /var/cache/conftool/dbconfig/20231204-124037-arnaudb.json
  • 12:40 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 12:40 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 12:40 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T348183)', diff saved to https://phabricator.wikimedia.org/P54092 and previous config saved to /var/cache/conftool/dbconfig/20231204-124015-arnaudb.json
  • 12:35 urbanecm@deploy2002: Finished scap: Backport for User impact: sort datestring keys to ascending alphanumeric order (T352349 T351898) (duration: 09m 43s)
  • 12:29 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 12:28 urbanecm@deploy2002: urbanecm: Backport for User impact: sort datestring keys to ascending alphanumeric order (T352349 T351898) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host an-druid1005.eqiad.wmnet
  • 12:25 urbanecm@deploy2002: Started scap: Backport for User impact: sort datestring keys to ascending alphanumeric order (T352349 T351898)
  • 12:25 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P54091 and previous config saved to /var/cache/conftool/dbconfig/20231204-122508-arnaudb.json
  • 12:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:19 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host an-druid1005.eqiad.wmnet
  • 12:18 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1027.eqiad.wmnet with OS bookworm
  • 12:10 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P54090 and previous config saved to /var/cache/conftool/dbconfig/20231204-121002-arnaudb.json
  • 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host druid1011.eqiad.wmnet
  • 12:00 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host druid1011.eqiad.wmnet
  • 11:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbproxy1027.eqiad.wmnet with reason: host reimage
  • 11:54 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T348183)', diff saved to https://phabricator.wikimedia.org/P54089 and previous config saved to /var/cache/conftool/dbconfig/20231204-115455-arnaudb.json
  • 11:54 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2422.codfw.wmnet with OS bullseye
  • 11:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbproxy1027.eqiad.wmnet with reason: host reimage
  • 11:52 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1188 (T348183)', diff saved to https://phabricator.wikimedia.org/P54088 and previous config saved to /var/cache/conftool/dbconfig/20231204-115217-arnaudb.json
  • 11:52 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 11:51 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 11:51 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T348183)', diff saved to https://phabricator.wikimedia.org/P54087 and previous config saved to /var/cache/conftool/dbconfig/20231204-115154-arnaudb.json
  • 11:51 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1462.eqiad.wmnet with OS bullseye
  • 11:43 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 11:43 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 11:42 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 44592
  • 11:42 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 11:42 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 44592
  • 11:42 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 11:40 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 11:39 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 11:39 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1027.eqiad.wmnet with OS bookworm
  • 11:37 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P54086 and previous config saved to /var/cache/conftool/dbconfig/20231204-113648-arnaudb.json
  • 11:36 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2422.codfw.wmnet with reason: host reimage
  • 11:33 kamila@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1462.eqiad.wmnet with reason: host reimage
  • 11:32 kamila@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2422.codfw.wmnet with reason: host reimage
  • 11:30 kamila@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1462.eqiad.wmnet with reason: host reimage
  • 11:21 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P54085 and previous config saved to /var/cache/conftool/dbconfig/20231204-112141-arnaudb.json
  • 11:17 kamila@cumin1001: START - Cookbook sre.hosts.reimage for host mw1462.eqiad.wmnet with OS bullseye
  • 11:15 kamila@cumin1001: START - Cookbook sre.hosts.reimage for host mw2422.codfw.wmnet with OS bullseye
  • 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: eventschemas::service
  • 11:06 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T348183)', diff saved to https://phabricator.wikimedia.org/P54084 and previous config saved to /var/cache/conftool/dbconfig/20231204-110635-arnaudb.json
  • 11:02 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T348183)', diff saved to https://phabricator.wikimedia.org/P54083 and previous config saved to /var/cache/conftool/dbconfig/20231204-110156-arnaudb.json
  • 11:02 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 11:01 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 11:01 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T348183)', diff saved to https://phabricator.wikimedia.org/P54082 and previous config saved to /var/cache/conftool/dbconfig/20231204-110134-arnaudb.json
  • 10:54 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: eventschemas::service
  • 10:51 btullis@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:51 btullis@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add service records for the k8s-ingress-dse endpoints - btullis@cumin1001"
  • 10:50 btullis@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add service records for the k8s-ingress-dse endpoints - btullis@cumin1001"
  • 10:48 btullis@cumin1001: START - Cookbook sre.dns.netbox
  • 10:46 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P54081 and previous config saved to /var/cache/conftool/dbconfig/20231204-104628-arnaudb.json
  • 10:39 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 23856
  • 10:39 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 23856
  • 10:39 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 63927
  • 10:38 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 63927
  • 10:38 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 31898
  • 10:37 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 31898
  • 10:37 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 58952
  • 10:36 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 58952
  • 10:36 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 44592
  • 10:36 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 44592
  • 10:35 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4800
  • 10:35 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4800
  • 10:35 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 33604
  • 10:34 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 33604
  • 10:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 142505
  • 10:33 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 142505
  • 10:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 398446
  • 10:33 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 398446
  • 10:32 jayme: upgrade istio (buster -> bullseye) on wikikube codfw - T351933
  • 10:32 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 15305
  • 10:32 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 15305
  • 10:31 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 19165
  • 10:31 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P54080 and previous config saved to /var/cache/conftool/dbconfig/20231204-103121-arnaudb.json
  • 10:30 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 19165
  • 10:30 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 237
  • 10:29 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 237
  • 10:28 jayme: upgrade istio (buster -> bullseye) on wikikube eqiad - T351933
  • 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 35 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
  • 10:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 35 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
  • 10:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1022.eqiad.wmnet with OS bookworm
  • 10:17 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 138997
  • 10:17 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 138997
  • 10:16 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T348183)', diff saved to https://phabricator.wikimedia.org/P54079 and previous config saved to /var/cache/conftool/dbconfig/20231204-101615-arnaudb.json
  • 10:11 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T348183)', diff saved to https://phabricator.wikimedia.org/P54078 and previous config saved to /var/cache/conftool/dbconfig/20231204-101143-arnaudb.json
  • 10:11 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 10:11 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 10:11 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T348183)', diff saved to https://phabricator.wikimedia.org/P54077 and previous config saved to /var/cache/conftool/dbconfig/20231204-101120-arnaudb.json
  • 10:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbproxy1022.eqiad.wmnet with reason: host reimage
  • 09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbproxy1022.eqiad.wmnet with reason: host reimage
  • 09:58 volans@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1022.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 09:57 godog: roll-restart prometheus/k8s to apply size-based retention - T351179
  • 09:56 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P54076 and previous config saved to /var/cache/conftool/dbconfig/20231204-095614-arnaudb.json
  • 09:49 volans@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1022.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 09:41 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P54075 and previous config saved to /var/cache/conftool/dbconfig/20231204-094107-arnaudb.json
  • 09:36 elukey: upgrade istio (buster -> bullseye) on ml-serve-codfw - T351933
  • 09:26 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T348183)', diff saved to https://phabricator.wikimedia.org/P54074 and previous config saved to /var/cache/conftool/dbconfig/20231204-092600-arnaudb.json
  • 09:21 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T348183)', diff saved to https://phabricator.wikimedia.org/P54073 and previous config saved to /var/cache/conftool/dbconfig/20231204-092136-arnaudb.json
  • 09:21 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 09:21 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 09:21 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 09:20 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 09:20 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T348183)', diff saved to https://phabricator.wikimedia.org/P54072 and previous config saved to /var/cache/conftool/dbconfig/20231204-092054-arnaudb.json
  • 09:05 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P54070 and previous config saved to /var/cache/conftool/dbconfig/20231204-090547-arnaudb.json
  • 08:58 elukey: upgrade istio (buster -> bullseye) on ml-serve-eqiad - T351933
  • 08:50 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P54069 and previous config saved to /var/cache/conftool/dbconfig/20231204-085041-arnaudb.json
  • 08:50 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bookworm
  • 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM moscovium.eqiad.wmnet
  • 08:48 elukey: upgrade istio (buster -> bullseye) on aux-k8s-eqiad - T351933
  • 08:45 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bookworm
  • 08:43 elukey: upgrade istio (buster -> bullseye) on dse-k8s-eqiad - T351933
  • 08:39 urbanecm@deploy2002: Finished scap: Backport for hewikivoyage: add tagline (T351981), azwiki: Enable $wgMinervaEnableSiteNotice (T352621), trwikivoyage: update wordmark (T352329) (duration: 09m 49s)
  • 08:35 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T348183)', diff saved to https://phabricator.wikimedia.org/P54068 and previous config saved to /var/cache/conftool/dbconfig/20231204-083534-arnaudb.json
  • 08:33 urbanecm@deploy2002: urbanecm and anzx: Continuing with sync
  • 08:31 urbanecm@deploy2002: urbanecm and anzx: Backport for hewikivoyage: add tagline (T351981), azwiki: Enable $wgMinervaEnableSiteNotice (T352621), trwikivoyage: update wordmark (T352329) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:31 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T348183)', diff saved to https://phabricator.wikimedia.org/P54067 and previous config saved to /var/cache/conftool/dbconfig/20231204-083102-arnaudb.json
  • 08:30 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 08:30 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 08:29 urbanecm@deploy2002: Started scap: Backport for hewikivoyage: add tagline (T351981), azwiki: Enable $wgMinervaEnableSiteNotice (T352621), trwikivoyage: update wordmark (T352329)
  • 08:28 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 08:28 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 08:28 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T348183)', diff saved to https://phabricator.wikimedia.org/P54066 and previous config saved to /var/cache/conftool/dbconfig/20231204-082758-arnaudb.json
  • 08:25 oblivian@deploy2002: Finished scap: Backport for Add throttle rule for editathon (T352569) (duration: 18m 04s)
  • 08:24 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM moscovium.eqiad.wmnet
  • 08:23 _joe_: clearing throttle cache for T352569
  • 08:18 oblivian@deploy2002: oblivian: Continuing with sync
  • 08:17 oblivian@deploy2002: oblivian: Backport for Add throttle rule for editathon (T352569) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:12 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P54065 and previous config saved to /var/cache/conftool/dbconfig/20231204-081251-arnaudb.json
  • 08:11 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bookworm
  • 08:10 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bookworm
  • 08:07 oblivian@deploy2002: Started scap: Backport for Add throttle rule for editathon (T352569)
  • 07:57 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P54064 and previous config saved to /var/cache/conftool/dbconfig/20231204-075745-arnaudb.json
  • 07:54 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bookworm
  • 07:42 arnaudb@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T348183)', diff saved to https://phabricator.wikimedia.org/P54063 and previous config saved to /var/cache/conftool/dbconfig/20231204-074238-arnaudb.json
  • 07:39 arnaudb@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T348183)', diff saved to https://phabricator.wikimedia.org/P54062 and previous config saved to /var/cache/conftool/dbconfig/20231204-073957-arnaudb.json
  • 07:39 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 07:39 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1176.eqiad.wmnet with OS bookworm
  • 07:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
  • 07:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
  • 07:07 kart_: Updated MinT to 2023-11-21-115852-production
  • 07:03 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1176.eqiad.wmnet with OS bookworm
  • 06:57 marostegui: Failover m5 from db1176 to db1119 - T332155
  • 06:49 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
  • 06:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2135,2160].codfw.wmnet,db[1119,1176,1217].eqiad.wmnet with reason: m5 master switch T352505
  • 06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2135,2160].codfw.wmnet,db[1119,1176,1217].eqiad.wmnet with reason: m5 master switch T352505
  • 06:44 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
  • 06:33 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
  • 06:28 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
  • 06:14 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 06:11 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 06:08 kart_: Updated cxserver to 2023-12-04-055024-production (T270060, T350773, T352620)
  • 06:06 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 06:05 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 06:03 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 06:02 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:59 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:58 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 04:43 ryankemper: [WDQS] Clearing `BlazegraphFreeAllocatorsDecreasingRapidly` -> `ryankemper@wdqs1007:~$ sudo systemctl restart wdqs-blazegraph`
  • 00:16 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
  • 00:09 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet

2023-12-02

  • 01:51 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1078.eqiad.wmnet with OS bullseye
  • 01:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1079.eqiad.wmnet with OS bullseye
  • 01:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1077.eqiad.wmnet with OS bullseye
  • 01:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1076.eqiad.wmnet with OS bullseye
  • 00:30 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1078.eqiad.wmnet with OS bullseye
  • 00:30 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1079.eqiad.wmnet with OS bullseye
  • 00:30 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1077.eqiad.wmnet with OS bullseye
  • 00:30 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1076.eqiad.wmnet with OS bullseye
  • 00:15 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1076']
  • 00:15 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1076']
  • 00:14 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1076']
  • 00:14 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1076']
  • 00:14 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1076']
  • 00:14 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1076']
  • 00:14 jclark@cumin1001: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['ms-be1076']
  • 00:14 jclark@cumin1001: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['ms-be1076']
  • 00:14 jclark@cumin1001: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['ms-be1076']
  • 00:14 jclark@cumin1001: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['ms-be1076']
  • 00:14 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1076']
  • 00:14 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1076']
  • 00:13 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1076']
  • 00:13 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1076']
  • 00:13 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1079.mgmt.eqiad.wmnet with reboot policy FORCED
  • 00:13 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1078.mgmt.eqiad.wmnet with reboot policy FORCED
  • 00:13 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1077.mgmt.eqiad.wmnet with reboot policy FORCED
  • 00:12 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1076.mgmt.eqiad.wmnet with reboot policy FORCED

2023-12-01

  • 22:17 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1079.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:17 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1078.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:17 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1077.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:17 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1076.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:17 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1078.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:16 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1077.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:15 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1076.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:15 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1079.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:15 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1079.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:15 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1078.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:15 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1077.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:15 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1076.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:14 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:14 jclark@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be - jclark@cumin1001"
  • 22:13 jclark@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt ms-be - jclark@cumin1001"
  • 22:11 jclark@cumin1001: START - Cookbook sre.dns.netbox
  • 22:10 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1078.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:10 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1079.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:09 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1077.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:09 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1076.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:45 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1079.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:45 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1078.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:45 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1077.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:45 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1076.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:31 cstone: payments-wiki upgraded from b37ab50e to 5284fc99
  • 19:35 inflatador: bking@wdqs1006 rebooting unresponsive host
  • 18:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ceph2001.codfw.wmnet with OS bullseye
  • 17:47 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ceph2001.codfw.wmnet with OS bullseye
  • 16:59 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ceph2001.codfw.wmnet with OS bullseye
  • 16:39 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1005.eqiad.wmnet with OS bookworm
  • 16:26 dancy@deploy2002: Installation of scap version "4.65.0" completed for 537 hosts
  • 16:26 dancy@deploy2002: Installing scap version "4.65.0" for 537 hosts
  • 16:25 dancy@deploy2002: install-world aborted: (duration: 00m 50s)
  • 16:24 dancy@deploy2002: Installing scap version "4.65.0" for 569 hosts
  • 16:24 fnegri@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt1046.eqiad.wmnet
  • 16:10 fnegri@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt1046.eqiad.wmnet
  • 16:07 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1005.eqiad.wmnet with reason: host reimage
  • 16:04 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1005.eqiad.wmnet with reason: host reimage
  • 16:01 akosiaris@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:01 akosiaris@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Give AAAA and PTR records to scandium - akosiaris@cumin1001"
  • 16:00 akosiaris@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Give AAAA and PTR records to scandium - akosiaris@cumin1001"
  • 15:58 akosiaris@cumin1001: START - Cookbook sre.dns.netbox
  • 15:58 akosiaris: give AAAA and PTR records to scandium T271142
  • 15:57 akosiaris: give AAAA and PTR records to all rdb hosts (only 50% had it previously)
  • 15:56 dancy@deploy2002: Installing scap version "4.65.0" for 570 hosts
  • 15:55 akosiaris@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:55 akosiaris@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA records to the rest of the 50% of rdb hosts - akosiaris@cumin1001"
  • 15:54 akosiaris@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA records to the rest of the 50% of rdb hosts - akosiaris@cumin1001"
  • 15:52 akosiaris@cumin1001: START - Cookbook sre.dns.netbox
  • 15:51 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb[1009-1010].eqiad.wmnet
  • 15:51 akosiaris@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:51 akosiaris@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb[1009-1010].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - akosiaris@cumin1001"
  • 15:50 akosiaris@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb[1009-1010].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - akosiaris@cumin1001"
  • 15:48 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1005.eqiad.wmnet with OS bookworm
  • 15:45 akosiaris@cumin1001: START - Cookbook sre.dns.netbox
  • 15:42 urbanecm: mwmaint2002: mwscript extensions/Flow/maintenance/FlowFixInconsistentBoards.php --wiki=frwiki # T352550
  • 15:38 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 15:38 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:36 akosiaris@deploy2002: Synchronized wmf-config/ProductionServices.php: (no justification provided) (duration: 07m 24s)
  • 15:31 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 15:31 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:28 moritzm: added Kamila to pwstore
  • 15:21 akosiaris@cumin1001: START - Cookbook sre.hosts.decommission for hosts rdb[1009-1010].eqiad.wmnet
  • 15:19 topranks: moving esams CR interconnect to 4x10G breakout cable T347403
  • 14:27 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 14:27 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 14:27 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:27 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:27 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
  • 14:27 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop: apply
  • 14:26 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 14:26 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 14:26 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:26 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:26 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 14:26 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 14:26 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 14:26 akosiaris: cleanup rdb1009 from all deployment charts
  • 14:26 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 14:26 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:26 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:25 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 14:25 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 14:20 hashar@deploy2002: Finished deploy [integration/docroot@88f69cc]: doc: link to the Gearman Java library (duration: 00m 05s)
  • 14:20 hashar@deploy2002: Started deploy [integration/docroot@88f69cc]: doc: link to the Gearman Java library
  • 14:18 hashar@deploy2002: Finished deploy [integration/docroot@1c2de6b]: doc: link to Discovery parent pom (duration: 00m 06s)
  • 14:18 hashar@deploy2002: Started deploy [integration/docroot@1c2de6b]: doc: link to Discovery parent pom
  • 14:09 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:08 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:05 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:05 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:03 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:03 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:48 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 13:48 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 13:32 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 13:31 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 13:30 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 13:30 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 13:28 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
  • 13:28 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop: apply
  • 13:27 taavi: run prometheus provision-fs on prometheus2* to create file system for cloud instance T350010
  • 13:13 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 13:13 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 12:39 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 12:39 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts flerovium.eqiad.wmnet
  • 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: flerovium.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:36 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: flerovium.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:34 fnegri@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1046.eqiad.wmnet with OS bookworm
  • 12:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 12:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts flerovium.eqiad.wmnet
  • 12:17 XioNoX: add BGP custom field to Netbox - T306649
  • 12:07 fnegri@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1046.eqiad.wmnet with reason: host reimage
  • 12:03 fnegri@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1046.eqiad.wmnet with reason: host reimage
  • 12:03 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Jbond out of all services on: 2211 hosts
  • 12:02 root@cumin2002: START - Cookbook sre.idm.logout Logging Jbond out of all services on: 2211 hosts
  • 11:49 fnegri@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1046.eqiad.wmnet with OS bookworm
  • 11:30 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on cr[1-2]-codfw,cr[1-2]-codfw IPv6 with reason: resetting line card
  • 11:30 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:20:00 on cr[1-2]-codfw,cr[1-2]-codfw IPv6 with reason: resetting line card
  • 11:29 topranks: Reset card 1/0 in cr1-codfw T350159
  • 11:22 topranks: Disabling BGP peering to AS1299 prior to reset of card 1/0 in cr1-codfw T350159
  • 11:09 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Jbond out of all services on: 2 hosts
  • 11:09 root@cumin2002: START - Cookbook sre.idm.logout Logging Jbond out of all services on: 2 hosts
  • 11:04 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Jbond out of all services on: 2 hosts
  • 11:04 root@cumin2002: START - Cookbook sre.idm.logout Logging Jbond out of all services on: 2 hosts
  • 11:00 topranks: Draining cr1-codfw transport to cr3-eqsin to reset card 1/0 T350159
  • 10:59 topranks: Resetting circuit preference for transports landing on card 1/1 cr1-codfw T350159
  • 10:55 jelto@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 10:49 moritzm: installing wireshark security updates on bookworm
  • 10:37 topranks: Moving VRRP active gateway for codfw row A/B vlans from cr1-codfw to cr2-codfw to reconfigure card 1/1 T350159
  • 10:35 topranks: draining codfw<->eqiad transport link to reconfigure card 1/1 in cr1-codfw T350159
  • 10:34 topranks: draining codfw<->eqdfw transport link to reconfigure card 1/1 in cr1-codfw T350159
  • 10:30 akosiaris@deploy2002: Synchronized wmf-config/ProductionServices.php: (no justification provided) (duration: 07m 12s)
  • 10:08 godog: add 60GB to prometheus/k8s in codfw
  • 09:51 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Jbond out of all services on: 2 hosts
  • 09:51 root@cumin2002: START - Cookbook sre.idm.logout Logging Jbond out of all services on: 2 hosts
  • 09:45 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Jbond out of all services on: 2211 hosts
  • 09:44 root@cumin2002: START - Cookbook sre.idm.logout Logging Jbond out of all services on: 2211 hosts
  • 09:20 jelto@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 09:05 jelto@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 08:59 jelto@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 08:57 jelto@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 08:50 jelto@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 07:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1026.eqiad.wmnet with OS bookworm
  • 07:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbproxy1026.eqiad.wmnet with reason: host reimage
  • 07:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbproxy1026.eqiad.wmnet with reason: host reimage
  • 07:12 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1026.eqiad.wmnet with OS bookworm
  • 06:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2135.codfw.wmnet with OS bookworm
  • 06:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2135.codfw.wmnet with reason: host reimage
  • 06:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2135.codfw.wmnet with reason: host reimage
  • 05:56 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2135.codfw.wmnet with OS bookworm
  • 05:37 marostegui: Failover m3 from db1119 to db1159 - T352360
  • 05:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2134,2160].codfw.wmnet,db[1119,1159,1217].eqiad.wmnet with reason: m3 master switchover T352149
  • 05:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2134,2160].codfw.wmnet,db[1119,1159,1217].eqiad.wmnet with reason: m3 master switchover T352149
  • 02:31 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2109.codfw.wmnet with OS bookworm
  • 02:31 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 02:28 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 02:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2107.codfw.wmnet with OS bookworm
  • 02:27 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 02:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2108.codfw.wmnet with OS bookworm
  • 02:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 02:24 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 02:18 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 02:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2106.codfw.wmnet with OS bookworm
  • 02:17 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 02:16 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 02:16 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2105.codfw.wmnet with OS bookworm
  • 02:16 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 02:11 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 02:10 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2109.codfw.wmnet with reason: host reimage
  • 02:07 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2109.codfw.wmnet with reason: host reimage
  • 02:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2108.codfw.wmnet with reason: host reimage
  • 02:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2107.codfw.wmnet with reason: host reimage
  • 02:01 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2108.codfw.wmnet with reason: host reimage
  • 01:58 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2107.codfw.wmnet with reason: host reimage
  • 01:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2106.codfw.wmnet with reason: host reimage
  • 01:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ceph2003']
  • 01:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ceph2001']
  • 01:54 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2106.codfw.wmnet with reason: host reimage
  • 01:54 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2105.codfw.wmnet with reason: host reimage
  • 01:51 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2105.codfw.wmnet with reason: host reimage
  • 01:49 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2109.codfw.wmnet with OS bookworm
  • 01:43 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2108.codfw.wmnet with OS bookworm
  • 01:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ceph2002']
  • 01:40 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2104.codfw.wmnet with OS bookworm
  • 01:40 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 01:40 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ceph2002']
  • 01:40 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2107.codfw.wmnet with OS bookworm
  • 01:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ceph2002']
  • 01:40 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ceph2003']
  • 01:40 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ceph2002']
  • 01:39 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ceph2001']
  • 01:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ceph2003']
  • 01:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ceph2002']
  • 01:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ceph2001']
  • 01:39 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ceph2002']
  • 01:39 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ceph2001']
  • 01:39 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ceph2003']
  • 01:38 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 01:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ceph2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:36 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2106.codfw.wmnet with OS bookworm
  • 01:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ceph2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ceph2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:34 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2103.codfw.wmnet with OS bookworm
  • 01:34 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 01:32 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2105.codfw.wmnet with OS bookworm
  • 01:32 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 01:31 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2102.codfw.wmnet with OS bookworm
  • 01:31 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 01:30 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 01:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2100.codfw.wmnet with OS bookworm
  • 01:29 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 01:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2101.codfw.wmnet with OS bookworm
  • 01:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 01:28 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 01:24 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ceph2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:24 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ceph2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:24 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ceph2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:22 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2104.codfw.wmnet with reason: host reimage
  • 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ceph2001-3 to codfw - jhancock@cumin2002"
  • 01:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ceph2001-3 to codfw - jhancock@cumin2002"
  • 01:18 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2104.codfw.wmnet with reason: host reimage
  • 01:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 01:14 foks: removing 120 files for legal compliance
  • 01:11 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2103.codfw.wmnet with reason: host reimage
  • 01:09 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2100.codfw.wmnet with reason: host reimage
  • 01:07 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2102.codfw.wmnet with reason: host reimage
  • 01:06 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2100.codfw.wmnet with reason: host reimage
  • 01:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2101.codfw.wmnet with reason: host reimage
  • 01:02 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2101.codfw.wmnet with reason: host reimage
  • 00:59 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2104.codfw.wmnet with OS bookworm
  • 00:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2103.codfw.wmnet with OS bookworm
  • 00:49 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2102.codfw.wmnet with OS bookworm
  • 00:44 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2101.codfw.wmnet with OS bookworm
  • 00:40 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2100.codfw.wmnet with OS bookworm
  • 00:39 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2098.codfw.wmnet with OS bookworm
  • 00:39 pt1979@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 00:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2099.codfw.wmnet with OS bookworm
  • 00:38 pt1979@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 00:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2097.codfw.wmnet with OS bookworm
  • 00:38 pt1979@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 00:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2094.codfw.wmnet with OS bookworm
  • 00:38 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 00:35 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 00:25 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 00:23 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1107.eqiad.wmnet with OS bookworm
  • 00:22 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1107.eqiad.wmnet with OS bookworm
  • 00:19 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 00:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2099.codfw.wmnet with reason: host reimage
  • 00:14 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 00:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2099.codfw.wmnet with reason: host reimage
  • 00:09 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1105.eqiad.wmnet with OS bookworm
  • 00:09 jclark@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
  • 00:08 jclark@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
  • 00:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2098.codfw.wmnet with reason: host reimage
  • 00:05 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1107.eqiad.wmnet with OS bookworm
  • 00:05 jclark@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
  • 00:03 jclark@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
  • 00:02 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2098.codfw.wmnet with reason: host reimage
  • 00:01 krinkle@deploy2002: Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 06m 37s)
  • 00:00 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2094.codfw.wmnet with reason: host reimage

Archives

See Server Admin Log/Archives.