Server Admin Log/Archive 77

From Wikitech

2024-03-31

  • 21:05 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 21:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 21:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T352010)', diff saved to https://phabricator.wikimedia.org/P59040 and previous config saved to /var/cache/conftool/dbconfig/20240331-210533-ladsgroup.json
  • 20:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P59039 and previous config saved to /var/cache/conftool/dbconfig/20240331-205025-ladsgroup.json
  • 20:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P59038 and previous config saved to /var/cache/conftool/dbconfig/20240331-203518-ladsgroup.json
  • 20:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T352010)', diff saved to https://phabricator.wikimedia.org/P59037 and previous config saved to /var/cache/conftool/dbconfig/20240331-202010-ladsgroup.json
  • 12:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T352010)', diff saved to https://phabricator.wikimedia.org/P59036 and previous config saved to /var/cache/conftool/dbconfig/20240331-121815-ladsgroup.json
  • 12:18 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 12:17 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 12:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T352010)', diff saved to https://phabricator.wikimedia.org/P59035 and previous config saved to /var/cache/conftool/dbconfig/20240331-121751-ladsgroup.json
  • 12:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P59034 and previous config saved to /var/cache/conftool/dbconfig/20240331-120243-ladsgroup.json
  • 11:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P59033 and previous config saved to /var/cache/conftool/dbconfig/20240331-114736-ladsgroup.json
  • 11:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T352010)', diff saved to https://phabricator.wikimedia.org/P59032 and previous config saved to /var/cache/conftool/dbconfig/20240331-113228-ladsgroup.json
  • 00:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T352010)', diff saved to https://phabricator.wikimedia.org/P59031 and previous config saved to /var/cache/conftool/dbconfig/20240331-000134-ladsgroup.json
  • 00:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 00:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 00:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T352010)', diff saved to https://phabricator.wikimedia.org/P59030 and previous config saved to /var/cache/conftool/dbconfig/20240331-000112-ladsgroup.json
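
The entries above appear to share one shape: a bullet, an HH:MM timestamp, an author (optionally followed by `@host`), a colon, and a free-text message. The sketch below is one possible way to split an entry into those parts. The names `SalEntry`, `ENTRY_RE` and `parse_entry` are hypothetical, and the pattern is inferred only from the entries in this archive, not from any formal specification of the log.

```python
import re
from typing import NamedTuple, Optional


class SalEntry(NamedTuple):
    """One log entry, split into its visible parts (hypothetical structure)."""
    time: str
    user: str
    host: Optional[str]
    message: str


# Assumed layout: "  • HH:MM user[@host]: message"
ENTRY_RE = re.compile(
    r"^\s*•\s+(?P<time>\d{2}:\d{2})\s+"
    r"(?P<user>[^\s@:]+)(?:@(?P<host>[^\s:]+))?:\s+"
    r"(?P<message>.*)$"
)


def parse_entry(line: str) -> Optional[SalEntry]:
    """Return the parsed entry, or None for date headings and blank lines."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    return SalEntry(m["time"], m["user"], m["host"], m["message"])
```

Entries written by hand (for example those by TimStarling or tzatziki) have no `@host` part, so `host` comes back as `None` for them.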

2024-03-30

  • 23:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P59029 and previous config saved to /var/cache/conftool/dbconfig/20240330-234604-ladsgroup.json
  • 23:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P59028 and previous config saved to /var/cache/conftool/dbconfig/20240330-233056-ladsgroup.json
  • 23:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T352010)', diff saved to https://phabricator.wikimedia.org/P59027 and previous config saved to /var/cache/conftool/dbconfig/20240330-231549-ladsgroup.json
  • 13:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T352010)', diff saved to https://phabricator.wikimedia.org/P59026 and previous config saved to /var/cache/conftool/dbconfig/20240330-133129-ladsgroup.json
  • 13:31 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 13:31 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 13:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T352010)', diff saved to https://phabricator.wikimedia.org/P59025 and previous config saved to /var/cache/conftool/dbconfig/20240330-133116-ladsgroup.json
  • 13:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P59024 and previous config saved to /var/cache/conftool/dbconfig/20240330-131609-ladsgroup.json
  • 13:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P59023 and previous config saved to /var/cache/conftool/dbconfig/20240330-130102-ladsgroup.json
  • 12:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T352010)', diff saved to https://phabricator.wikimedia.org/P59022 and previous config saved to /var/cache/conftool/dbconfig/20240330-124554-ladsgroup.json
  • 05:19 TimStarling: on releases1003 uploaded mediawiki 1.19.0 - 1.19.8, 1.20.0 - 1.20.6 T190369
  • 02:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T352010)', diff saved to https://phabricator.wikimedia.org/P59021 and previous config saved to /var/cache/conftool/dbconfig/20240330-022801-ladsgroup.json
  • 02:27 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 02:27 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 02:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T352010)', diff saved to https://phabricator.wikimedia.org/P59020 and previous config saved to /var/cache/conftool/dbconfig/20240330-022738-ladsgroup.json
  • 02:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P59019 and previous config saved to /var/cache/conftool/dbconfig/20240330-021231-ladsgroup.json
  • 01:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P59018 and previous config saved to /var/cache/conftool/dbconfig/20240330-015723-ladsgroup.json
  • 01:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T352010)', diff saved to https://phabricator.wikimedia.org/P59017 and previous config saved to /var/cache/conftool/dbconfig/20240330-014215-ladsgroup.json
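
The `dbctl commit` messages above carry four pieces of information: the datacenter scope, a quoted summary, a link to the saved diff, and the path of the previous configuration. A minimal sketch for pulling these out of a message follows. `DBCTL_RE`, `TASK_RE` and `parse_dbctl_commit` are hypothetical names, and the pattern is an assumption based only on the wording seen in this log. It does not call or describe the dbctl tool itself.

```python
import re
from typing import Dict, Optional

# Assumed wording, as it appears in the entries of this log.
DBCTL_RE = re.compile(
    r"dbctl commit \(dc=(?P<dc>[^)]+)\): '(?P<summary>.*)', "
    r"diff saved to (?P<diff_url>\S+) "
    r"and previous config saved to (?P<backup>\S+)"
)

# Task references such as T352010 inside the summary.
TASK_RE = re.compile(r"\bT\d+\b")


def parse_dbctl_commit(message: str) -> Optional[Dict[str, object]]:
    """Extract scope, summary, diff URL, backup path and task IDs, or None."""
    m = DBCTL_RE.search(message)
    if m is None:
        return None
    fields: Dict[str, object] = dict(m.groupdict())
    fields["tasks"] = TASK_RE.findall(m["summary"])
    return fields
```

Summaries without a task reference, such as the intermediate repooling steps, yield an empty `tasks` list.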

2024-03-29

  • 22:24 amastilovic@deploy1002: Finished deploy [airflow-dags/analytics@67eaa50]: (no justification provided) (duration: 00m 29s)
  • 22:23 amastilovic@deploy1002: Started deploy [airflow-dags/analytics@67eaa50]: (no justification provided)
  • 22:12 tzatziki: removing 1 file for legal compliance
  • 21:56 tzatziki: removing 4 files for legal compliance
  • 20:30 tzatziki: removing 4 files for legal compliance
  • 20:00 tzatziki: removing 1 file for legal compliance
  • 19:53 tzatziki: removing 1 file for legal compliance
  • 16:51 amastilovic@deploy1002: Finished deploy [airflow-dags/analytics@e6892f4]: (no justification provided) (duration: 00m 26s)
  • 16:51 amastilovic@deploy1002: Started deploy [airflow-dags/analytics@e6892f4]: (no justification provided)
  • 16:14 andrewbogott: updated wikitech-static to 1.41.1
  • 16:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T352010)', diff saved to https://phabricator.wikimedia.org/P59016 and previous config saved to /var/cache/conftool/dbconfig/20240329-160707-ladsgroup.json
  • 16:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 16:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 16:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T352010)', diff saved to https://phabricator.wikimedia.org/P59015 and previous config saved to /var/cache/conftool/dbconfig/20240329-160644-ladsgroup.json
  • 15:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P59014 and previous config saved to /var/cache/conftool/dbconfig/20240329-155137-ladsgroup.json
  • 15:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P59013 and previous config saved to /var/cache/conftool/dbconfig/20240329-153629-ladsgroup.json
  • 15:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T352010)', diff saved to https://phabricator.wikimedia.org/P59012 and previous config saved to /var/cache/conftool/dbconfig/20240329-152122-ladsgroup.json
  • 14:00 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1027.eqiad.wmnet with reason: Decommissioning — T354561
  • 14:00 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1027.eqiad.wmnet with reason: Decommissioning — T354561
  • 13:32 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw
  • 13:32 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw
  • 09:30 filippo@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host alert1001.wikimedia.org
  • 09:24 filippo@cumin2002: START - Cookbook sre.puppet.migrate-host for host alert1001.wikimedia.org
  • 09:22 filippo@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host alert1001.wikimedia.org
  • 09:21 filippo@cumin2002: START - Cookbook sre.puppet.migrate-host for host alert1001.wikimedia.org
  • 09:18 filippo@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host alert1001.wikimedia.org
  • 09:18 filippo@cumin2002: START - Cookbook sre.puppet.migrate-host for host alert1001.wikimedia.org
  • 08:36 dcausse: repooling wdqs1013 (T360993)
  • 05:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T352010)', diff saved to https://phabricator.wikimedia.org/P59009 and previous config saved to /var/cache/conftool/dbconfig/20240329-054724-ladsgroup.json
  • 05:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 05:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
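
Cookbook runs in the entries above are logged as a `START` line followed by an `END (PASS)` or `END (FAIL)` line with an exit code. For instance, `sre.puppet.migrate-host` on alert1001 failed twice before passing. As a rough sketch, the outcomes can be tallied per cookbook by matching only the `END` lines. `END_RE` and `tally_cookbook_results` are hypothetical names, and the pattern assumes the wording used in this log.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Assumed wording: "END (PASS) - Cookbook <name> (exit_code=<n>) ..."
END_RE = re.compile(
    r"END \((?P<status>PASS|FAIL)\) - Cookbook (?P<cookbook>\S+) "
    r"\(exit_code=(?P<code>\d+)\)"
)


def tally_cookbook_results(messages: Iterable[str]) -> Counter:
    """Count (cookbook, status) pairs; START lines and other text are ignored."""
    counts: Counter = Counter()
    for msg in messages:
        m = END_RE.search(msg)
        if m:
            key: Tuple[str, str] = (m["cookbook"], m["status"])
            counts[key] += 1
    return counts
```

Matching with `search` rather than `match` lets the function accept either the bare message or the full log line including its timestamp and author.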

2024-03-28

  • 23:57 ejegg: donorwiki upgraded from c7f1325c to 5e39bdc5
  • 23:56 ejegg: payments-wiki upgraded from cca87e29 to 5e39bdc5
  • 21:25 tgr@deploy1002: Finished scap: Backport for Enter deprecation trial for third-party cookie blocking (T359957), Add CommunityConfiguration log channel (T361072) (duration: 19m 30s)
  • 21:14 tgr@deploy1002: urbanecm and tgr: Continuing with sync
  • 21:08 tgr@deploy1002: urbanecm and tgr: Backport for Enter deprecation trial for third-party cookie blocking (T359957), Add CommunityConfiguration log channel (T361072) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:06 tgr@deploy1002: Started scap: Backport for Enter deprecation trial for third-party cookie blocking (T359957), Add CommunityConfiguration log channel (T361072)
  • 20:59 inflatador: bking@mwmaint1002 sudo apt-get install ripgrep (faster recursive grep)
  • 20:58 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.24 refs T360156
  • 20:52 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host dbprov1005.eqiad.wmnet with OS bullseye
  • 20:22 hashar@deploy1002: Finished deploy [integration/docroot@c89a404]: add CodeMirror to opensource.yaml - T359986 (duration: 00m 06s)
  • 20:22 hashar@deploy1002: Started deploy [integration/docroot@c89a404]: add CodeMirror to opensource.yaml - T359986
  • 20:18 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.42.0-wmf.24 refs T360156 (duration: 12m 33s)
  • 20:07 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic2090* for ban elastic2090 before reimage - ryankemper@cumin2002 - T353878
  • 20:07 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic2090* for ban elastic2090 before reimage - ryankemper@cumin2002 - T353878
  • 20:06 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw
  • 20:06 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw
  • 20:06 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.24 refs T360156
  • 19:51 pfischer@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:51 pfischer@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:48 ryankemper: T353878 Updated cross cluster remote seed conf with latest master info: `ryankemper@mwmaint1002:~/elastic$ python push_cross_cluster_conf.py https://search.svc.codfw.wmnet:9443/_cluster/settings --ccc chi=chi_codfw_masters.lst psi=psi_codfw_masters.lst omega=omega_codfw_masters.lst`
  • 19:45 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.24 refs T360156
  • 19:36 pfischer@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:35 pfischer@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:34 damilare: civicrm upgraded from 2e0ac12f to ed776060
  • 19:28 jhuneidi@deploy1002: Finished scap: Backport for objectcache: Restore default keyspace for LocalServerCache service (T358346 T361177) (duration: 34m 00s)
  • 19:16 jhuneidi@deploy1002: tgr and jhuneidi: Continuing with sync
  • 19:14 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov1005.eqiad.wmnet with OS bullseye
  • 19:04 mutante: CI (contint) - replacing envoy SSL cert (puppet CA -> cfssl)
  • 18:58 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host dbprov1005.eqiad.wmnet with OS bullseye
  • 18:56 jhuneidi@deploy1002: tgr and jhuneidi: Backport for objectcache: Restore default keyspace for LocalServerCache service (T358346 T361177) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:54 jhuneidi@deploy1002: Started scap: Backport for objectcache: Restore default keyspace for LocalServerCache service (T358346 T361177)
  • 18:52 pfischer@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:52 pfischer@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:51 pfischer@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:51 pfischer@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:48 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov1005.eqiad.wmnet with OS bullseye
  • 18:25 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host dbprov1005.eqiad.wmnet with OS bullseye
  • 18:11 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 18:11 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 17:39 joal@deploy1002: Finished deploy [airflow-dags/analytics@f64680f]: Regular deploy of Analytics airflow dags [airflow-dags/analytics@f64680fc] (duration: 00m 27s)
  • 17:39 joal@deploy1002: Started deploy [airflow-dags/analytics@f64680f]: Regular deploy of Analytics airflow dags [airflow-dags/analytics@f64680fc]
  • 17:27 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov1005.eqiad.wmnet with OS bullseye
  • 17:27 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic2037.codfw.wmnet
  • 17:27 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:27 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: elastic2037.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
  • 17:26 btullis@deploy1002: Finished deploy [analytics/refinery@9c2ca38] (hadoop-test): Analytics refinery deploy to test git-lfs TEST [analytics/refinery@9c2ca387] (duration: 02m 20s)
  • 17:23 btullis@deploy1002: Started deploy [analytics/refinery@9c2ca38] (hadoop-test): Analytics refinery deploy to test git-lfs TEST [analytics/refinery@9c2ca387]
  • 17:23 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: elastic2037.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
  • 17:22 btullis@deploy1002: Finished deploy [analytics/refinery@9c2ca38] (thin): Analytics refinery deploy to test git-lfs THIN [analytics/refinery@9c2ca387] (duration: 03m 26s)
  • 17:21 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
  • 17:19 btullis@deploy1002: Started deploy [analytics/refinery@9c2ca38] (thin): Analytics refinery deploy to test git-lfs THIN [analytics/refinery@9c2ca387]
  • 17:17 btullis@deploy1002: Finished deploy [analytics/refinery@9c2ca38]: Analytics refinery deploy to test git-lfs [analytics/refinery@9c2ca387] (duration: 00m 19s)
  • 17:17 btullis@deploy1002: Started deploy [analytics/refinery@9c2ca38]: Analytics refinery deploy to test git-lfs [analytics/refinery@9c2ca387]
  • 17:16 ryankemper@cumin2002: START - Cookbook sre.hosts.decommission for hosts elastic2037.codfw.wmnet
  • 17:15 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic[2052-2054].codfw.wmnet
  • 17:15 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:14 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
  • 17:09 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host dbprov1005.eqiad.wmnet with OS bullseye
  • 17:06 cwhite@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash1012.eqiad.wmnet
  • 17:06 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:06 cwhite@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 17:05 cwhite@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 17:03 cwhite@cumin2002: START - Cookbook sre.dns.netbox
  • 16:56 cwhite@cumin2002: START - Cookbook sre.hosts.decommission for hosts logstash1012.eqiad.wmnet
  • 16:54 cwhite@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts logstash1010.eqiad.wmnet
  • 16:54 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:54 cwhite@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 16:53 cwhite@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 16:50 cwhite@cumin2002: START - Cookbook sre.dns.netbox
  • 16:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup2003.codfw.wmnet with OS bookworm
  • 16:47 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 16:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 100%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P59001 and previous config saved to /var/cache/conftool/dbconfig/20240328-164639-arnaudb.json
  • 16:45 cwhite@cumin2002: START - Cookbook sre.hosts.decommission for hosts logstash1010.eqiad.wmnet
  • 16:45 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov1006.eqiad.wmnet with OS bullseye
  • 16:43 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 16:39 btullis@deploy1002: Finished deploy [analytics/refinery@9c2ca38]: Regular analytics weekly train [analytics/refinery@9c2ca387] (duration: 09m 13s)
  • 16:33 ryankemper@cumin2002: START - Cookbook sre.hosts.decommission for hosts elastic[2052-2054].codfw.wmnet
  • 16:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 75%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58998 and previous config saved to /var/cache/conftool/dbconfig/20240328-163132-arnaudb.json
  • 16:30 btullis@deploy1002: Started deploy [analytics/refinery@9c2ca38]: Regular analytics weekly train [analytics/refinery@9c2ca387]
  • 16:26 btullis@deploy1002: Finished deploy [analytics/refinery@9c2ca38]: Regular analytics weekly train [analytics/refinery@9c2ca387] (duration: 02m 46s)
  • 16:26 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudbackup2003.codfw.wmnet with reason: host reimage
  • 16:26 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudbackup2003.codfw.wmnet with reason: host reimage
  • 16:25 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 16:25 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 16:23 btullis@deploy1002: Started deploy [analytics/refinery@9c2ca38]: Regular analytics weekly train [analytics/refinery@9c2ca387]
  • 16:17 brennen@deploy1002: Finished scap: testwikis wikis to 1.42.0-wmf.24 refs T360156 (duration: 14m 17s)
  • 16:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 50%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58997 and previous config saved to /var/cache/conftool/dbconfig/20240328-161627-arnaudb.json
  • 16:03 brennen@deploy1002: Started scap: testwikis wikis to 1.42.0-wmf.24 refs T360156
  • 16:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 25%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58994 and previous config saved to /var/cache/conftool/dbconfig/20240328-160121-arnaudb.json
  • 16:00 cwhite@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts logstash1010.eqiad.wmnet
  • 16:00 cwhite@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 15:57 cwhite@cumin2002: START - Cookbook sre.dns.netbox
  • 15:56 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 15:55 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 15:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db1200 (re)pooling @ 100%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58993 and previous config saved to /var/cache/conftool/dbconfig/20240328-155537-arnaudb.json
  • 15:55 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 15:55 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 15:53 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudbackup2003.codfw.wmnet with OS bookworm
  • 15:51 cwhite@cumin2002: START - Cookbook sre.hosts.decommission for hosts logstash1010.eqiad.wmnet
  • 15:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 15%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58992 and previous config saved to /var/cache/conftool/dbconfig/20240328-154615-arnaudb.json
  • 15:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db1200 (re)pooling @ 75%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58991 and previous config saved to /var/cache/conftool/dbconfig/20240328-154031-arnaudb.json
  • 15:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 10%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58990 and previous config saved to /var/cache/conftool/dbconfig/20240328-153109-arnaudb.json
  • 15:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db1200 (re)pooling @ 50%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58989 and previous config saved to /var/cache/conftool/dbconfig/20240328-152525-arnaudb.json
  • 15:23 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host dbprov1006.eqiad.wmnet with OS bullseye
  • 15:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 5%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58987 and previous config saved to /var/cache/conftool/dbconfig/20240328-151603-arnaudb.json
  • 15:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db1200 (re)pooling @ 25%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58986 and previous config saved to /var/cache/conftool/dbconfig/20240328-151019-arnaudb.json
  • 15:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2157.codfw.wmnet with OS bookworm
  • 14:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db1200 (re)pooling @ 16%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58985 and previous config saved to /var/cache/conftool/dbconfig/20240328-145514-arnaudb.json
  • 14:51 klausman@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:47 pfischer@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:47 pfischer@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2157.codfw.wmnet with reason: host reimage
  • 14:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db1200 (re)pooling @ 8%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58984 and previous config saved to /var/cache/conftool/dbconfig/20240328-144008-arnaudb.json
  • 14:38 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2157.codfw.wmnet with reason: host reimage
  • 14:30 brouberol@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:26 brouberol@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db1200 (re)pooling @ 4%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58983 and previous config saved to /var/cache/conftool/dbconfig/20240328-142502-arnaudb.json
  • 14:21 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2157.codfw.wmnet with OS bookworm
  • 14:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2157.codfw.wmnet with reason: T360116
  • 14:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2157.codfw.wmnet with reason: T360116
  • 14:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool to reimage db2157 (T360116)', diff saved to https://phabricator.wikimedia.org/P58982 and previous config saved to /var/cache/conftool/dbconfig/20240328-141844-arnaudb.json
  • 14:15 dcausse: re-enabling puppet on wdqs1013
  • 14:10 Dreamy_Jazz: Afternoon UTC backport window done
  • 14:10 brouberol@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:10 dreamyjazz@deploy1002: Finished scap: Backport for Move checkuser grant configuration to CheckUser extension (T359537) (duration: 16m 08s)
  • 14:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db1200 (re)pooling @ 2%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58981 and previous config saved to /var/cache/conftool/dbconfig/20240328-140956-arnaudb.json
  • 14:08 brouberol@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:05 brouberol@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:02 brouberol@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:01 klausman@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:01 brouberol@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:00 brouberol@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 13:58 dreamyjazz@deploy1002: tgr and dreamyjazz: Continuing with sync
  • 13:56 dreamyjazz@deploy1002: tgr and dreamyjazz: Backport for Move checkuser grant configuration to CheckUser extension (T359537) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db1200 (re)pooling @ 1%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58980 and previous config saved to /var/cache/conftool/dbconfig/20240328-135450-arnaudb.json
  • 13:54 dreamyjazz@deploy1002: Started scap: Backport for Move checkuser grant configuration to CheckUser extension (T359537)
  • 13:45 dreamyjazz@deploy1002: Finished scap: Backport for Add setting to determine if CampaignEvents should use the global DB (T348281), Add virtual domain mapping for CampaignEvents (prod) (T348281), Add virtual domain mapping for CampaignEvents (beta) (T348281) (duration: 18m 49s)
  • 13:42 brouberol@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 13:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1200.eqiad.wmnet with OS bookworm
  • 13:35 klausman@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 13:34 dreamyjazz@deploy1002: dreamyjazz: Continuing with sync
  • 13:29 dreamyjazz@deploy1002: dreamyjazz: Backport for Add setting to determine if CampaignEvents should use the global DB (T348281), Add virtual domain mapping for CampaignEvents (prod) (T348281), Add virtual domain mapping for CampaignEvents (beta) (T348281) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:27 dreamyjazz@deploy1002: Started scap: Backport for Add setting to determine if CampaignEvents should use the global DB (T348281), Add virtual domain mapping for CampaignEvents (prod) (T348281), Add virtual domain mapping for CampaignEvents (beta) (T348281)
  • 13:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1200.eqiad.wmnet with reason: host reimage
  • 13:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1200.eqiad.wmnet with reason: host reimage
  • 13:18 mabualruz@deploy1002: Finished scap: Backport for Revert donatewiki and thankyouwiki for fundraising (T360628) (duration: 15m 55s)
  • 13:17 dcausse: repooling wdqs2009 (test query rate when depooled T360993)
  • 13:07 dcausse: temporarily depooling wdqs2009 (test query rate when depooled T360993)
  • 13:07 mabualruz@deploy1002: ksarabia and mabualruz: Continuing with sync
  • 13:06 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1200.eqiad.wmnet with OS bookworm
  • 13:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1200.eqiad.wmnet with reason: Silence for reimaging
  • 13:05 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1200.eqiad.wmnet with reason: Silence for reimaging
  • 13:05 mabualruz@deploy1002: ksarabia and mabualruz: Backport for Revert donatewiki and thankyouwiki for fundraising (T360628) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:02 mabualruz@deploy1002: Started scap: Backport for Revert donatewiki and thankyouwiki for fundraising (T360628)
  • 12:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool to reimage db1200', diff saved to https://phabricator.wikimedia.org/P58978 and previous config saved to /var/cache/conftool/dbconfig/20240328-125721-arnaudb.json
  • 12:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 100%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58977 and previous config saved to /var/cache/conftool/dbconfig/20240328-122701-arnaudb.json
  • 12:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P58976 and previous config saved to /var/cache/conftool/dbconfig/20240328-122628-ladsgroup.json
  • 12:14 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 12:12 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 12:12 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 12:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 75%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58975 and previous config saved to /var/cache/conftool/dbconfig/20240328-121155-arnaudb.json
  • 12:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P58974 and previous config saved to /var/cache/conftool/dbconfig/20240328-121122-ladsgroup.json
  • 12:10 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 12:04 claime: trafficserver: move 65% of traffic to mw on k8s - T360763
  • 11:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 50%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58973 and previous config saved to /var/cache/conftool/dbconfig/20240328-115649-arnaudb.json
  • 11:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P58972 and previous config saved to /var/cache/conftool/dbconfig/20240328-115616-ladsgroup.json
  • 11:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 25%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58971 and previous config saved to /var/cache/conftool/dbconfig/20240328-114144-arnaudb.json
  • 11:41 mvolz@deploy1002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 11:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P58970 and previous config saved to /var/cache/conftool/dbconfig/20240328-114110-ladsgroup.json
  • 11:40 mvolz@deploy1002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 11:40 mvolz@deploy1002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 11:39 mvolz@deploy1002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 11:36 mvolz@deploy1002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 11:36 mvolz@deploy1002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 11:32 jayme: deployed helmfile.d/admin to staging-codfw,staging-eqiad,codfw,eqiad
  • 11:31 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["76082583"]' 2>&1 | tee -a ~/T315510-enwiki-4; date
  • 11:25 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 11:22 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 11:15 claime: enabling and running puppet on P:restbase - T358213
  • 11:12 claime: enabling and running puppet on restbase1035.eqiad.wmnet - T358213
  • 11:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 8%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58968 and previous config saved to /var/cache/conftool/dbconfig/20240328-111132-arnaudb.json
  • 11:09 claime: enabling and running puppet on restbase2021.codfw.wmnet - T358213
  • 11:04 claime: Disabling puppet on P:restbase - T358213
  • 11:04 claime: RESTbase: Migrate backend traffic to mw-api-int - T358213
  • 11:03 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 11:01 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 11:01 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 10:59 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 10:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 4%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58967 and previous config saved to /var/cache/conftool/dbconfig/20240328-105626-arnaudb.json
  • 10:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 2%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58966 and previous config saved to /var/cache/conftool/dbconfig/20240328-104121-arnaudb.json
  • 10:38 pfischer@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:34 pfischer@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:31 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 10:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 1%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P58965 and previous config saved to /var/cache/conftool/dbconfig/20240328-102615-arnaudb.json
  • 10:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2178.codfw.wmnet with OS bookworm
  • 10:01 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2178.codfw.wmnet with reason: host reimage
  • 09:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2178.codfw.wmnet with reason: host reimage
  • 09:44 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4046.ulsfo.wmnet
  • 09:44 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4046.ulsfo.wmnet
  • 09:41 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2178.codfw.wmnet with OS bookworm
  • 09:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2178.codfw.wmnet with reason: Silence for reimaging
  • 09:40 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2178.codfw.wmnet with reason: Silence for reimaging
  • 09:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool to reimage db2178', diff saved to https://phabricator.wikimedia.org/P58963 and previous config saved to /var/cache/conftool/dbconfig/20240328-093424-arnaudb.json
  • 09:30 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4038.ulsfo.wmnet
  • 09:30 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4046.ulsfo.wmnet
  • 09:27 fabfur: temp depooled cp4038 and cp4046 to install benthos (T358109)
  • 09:26 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4046.ulsfo.wmnet
  • 09:26 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4038.ulsfo.wmnet
  • 08:29 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 08:27 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 08:20 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 08:19 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Verify Upgrade cookbook on GitLab Replica
  • 08:18 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Verify Upgrade cookbook on GitLab Replica
  • 07:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 2:00:00 on db[2115,2215].codfw.wmnet with reason: Downtime until tuesday (T361133)
  • 07:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 5 days, 2:00:00 on db[2115,2215].codfw.wmnet with reason: Downtime until tuesday (T361133)
  • 06:33 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 06:32 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 06:29 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 06:28 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 04:11 tstarling@deploy1002: Finished scap: Backport for Hooks: restore respect of $wgCodeMirrorLineNumberingNamespaces in CM5 (T347211) (duration: 14m 30s)
  • 03:59 tstarling@deploy1002: tstarling: Continuing with sync
  • 03:59 tstarling@deploy1002: tstarling: Backport for Hooks: restore respect of $wgCodeMirrorLineNumberingNamespaces in CM5 (T347211) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 03:56 tstarling@deploy1002: Started scap: Backport for Hooks: restore respect of $wgCodeMirrorLineNumberingNamespaces in CM5 (T347211)
  • 01:48 tstarling@deploy1002: Finished scap: Backport for Fix index usage when searching for page titles (T360865), Fix index usage when searching for page titles (T360865) (duration: 20m 04s)
  • 01:36 tstarling@deploy1002: tstarling: Continuing with sync
  • 01:32 tstarling@deploy1002: tstarling: Backport for Fix index usage when searching for page titles (T360865), Fix index usage when searching for page titles (T360865) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 01:27 tstarling@deploy1002: Started scap: Backport for Fix index usage when searching for page titles (T360865), Fix index usage when searching for page titles (T360865)
  • 01:08 dzahn@cumin1002: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1004.wikimedia.org with reason: security release
  • 00:38 ladsgroup@deploy1002: Finished scap: Backport for Avoid left join when getting templates needing review (T361166) (duration: 14m 56s)
  • 00:26 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 00:25 ladsgroup@deploy1002: ladsgroup: Backport for Avoid left join when getting templates needing review (T361166) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 00:23 ladsgroup@deploy1002: Started scap: Backport for Avoid left join when getting templates needing review (T361166)
  • 00:20 ebernhardson@deploy1002: Finished scap: Backport for cirrus: Move small wiki traffic to eqiad (take two) (duration: 13m 30s)
  • 00:14 ryankemper@cumin2002: START - Cookbook sre.hosts.decommission for hosts elastic[2050-2054].codfw.wmnet
  • 00:09 ebernhardson@deploy1002: ebernhardson: Continuing with sync
  • 00:08 ebernhardson@deploy1002: ebernhardson: Backport for cirrus: Move small wiki traffic to eqiad (take two) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 00:06 ebernhardson@deploy1002: Started scap: Backport for cirrus: Move small wiki traffic to eqiad (take two)
  • 00:05 ebernhardson@deploy1002: Finished scap: Backport for cirrus: Move small wiki traffic to eqiad (duration: 15m 27s)

2024-03-27

  • 23:58 ryankemper: T360993 [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there are no relevant criticals in Icinga, and Grafana looks good
  • 23:53 ebernhardson@deploy1002: ebernhardson: Continuing with sync
  • 23:53 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 23:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 23:52 ebernhardson@deploy1002: ebernhardson: Backport for cirrus: Move small wiki traffic to eqiad synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:49 ebernhardson@deploy1002: Started scap: Backport for cirrus: Move small wiki traffic to eqiad
  • 23:26 dzahn@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: security release
  • 23:21 TimStarling: on releases1003: uploaded 80 missing old MediaWiki releases T190369
  • 23:15 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1026.eqiad.wmnet with reason: Decommissioning — T354561
  • 23:15 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1026.eqiad.wmnet with reason: Decommissioning — T354561
  • 23:04 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host elastic2088.codfw.wmnet with OS bullseye
  • 22:30 ryankemper: T360993 [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
  • 22:30 ryankemper: T360993 [WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
  • 22:30 ryankemper: T360993 [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
  • 22:28 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@143ca33]: 0.3.138 (duration: 11m 24s)
  • 22:17 ryankemper: T360993 [WDQS Deploy] Tests passing following deploy of `0.3.138` on canary `wdqs1003`; proceeding to rest of fleet
  • 22:17 ryankemper@deploy1002: Started deploy [wdqs/wdqs@143ca33]: 0.3.138
  • 22:16 ryankemper: T360993 [WDQS Deploy] Gearing up for deploy of wdqs `0.3.138`. Pre-deploy tests passing on canary `wdqs1003`
  • 21:46 bking@cumin2002: START - Cookbook sre.hosts.decommission for hosts elastic[2038-2048,2050-2054].codfw.wmnet
  • 21:41 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
  • 20:26 jhuneidi@deploy1002: Finished scap: Backport for Scope temp user reserved pattern to temp users (T361021 T349506), Updates config to deploy vector 2022 (T360628) (duration: 18m 57s)
  • 20:15 jhuneidi@deploy1002: ksarabia and jhuneidi and tchanders: Continuing with sync
  • 20:10 jhuneidi@deploy1002: ksarabia and jhuneidi and tchanders: Backport for Scope temp user reserved pattern to temp users (T361021 T349506), Updates config to deploy vector 2022 (T360628) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:08 jhuneidi@deploy1002: Started scap: Backport for Scope temp user reserved pattern to temp users (T361021 T349506), Updates config to deploy vector 2022 (T360628)
  • 19:41 mutante: ticket.wikimedia.org - replacing envoy cert on backends
  • 18:54 jynus: increasing volume size of backup2011 T334069
  • 18:38 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.42.0-wmf.24 refs T360156 (duration: 12m 38s)
  • 18:34 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov1006.eqiad.wmnet with OS bullseye
  • 18:25 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.24 refs T360156
  • 17:12 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host dbprov1006.eqiad.wmnet with OS bullseye
  • 16:38 Emperor: depool and restart swift-proxy on ms-fe2013 then repool T360913
  • 16:37 Emperor: depool and restart swift-proxy on ms-fe2012 then repool T360913
  • 16:37 Emperor: depool and restart swift-proxy on ms-fe2011 then repool T360913
  • 16:34 Emperor: restart swift-proxy on ms-fe2010 then repool T360913
  • 16:31 Emperor: depool and restart swift-proxy on moss-fe2001 then repool T360913
  • 16:28 denisse@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host alert2001.wikimedia.org
  • 16:22 denisse@cumin2002: START - Cookbook sre.puppet.migrate-host for host alert2001.wikimedia.org
  • 16:21 denisse@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host alert2001.wikimedia.org
  • 16:21 denisse@cumin2002: START - Cookbook sre.puppet.migrate-host for host alert2001.wikimedia.org
  • 16:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 22:00:00 on db[2115,2215].codfw.wmnet with reason: Downtime for analysis
  • 16:12 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 22:00:00 on db[2115,2215].codfw.wmnet with reason: Downtime for analysis
  • 16:10 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:09 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:08 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:07 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:06 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:05 jayme@deploy1002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:55 inflatador: bking@cumin2002 running puppet against A:wdqs-main to apply nginx changes T360993
  • 15:53 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
  • 15:53 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop: apply
  • 15:51 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12 days, 0:00:00 on elastic2038.codfw.wmnet with reason: T358882
  • 15:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 12 days, 0:00:00 on elastic2038.codfw.wmnet with reason: T358882
  • 15:51 arnaudb@cumin1002: END (ERROR) - Cookbook sre.mysql.clone (exit_code=97) Will create a clone of db2115.codfw.wmnet onto db2215.codfw.wmnet
  • 15:51 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 15:51 claime: 50% of backend RESTbase traffic to mw-api-int - T358213
  • 15:50 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 15:50 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 15:50 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 15:43 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 15:43 jayme@deploy1002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 15:35 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@1a343bf] (releasing): deploying fix for T361084 to all targets (duration: 01m 03s)
  • 15:34 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@1a343bf] (releasing): deploying fix for T361084 to all targets
  • 15:33 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@1a343bf] (releasing): deploying fix for T361084 to all targets (duration: 00m 19s)
  • 15:33 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@1a343bf] (releasing): deploying fix for T361084 to all targets
  • 15:23 brouberol@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-tool1009.eqiad.wmnet
  • 15:23 brouberol@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:23 brouberol@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-tool1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 15:23 logmsgbot: andrewtavis-wmde@deploy1002 Finished deploy [airflow-dags/wmde@36dee63]: (no justification provided) (duration: 00m 08s)
  • 15:23 logmsgbot: andrewtavis-wmde@deploy1002 Started deploy [airflow-dags/wmde@36dee63]: (no justification provided)
  • 15:21 brouberol@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-tool1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin2002"
  • 15:19 brouberol@cumin2002: START - Cookbook sre.dns.netbox
  • 15:17 claime: enabling and running puppet on P:restbase - T358213
  • 15:14 brouberol@cumin2002: START - Cookbook sre.hosts.decommission for hosts an-tool1009.eqiad.wmnet
  • 15:14 claime: enabling and running puppet on restbase1035.eqiad.wmnet - T358213
  • 15:12 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@1a343bf] (releasing): testing fix for T361084 (duration: 00m 20s)
  • 15:12 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@1a343bf] (releasing): testing fix for T361084
  • 15:11 claime: enabling and running puppet on restbase2021.codfw.wmnet - T358213
  • 15:08 claime: Disabling puppet on P:restbase - T358213
  • 14:54 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/proton: apply
  • 14:52 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/proton: apply
  • 14:45 effie: Day 8: Pool codfw for user traffic - T357547
  • 14:42 jiji@cumin1002: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) status all services in all: None - None
  • 14:42 jiji@cumin1002: START - Cookbook sre.discovery.datacenter status all services in all: None - None
  • 14:41 jiji@cumin1002: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: Pool active/active services on codfw - T357547
  • 14:21 jiji@cumin1002: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: Pool active/active services on codfw - T357547
  • 14:19 effie: Day 8: Pool active/active services on codfw - T357547
  • 14:17 Dreamy_Jazz: Afternoon UTC backport window done (extended by 17 mins)
  • 14:17 dreamyjazz@deploy1002: Finished scap: Backport for Prevent new user names matching the temporary account pattern (T361021 T349506) (duration: 14m 28s)
  • 14:09 jiji@cumin1002: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) status all services in all: None - None
  • 14:09 jiji@cumin1002: START - Cookbook sre.discovery.datacenter status all services in all: None - None
  • 14:06 dreamyjazz@deploy1002: dreamyjazz and tchanders: Continuing with sync
  • 14:05 dreamyjazz@deploy1002: dreamyjazz and tchanders: Backport for Prevent new user names matching the temporary account pattern (T361021 T349506) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 100%: Post clone (dst)', diff saved to https://phabricator.wikimedia.org/P58957 and previous config saved to /var/cache/conftool/dbconfig/20240327-140334-arnaudb.json
  • 14:02 dreamyjazz@deploy1002: Started scap: Backport for Prevent new user names matching the temporary account pattern (T361021 T349506)
  • 13:59 dreamyjazz@deploy1002: Finished scap: Backport for Removing MachineVision events, extension is being sunsetted (T347970), Sunsetting MachineVision extension, so remove config (T352884) (duration: 50m 49s)
  • 13:59 godog: bounce prometheus@k8s-aux in eqiad - T343529
  • 13:57 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1096.eqiad.wmnet with OS bullseye
  • 13:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 75%: Post clone (dst)', diff saved to https://phabricator.wikimedia.org/P58956 and previous config saved to /var/cache/conftool/dbconfig/20240327-134828-arnaudb.json
  • 13:41 dreamyjazz@deploy1002: dreamyjazz and cparle: Continuing with sync
  • 13:41 dreamyjazz@deploy1002: dreamyjazz and cparle: Backport for Removing MachineVision events, extension is being sunsetted (T347970), Sunsetting MachineVision extension, so remove config (T352884) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 50%: Post clone (dst)', diff saved to https://phabricator.wikimedia.org/P58955 and previous config saved to /var/cache/conftool/dbconfig/20240327-133322-arnaudb.json
  • 13:32 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1096.eqiad.wmnet with reason: host reimage
  • 13:31 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2115.codfw.wmnet onto db2215.codfw.wmnet
  • 13:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2115 in db2215 for T355422', diff saved to https://phabricator.wikimedia.org/P58954 and previous config saved to /var/cache/conftool/dbconfig/20240327-133015-arnaudb.json
  • 13:29 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1096.eqiad.wmnet with reason: host reimage
  • 13:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: provisioning db2215.codfw.wmnet - T355422
  • 13:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: provisioning db2215.codfw.wmnet - T355422
  • 13:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2115.codfw.wmnet with reason: provisioning db2215.codfw.wmnet - T355422
  • 13:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2115.codfw.wmnet with reason: provisioning db2215.codfw.wmnet - T355422
  • 13:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 25%: Post clone (dst)', diff saved to https://phabricator.wikimedia.org/P58952 and previous config saved to /var/cache/conftool/dbconfig/20240327-131816-arnaudb.json
  • 13:17 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@9df0d43] (releasing): (no justification provided) (duration: 00m 20s)
  • 13:17 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@9df0d43] (releasing): (no justification provided)
  • 13:15 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1096.eqiad.wmnet with OS bullseye
  • 13:09 dreamyjazz@deploy1002: Started scap: Backport for Removing MachineVision events, extension is being sunsetted (T347970), Sunsetting MachineVision extension, so remove config (T352884)
  • 13:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 20%: Post clone (dst)', diff saved to https://phabricator.wikimedia.org/P58951 and previous config saved to /var/cache/conftool/dbconfig/20240327-130310-arnaudb.json
  • 13:02 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maint T352010
  • 13:02 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maint T352010
  • 12:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 15%: Post clone (dst)', diff saved to https://phabricator.wikimedia.org/P58950 and previous config saved to /var/cache/conftool/dbconfig/20240327-124805-arnaudb.json
  • 12:37 brouberol@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 12:37 brouberol@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 12:36 brouberol@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 12:36 brouberol@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 12:36 brouberol@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:35 brouberol@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 12:35 brouberol@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 12:34 brouberol@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 12:33 brouberol: redeploying external-services in all k8s clusters to account for the newly exposed ml-cassandra cluster - T360428
  • 12:33 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 10%: Post clone (dst)', diff saved to https://phabricator.wikimedia.org/P58949 and previous config saved to /var/cache/conftool/dbconfig/20240327-123258-arnaudb.json
  • 12:32 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:30 brouberol@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:30 brouberol@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:28 brouberol@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 12:27 brouberol@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 12:26 brouberol@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:26 brouberol@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 12:25 brouberol@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 12:24 brouberol@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 12:21 arnaudb@cumin1002: dbctl commit (dc=all): 'db2119 (re)pooling @ 100%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58948 and previous config saved to /var/cache/conftool/dbconfig/20240327-122136-arnaudb.json
  • 12:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 5%: Post clone (dst)', diff saved to https://phabricator.wikimedia.org/P58947 and previous config saved to /var/cache/conftool/dbconfig/20240327-121752-arnaudb.json
  • 12:15 sukhe: running authdns-update to repool esams
  • 12:07 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=esams,cluster=cache_text
  • 12:06 arnaudb@cumin1002: dbctl commit (dc=all): 'db2119 (re)pooling @ 75%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58946 and previous config saved to /var/cache/conftool/dbconfig/20240327-120630-arnaudb.json
  • 12:04 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 8 hosts
  • 12:04 sukhe@cumin2002: START - Cookbook sre.hosts.remove-downtime for 8 hosts
  • 11:51 arnaudb@cumin1002: dbctl commit (dc=all): 'db2119 (re)pooling @ 50%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58945 and previous config saved to /var/cache/conftool/dbconfig/20240327-115125-arnaudb.json
  • 11:44 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=registry2004.codfw.wmnet
  • 11:41 elukey: run `apt-get clean` on registry2004 to free some space on the root partition
  • 11:39 elukey@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM registry2004.codfw.wmnet
  • 11:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db2120 (re)pooling @ 100%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58944 and previous config saved to /var/cache/conftool/dbconfig/20240327-113911-arnaudb.json
  • 11:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db2120 (re)pooling @ 75%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58942 and previous config saved to /var/cache/conftool/dbconfig/20240327-112405-arnaudb.json
  • 11:16 elukey@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM registry2003.codfw.wmnet
  • 11:15 elukey: expand vram for registry200[3,4] from 4G to 6G - T360637
  • 11:13 brouberol@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:13 brouberol@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 11:12 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=registry2003.codfw.wmnet
  • 11:11 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on registry2004.codfw.wmnet with reason: Increase tmpfs for nginx
  • 11:11 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on registry2004.codfw.wmnet with reason: Increase tmpfs for nginx
  • 11:11 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on registry2003.codfw.wmnet with reason: Increase tmpfs for nginx
  • 11:10 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on registry2003.codfw.wmnet with reason: Increase tmpfs for nginx
  • 11:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db2120 (re)pooling @ 50%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58941 and previous config saved to /var/cache/conftool/dbconfig/20240327-110858-arnaudb.json
  • 10:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2120 (re)pooling @ 25%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58940 and previous config saved to /var/cache/conftool/dbconfig/20240327-105353-arnaudb.json
  • 10:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2120.codfw.wmnet onto db2220.codfw.wmnet
  • 10:31 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@661e531] (releasing): (no justification provided) (duration: 00m 40s)
  • 10:31 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@661e531] (releasing): (no justification provided)
  • 10:28 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on releases1003.eqiad.wmnet with reason: Troubleshooting jenkins update
  • 10:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on releases1003.eqiad.wmnet with reason: Troubleshooting jenkins update
  • 10:25 brouberol@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 10:25 brouberol@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 10:23 brouberol@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 10:23 brouberol@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 10:20 klausman@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:19 klausman@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 10:19 brouberol@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:19 brouberol@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 10:14 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@661e531] (releasing): (no justification provided) (duration: 01m 46s)
  • 10:12 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@661e531] (releasing): (no justification provided)
  • 10:11 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:10 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 10:10 brouberol@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:09 brouberol@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 10:04 jynus: powercycling backup1005
  • 10:02 jnuche@deploy1002: deploy aborted: (no justification provided) (duration: 01m 21s)
  • 10:01 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@661e531] (releasing): (no justification provided)
  • 09:45 sukhe: poweroff A:esams and A:cp-text
  • 09:22 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2120.codfw.wmnet onto db2220.codfw.wmnet
  • 09:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2120 in db2220 for T355422', diff saved to https://phabricator.wikimedia.org/P58938 and previous config saved to /var/cache/conftool/dbconfig/20240327-092030-arnaudb.json
  • 09:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2220.codfw.wmnet with reason: provisionning db2220.codfw.wmnet - T355422
  • 09:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2220.codfw.wmnet with reason: provisionning db2220.codfw.wmnet - T355422
  • 09:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: provisionning db2220.codfw.wmnet - T355422
  • 09:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: provisionning db2220.codfw.wmnet - T355422
  • 09:15 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2119.codfw.wmnet onto db2219.codfw.wmnet
  • 09:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2119 in db2219 for T355422', diff saved to https://phabricator.wikimedia.org/P58937 and previous config saved to /var/cache/conftool/dbconfig/20240327-091444-arnaudb.json
  • 09:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: provisionning db2219.codfw.wmnet - T355422
  • 09:13 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: provisionning db2219.codfw.wmnet - T355422
  • 09:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: provisionning db2219.codfw.wmnet - T355422
  • 09:12 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: provisionning db2219.codfw.wmnet - T355422
  • 09:07 hashar: Downgraded release Jenkins back to 2.426.3
  • 08:54 hashar@deploy1002: Finished deploy [releng/jenkins-deploy@b3ccf85] (releasing): Upgrade Jenkins from 2.426.3 to 2.440.2 on release hosts # T360759 (duration: 05m 51s)
  • 08:48 hashar@deploy1002: Started deploy [releng/jenkins-deploy@b3ccf85] (releasing): Upgrade Jenkins from 2.426.3 to 2.440.2 on release hosts # T360759
  • 08:38 hashar: UTC morning backport window completed
  • 08:37 hashar@deploy1002: Finished scap: Backport for Add webrequest.frontend.rc0 stream (T314956 T351117) (duration: 20m 59s)
  • 08:25 hashar@deploy1002: otto and hashar: Continuing with sync
  • 08:20 hashar@deploy1002: otto and hashar: Backport for Add webrequest.frontend.rc0 stream (T314956 T351117) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:16 hashar@deploy1002: Started scap: Backport for Add webrequest.frontend.rc0 stream (T314956 T351117)
  • 07:14 fabfur@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 8 hosts with reason: preparing for new disk
  • 07:14 fabfur@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 8 hosts with reason: preparing for new disk
  • 07:11 kart_: Updated MinT to 2024-03-26-120044-production (T347930, T355304, T349487)
  • 07:09 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
  • 07:00 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
  • 06:57 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
  • 06:48 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
  • 06:38 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 06:32 kartik@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 05:57 fabfur: running authdns-update on dns1004 to depool ESAMS (T360430)
  • 04:55 eileen: civicrm upgraded from 143aa0bf to 2e0ac12f
  • 01:35 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic2037*,elastic2038*,elastic2041*,elastic2042*,elastic2045*,elastic2046*,elastic2047*,elastic2050*,elastic2051*,elastic2052*,elastic2039*,elastic2040*,elastic2043*,elastic2044*,elastic2048*,elastic2053*,elastic2054* for prepare for decom of hosts - ryankemper@cumin2002 - T358882
  • 01:35 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic2037*,elastic2038*,elastic2041*,elastic2042*,elastic2045*,elastic2046*,elastic2047*,elastic2050*,elastic2051*,elastic2052*,elastic2039*,elastic2040*,elastic2043*,elastic2044*,elastic2048*,elastic2053*,elastic2054* for prepare for decom of hosts - ryankemper@cumin2002 - T358882
  • 01:31 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbprov2005.codfw.wmnet with OS bullseye
  • 01:10 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov2005.codfw.wmnet with reason: host reimage
  • 01:07 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov2005.codfw.wmnet with reason: host reimage
  • 01:06 ryankemper: T358882 Updated remote cluster seeds for new master state
  • 01:06 ryankemper: [WDQS] Restarted `wdqs-blazegraph` and `wdqs-updater` on `wdqs1013` and depooled to catch up on lag
  • 00:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye

2024-03-26

  • 23:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T352010)', diff saved to https://phabricator.wikimedia.org/P58936 and previous config saved to /var/cache/conftool/dbconfig/20240326-234806-ladsgroup.json
  • 23:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 23:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 23:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T352010)', diff saved to https://phabricator.wikimedia.org/P58935 and previous config saved to /var/cache/conftool/dbconfig/20240326-234743-ladsgroup.json
  • 23:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P58934 and previous config saved to /var/cache/conftool/dbconfig/20240326-233235-ladsgroup.json
  • 23:30 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbprov2005.codfw.wmnet with OS bullseye
  • 23:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P58932 and previous config saved to /var/cache/conftool/dbconfig/20240326-231728-ladsgroup.json
  • 23:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T352010)', diff saved to https://phabricator.wikimedia.org/P58931 and previous config saved to /var/cache/conftool/dbconfig/20240326-230220-ladsgroup.json
  • 22:58 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
  • 22:56 reedy@deploy1002: Finished scap: SecurePoll PopulateEditCount fix (duration: 25m 49s)
  • 22:55 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
  • 22:39 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
  • 22:32 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
  • 22:32 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
  • 22:30 reedy@deploy1002: Started scap: SecurePoll PopulateEditCount fix
  • 22:24 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
  • 22:24 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
  • 22:20 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
  • 22:15 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 21:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: cycle some masters - ryankemper@cumin2002 - T358882
  • 21:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbprov2005.codfw.wmnet with OS bullseye
  • 21:38 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 21:07 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov1006.eqiad.wmnet with OS bullseye
  • 21:03 catrope@deploy1002: Finished scap: Backport for Add autopatrolled, rollbacker and suppressredirect user groups for ckbwiktionary (T360228) (duration: 17m 37s)
  • 20:57 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2112.codfw.wmnet with reason: Maintenance
  • 20:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2112.codfw.wmnet with reason: Maintenance
  • 20:54 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 17 hosts with reason: Maint T343718
  • 20:54 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 17 hosts with reason: Maint T343718
  • 20:53 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 20:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 20:51 catrope@deploy1002: aram and catrope: Continuing with sync
  • 20:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 12 hosts with reason: Maint T352010
  • 20:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 12 hosts with reason: Maint T352010
  • 20:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 20:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 20:49 catrope@deploy1002: aram and catrope: Backport for Add autopatrolled, rollbacker and suppressredirect user groups for ckbwiktionary (T360228) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 13 hosts with reason: Maint T343718
  • 20:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 13 hosts with reason: Maint T343718
  • 20:45 catrope@deploy1002: Started scap: Backport for Add autopatrolled, rollbacker and suppressredirect user groups for ckbwiktionary (T360228)
  • 20:43 catrope@deploy1002: Finished scap: Backport for CodexHTMLForm: Fix margins around links in login form (T360945) (duration: 22m 09s)
  • 20:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 20:41 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 20:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 13 hosts with reason: Maint T352010
  • 20:40 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on 13 hosts with reason: Maint T352010
  • 20:30 catrope@deploy1002: catrope: Continuing with sync
  • 20:27 catrope@deploy1002: catrope: Backport for CodexHTMLForm: Fix margins around links in login form (T360945) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:21 catrope@deploy1002: Started scap: Backport for CodexHTMLForm: Fix margins around links in login form (T360945)
  • 20:09 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: cycle some masters - ryankemper@cumin2002 - T358882
  • 19:48 mutante: phabricator - added GMikesell-WMF to WMF-NDA because that goes together with the wmf LDAP group (https://wikitech.wikimedia.org/wiki/SRE/Clinic_Duty/Access_requests#WMF_Group) - T358922
  • 19:46 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host dbprov1006.eqiad.wmnet with OS bullseye
  • 19:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:21 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:21 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:21 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 19:20 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:20 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov2005.codfw.wmnet with reason: host reimage
  • 19:02 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov2005.codfw.wmnet with reason: host reimage
  • 18:50 ejegg: donorwiki upgraded from 27d326b7 to c7f1325c
  • 18:47 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 18:47 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov1006.eqiad.wmnet with OS bullseye
  • 18:42 cstone: civicrm upgraded from b8a84b22 to 143aa0bf
  • 18:36 denisse@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host alert2001.wikimedia.org
  • 18:36 denisse@cumin2002: START - Cookbook sre.puppet.migrate-host for host alert2001.wikimedia.org
  • 18:34 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.24 refs T360156
  • 18:21 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 18:21 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 18:20 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 18:20 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 18:00 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov2005.codfw.wmnet with OS bullseye
  • 17:45 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 17:44 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 17:22 vriley@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['dbprov1006']
  • 17:22 vriley@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov1006']
  • 17:22 vriley@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['dbprov1006']
  • 17:22 vriley@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov1006']
  • 17:21 vriley@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['dbprov1006']
  • 17:21 vriley@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov1006']
  • 17:13 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host dbprov1006.eqiad.wmnet with OS bullseye
  • 17:06 jynus: add georgemikesell to wmf ldap group T358922
  • 16:54 jebe@deploy1002: Finished deploy [analytics/refinery@07a0290] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@07a0290a] (duration: 03m 38s)
  • 16:51 jebe@deploy1002: Started deploy [analytics/refinery@07a0290] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@07a0290a]
  • 16:50 jebe@deploy1002: Finished deploy [analytics/refinery@07a0290] (thin): Regular analytics weekly train THIN [analytics/refinery@07a0290a] (duration: 00m 04s)
  • 16:50 jebe@deploy1002: Started deploy [analytics/refinery@07a0290] (thin): Regular analytics weekly train THIN [analytics/refinery@07a0290a]
  • 16:50 jebe@deploy1002: Finished deploy [analytics/refinery@07a0290]: Regular analytics weekly train [analytics/refinery@07a0290a] (duration: 00m 14s)
  • 16:49 jebe@deploy1002: Started deploy [analytics/refinery@07a0290]: Regular analytics weekly train [analytics/refinery@07a0290a]
  • 16:40 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 16:39 sukhe: restart pybal on lvs2013 and lvs2014
  • 16:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 100%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58929 and previous config saved to /var/cache/conftool/dbconfig/20240326-163744-arnaudb.json
  • 16:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 75%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58928 and previous config saved to /var/cache/conftool/dbconfig/20240326-162238-arnaudb.json
  • 16:18 denisse: Importing karma 0.119 to reprepro - T333615
  • 16:12 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbprov2005.codfw.wmnet with OS bullseye
  • 16:09 sukhe: restart pybal on lvs2013
  • 16:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 50%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58927 and previous config saved to /var/cache/conftool/dbconfig/20240326-160733-arnaudb.json
  • 16:01 jebe@deploy1002: Finished deploy [analytics/refinery@07a0290]: Regular analytics weekly train [analytics/refinery@07a0290a] (duration: 01m 18s)
  • 16:00 jebe@deploy1002: Started deploy [analytics/refinery@07a0290]: Regular analytics weekly train [analytics/refinery@07a0290a]
  • 15:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 25%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58926 and previous config saved to /var/cache/conftool/dbconfig/20240326-155227-arnaudb.json
  • 15:46 jebe@deploy1002: Finished deploy [analytics/refinery@07a0290]: Regular analytics weekly train [analytics/refinery@07a0290a] (duration: 02m 26s)
  • 15:43 jebe@deploy1002: Started deploy [analytics/refinery@07a0290]: Regular analytics weekly train [analytics/refinery@07a0290a]
  • 15:43 jebe@deploy1002: Finished deploy [analytics/refinery@07a0290]: Regular analytics weekly train [analytics/refinery@07a0290a] (duration: 13m 05s)
  • 15:39 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-coord1002.eqiad.wmnet
  • 15:39 brouberol@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:39 brouberol@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-coord1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1002"
  • 15:39 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 15:37 brouberol@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-coord1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1002"
  • 15:35 gmodena@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 15:34 gmodena@deploy1002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 15:33 gmodena@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 15:33 gmodena@deploy1002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 15:32 brouberol@cumin1002: START - Cookbook sre.dns.netbox
  • 15:32 gmodena@deploy1002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 15:32 gmodena@deploy1002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 15:30 gmodena@deploy1002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
  • 15:30 gmodena@deploy1002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
  • 15:30 jebe@deploy1002: Started deploy [analytics/refinery@07a0290]: Regular analytics weekly train [analytics/refinery@07a0290a]
  • 15:28 gmodena@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 15:27 gmodena@deploy1002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 15:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:26 gmodena@deploy1002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 15:26 gmodena@deploy1002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 15:25 brouberol@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-coord1002.eqiad.wmnet
  • 15:23 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-coord1001.eqiad.wmnet
  • 15:23 brouberol@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:23 brouberol@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-coord1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1002"
  • 15:22 brouberol@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-coord1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1002"
  • 15:18 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:15 brouberol@cumin1002: START - Cookbook sre.dns.netbox
  • 15:09 brouberol@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-coord1001.eqiad.wmnet
  • 14:56 jnuche@deploy1002: Installation of scap version "4.73.2" completed for 364 hosts
  • 14:55 jnuche@deploy1002: Installing scap version "4.73.2" for 364 hosts
  • 14:26 jayme@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:25 jayme@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:25 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 14:24 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 14:24 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 14:22 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 14:21 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:21 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:20 claime: Deploying split listener for 10% of backend restbase traffic to mw-api-int - T358213
  • 14:19 claime: enabling and running puppet on P:restbase - T358213
  • 14:19 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:19 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:18 TheresNoTime: UTC afternoon backport window done
  • 14:15 samtar@deploy1002: Finished scap: Backport for dewiki: Enable mobile page tabs for everyone (T360246), frwiki: update legacy vector logo (T359741) (duration: 17m 23s)
  • 14:13 vriley@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts parse1014.eqiad.wmnet
  • 14:04 samtar@deploy1002: anzx and samtar: Continuing with sync
  • 14:03 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:02 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:02 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:01 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:01 vriley@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts parse1014.eqiad.wmnet
  • 14:00 samtar@deploy1002: anzx and samtar: Backport for dewiki: Enable mobile page tabs for everyone (T360246), frwiki: update legacy vector logo (T359741) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:58 samtar@deploy1002: Started scap: Backport for dewiki: Enable mobile page tabs for everyone (T360246), frwiki: update legacy vector logo (T359741)
  • 13:55 samtar@deploy1002: Finished scap: Backport for knwikisource, knwiktionary: update logo, wordmark (T360022) (duration: 15m 53s)
  • 13:43 samtar@deploy1002: anzx and samtar: Continuing with sync
  • 13:41 samtar@deploy1002: anzx and samtar: Backport for knwikisource, knwiktionary: update logo, wordmark (T360022) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T352010)', diff saved to https://phabricator.wikimedia.org/P58921 and previous config saved to /var/cache/conftool/dbconfig/20240326-133932-ladsgroup.json
  • 13:39 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:39 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:39 samtar@deploy1002: Started scap: Backport for knwikisource, knwiktionary: update logo, wordmark (T360022)
  • 13:39 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 13:38 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 13:36 samtar@deploy1002: Finished scap: Backport for Add throttle rule for editathon (T360533) (duration: 13m 40s)
  • 13:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:27 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dbprov2005 dns - pt1979@cumin2002"
  • 13:26 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dbprov2005 dns - pt1979@cumin2002"
  • 13:25 samtar@deploy1002: samtar and zoranzoki21: Continuing with sync
  • 13:25 samtar@deploy1002: samtar and zoranzoki21: Backport for Add throttle rule for editathon (T360533) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:22 samtar@deploy1002: Started scap: Backport for Add throttle rule for editathon (T360533)
  • 13:20 samtar@deploy1002: Finished scap: Backport for Update mediawiki.web_ui_actions stream config (T360955) (duration: 18m 03s)
  • 13:19 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 13:09 samtar@deploy1002: phuedx and samtar: Continuing with sync
  • 13:04 samtar@deploy1002: phuedx and samtar: Backport for Update mediawiki.web_ui_actions stream config (T360955) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:02 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:02 samtar@deploy1002: Started scap: Backport for Update mediawiki.web_ui_actions stream config (T360955)
  • 13:01 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 12:59 samtar@deploy1002: Finished scap: Backport for [officewiki, testwiki]: enable CodeMirrorV6 (T357795) (duration: 17m 40s)
  • 12:54 claime: enabling and running puppet on restbase1035.eqiad.wmnet - T358213
  • 12:54 claime: enabling and running puppet on restbase2021.codfw.wmnet - T358213
  • 12:48 TheresNoTime: noting that `host='mwdebug2001.codfw.wmnet', port=443): Read timed out.` during scap `check_testservers_baremetal`, retry worked P58919
  • 12:47 samtar@deploy1002: musikanimal and samtar: Continuing with sync
  • 12:46 samtar@deploy1002: musikanimal and samtar: Backport for [officewiki, testwiki]: enable CodeMirrorV6 (T357795) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:41 samtar@deploy1002: Started scap: Backport for [officewiki, testwiki]: enable CodeMirrorV6 (T357795)
  • 12:06 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:06 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:24 claime: enabling and running puppet on restbase1035.eqiad.wmnet - T358213
  • 11:19 claime: enabling and running puppet on restbase2021.codfw.wmnet - T358213
  • 11:15 claime: Stopping puppet on P:restbase to deploy 1005756 - T358213
  • 11:11 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 11:10 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 11:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 16 hosts with reason: Maint T343718
  • 11:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 16 hosts with reason: Maint T343718
  • 10:46 urbanecm@deploy1002: Finished scap: Backport for [beta] eswiki: Enable CommunityConfiguration extension (T357766), [beta] eswiki: Use CommunityConfiguration extension for GrowthExperiments (T357766) (duration: 13m 41s)
  • 10:43 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:39 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 10:38 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 10:37 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 10:35 urbanecm@deploy1002: urbanecm: Continuing with sync
  • 10:35 urbanecm@deploy1002: urbanecm: Backport for [beta] eswiki: Enable CommunityConfiguration extension (T357766), [beta] eswiki: Use CommunityConfiguration extension for GrowthExperiments (T357766) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:33 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset: apply
  • 10:33 urbanecm@deploy1002: Started scap: Backport for [beta] eswiki: Enable CommunityConfiguration extension (T357766), [beta] eswiki: Use CommunityConfiguration extension for GrowthExperiments (T357766)
  • 10:33 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply
  • 10:31 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 10:31 urbanecm@deploy1002: Finished scap: Backport for Add CommunityConfiguration extension (T357766), Add wmgUseCommunityConfiguration (T357766) (duration: 48m 57s)
  • 10:30 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 10:18 dcausse: stopping blazegraph on wdqs1013, (wdqs->wikidata maxlag propagation not working as expected)
  • 10:18 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:18 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 10:13 urbanecm@deploy1002: urbanecm: Continuing with sync
  • 10:13 urbanecm@deploy1002: urbanecm: Backport for Add CommunityConfiguration extension (T357766), Add wmgUseCommunityConfiguration (T357766) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:08 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-tool1010.eqiad.wmnet
  • 10:08 brouberol@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:08 brouberol@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-tool1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1002"
  • 10:01 brouberol@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-tool1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1002"
  • 09:42 urbanecm@deploy1002: Started scap: Backport for Add CommunityConfiguration extension (T357766), Add wmgUseCommunityConfiguration (T357766)
  • 09:38 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:38 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:37 brouberol@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:37 brouberol@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:12 brouberol@cumin1002: START - Cookbook sre.dns.netbox
  • 09:04 brouberol@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-tool1010.eqiad.wmnet
  • 09:02 hashar@deploy1002: Finished scap: Backport for zhwikivoyage: Enable NewUserMessage extension (T360175) (duration: 18m 29s)
  • 08:51 hashar@deploy1002: hashar and s8321414: Continuing with sync
  • 08:46 hashar@deploy1002: hashar and s8321414: Backport for zhwikivoyage: Enable NewUserMessage extension (T360175) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:44 hashar@deploy1002: Started scap: Backport for zhwikivoyage: Enable NewUserMessage extension (T360175)
  • 08:41 dcausse: depooling and restarting blazegraph on wdqs1013 (stuck for 2 days)
  • 08:41 kart_: Updated cxserver to 2024-03-21-114859-production (T353510)
  • 08:31 brouberol: I'm going to apply kafka log compaction for {eqiad,codfw}.mediawiki.currussearch.page_rerender.v1 on kafka-main-codfw only (current replica) - T354794
  • 08:28 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 08:28 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 08:25 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 08:24 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 08:24 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 08:23 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 08:19 kartik@deploy1002: Finished scap: Backport for Enable ContentTranslation by default for myvwiki (T353510) (duration: 15m 48s)
  • 08:19 brouberol: deleting AQS eqiad VIP (10.2.2.12/32) from Netbox - T358793
  • 08:18 brouberol: deleting AQS codfw VIP (10.2.1.12/32) from Netbox - T358793
  • 08:08 kartik@deploy1002: kartik: Continuing with sync
  • 08:06 kartik@deploy1002: kartik: Backport for Enable ContentTranslation by default for myvwiki (T353510) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:03 kartik@deploy1002: Started scap: Backport for Enable ContentTranslation by default for myvwiki (T353510)
  • 05:51 dancy@deploy1002: Finished scap: testwikis wikis to 1.42.0-wmf.24 refs T360156 (duration: 26m 52s)
  • 05:24 dancy@deploy1002: Started scap: testwikis wikis to 1.42.0-wmf.24 refs T360156
  • 03:06 mwpresync@deploy1002: Started scap: testwikis wikis to 1.42.0-wmf.24 refs T360156
  • 03:03 mwpresync@deploy1002: Pruned MediaWiki: 1.42.0-wmf.21 (duration: 03m 33s)
  • 01:10 zabe: zabe@mwmaint1002:~$ mwscript extensions/Translate/scripts/moveTranslatableBundle.php --wiki metawiki "Communications" "Wikimedia Foundation/Communications" "Zabe" --reason "per request T360970" # T360970
  • 01:05 denisse: Starting logrotate.service on logstash2003 - T153940
  • 01:05 denisse: Starting logrotate.service on logstash2003

2024-03-25

  • 22:36 tstarling@deploy1002: Finished scap: Backport for Special:BlockList: apply simpler conditions when listing user blocks (T360864) (duration: 14m 38s)
  • 22:36 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@530e786]: Keep analytics_test up to date [airflow-dags/analytics_test@530e7863] (duration: 00m 10s)
  • 22:36 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@530e786]: Keep analytics_test up to date [airflow-dags/analytics_test@530e7863]
  • 22:35 aqu@deploy1002: Finished deploy [airflow-dags/analytics@530e786]: Refine through Airflow POC [airflow-dags/analytics@530e7863] (duration: 00m 28s)
  • 22:34 aqu@deploy1002: Started deploy [airflow-dags/analytics@530e786]: Refine through Airflow POC [airflow-dags/analytics@530e7863]
  • 22:25 tstarling@deploy1002: tstarling: Continuing with sync
  • 22:24 tstarling@deploy1002: tstarling: Backport for Special:BlockList: apply simpler conditions when listing user blocks (T360864) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:22 tstarling@deploy1002: Started scap: Backport for Special:BlockList: apply simpler conditions when listing user blocks (T360864)
  • 22:04 cwhite@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts logstash1011.eqiad.wmnet
  • 22:04 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:04 cwhite@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 22:03 mutante: phabricator - added DBu-WMF (Danny Bu) to WMF-NDA - T356920
  • 22:02 cwhite@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 22:00 cwhite@cumin2002: START - Cookbook sre.dns.netbox
  • 21:54 cwhite@cumin2002: START - Cookbook sre.hosts.decommission for hosts logstash1011.eqiad.wmnet
  • 21:34 cjming: end of UTC late backport window
  • 21:33 cjming@deploy1002: Finished scap: Backport for cirrus: Transition remaining cloudelastic wikis to streaming updater (T358518) (duration: 13m 48s)
  • 21:22 cjming@deploy1002: ebernhardson and cjming: Continuing with sync
  • 21:22 cjming@deploy1002: ebernhardson and cjming: Backport for cirrus: Transition remaining cloudelastic wikis to streaming updater (T358518) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:19 cjming@deploy1002: Started scap: Backport for cirrus: Transition remaining cloudelastic wikis to streaming updater (T358518)
  • 21:17 cjming@deploy1002: Finished scap: Backport for Remove X-Webkit-CSP-Report-Only response header from foundationwiki (T357479) (duration: 14m 10s)
  • 21:06 mutante: Phabricator - added @Arian_Bozorg and @Fring to WMF-NDA group after confirming they have an NDA on file but had to be added to the legal spreadsheet (T358578)
  • 21:06 cjming@deploy1002: hartman and cjming: Continuing with sync
  • 21:05 cjming@deploy1002: hartman and cjming: Backport for Remove X-Webkit-CSP-Report-Only response header from foundationwiki (T357479) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:03 cjming@deploy1002: Started scap: Backport for Remove X-Webkit-CSP-Report-Only response header from foundationwiki (T357479)
  • 20:59 cjming@deploy1002: Finished scap: Backport for Cirrus: testcommonswiki only needs 1 shard (duration: 15m 05s)
  • 20:48 cjming@deploy1002: cjming and ebernhardson: Continuing with sync
  • 20:46 cjming@deploy1002: cjming and ebernhardson: Backport for Cirrus: testcommonswiki only needs 1 shard synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:45 urandom: depooling restbase10[19-21].eqiad.wmnet — T360597
  • 20:44 cjming@deploy1002: Started scap: Backport for Cirrus: testcommonswiki only needs 1 shard
  • 20:42 cjming@deploy1002: Finished scap: Backport for Guard against undefined $container element in initMobile.js (T360781) (duration: 17m 45s)
  • 20:33 urandom: pool restbase10[34-42] — T360597
  • 20:31 cjming@deploy1002: cjming and jdrewniak: Continuing with sync
  • 20:27 cjming@deploy1002: cjming and jdrewniak: Backport for Guard against undefined $container element in initMobile.js (T360781) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:25 cjming@deploy1002: Started scap: Backport for Guard against undefined $container element in initMobile.js (T360781)
  • 20:16 zabe: zabe@mwmaint1002:~$ mwscript namespaceDupes.php --wiki thwikibooks --move-talk --fix # T360715
  • 20:11 urandom: pool restbase10[31-33] — T360597
  • 20:09 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1042.eqiad.wmnet on all recursors
  • 20:09 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase1042.eqiad.wmnet on all recursors
  • 20:09 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1041.eqiad.wmnet on all recursors
  • 20:09 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase1041.eqiad.wmnet on all recursors
  • 20:09 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1040.eqiad.wmnet on all recursors
  • 20:09 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase1040.eqiad.wmnet on all recursors
  • 20:09 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1039.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase1039.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1038.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase1038.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1037.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase1037.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1036.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase1036.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1035.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase1035.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1034.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase1034.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1033.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase1033.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1032.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase1032.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1031.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase1031.eqiad.wmnet on all recursors
  • 20:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2035.codfw.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2035.codfw.wmnet on all recursors
  • 20:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2034.codfw.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2034.codfw.wmnet on all recursors
  • 20:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2033.codfw.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2033.codfw.wmnet on all recursors
  • 20:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2032.codfw.wmnet on all recursors
  • 20:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2032.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2031.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2031.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2030.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2030.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2029.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2029.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2028.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:07 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove restbase node IPv6 dns records - cmooney@cumin1002"
  • 20:07 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2028.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2027.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2027.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2026.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2026.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2025.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2025.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2024.codfw.wmnet on all recursors
  • 20:07 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2024.codfw.wmnet on all recursors
  • 20:06 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove restbase node IPv6 dns records - cmooney@cumin1002"
  • 20:06 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase2024.codfw.wmnet on all recursors
  • 20:05 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache restbase2024.codfw.wmnet on all recursors
  • 20:04 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 20:02 mutante: deploying change to prometheus-apache-exporter that will make it work on all distro versions incl bookworm, due to changed argument syntax
  • 19:54 zabe: Remove wikibase-otherprojects from user preferences (user_properties) # T342264
  • 19:29 sukhe: depool elastic2037: host is pooled but decommed
  • 19:28 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2037.codfw.wmnet
  • 19:28 brouberol: removing VIP from AQS hosts - T358793
  • 19:02 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1025.eqiad.wmnet with reason: Decommissioning — T354561
  • 19:02 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1025.eqiad.wmnet with reason: Decommissioning — T354561
  • 18:46 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts dbprov2005.codfw.wmnet
  • 18:46 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:46 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
  • 18:45 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
  • 18:45 eoghan@cumin1002: END (FAIL) - Cookbook sre.gitlab.failover (exit_code=93) Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 18:44 sukhe: sudo cumin -b1 -s60 "A:dns-rec and not P{dns6001*}" "run-puppet-agent --enable 'merging CR 1013382'"
  • 18:42 eoghan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 'https://gitlab-replica.wikimedia.org/ https://gitlab-replica-old.wikimedia.org/' on all recursors
  • 18:42 eoghan@cumin1002: START - Cookbook sre.dns.wipe-cache 'https://gitlab-replica.wikimedia.org/ https://gitlab-replica-old.wikimedia.org/' on all recursors
  • 18:40 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 18:35 pt1979@cumin2002: START - Cookbook sre.hosts.decommission for hosts dbprov2005.codfw.wmnet
  • 18:34 sukhe: sudo cumin "A:dnsbox" "disable-puppet 'merging CR 1013382'"
  • 18:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov2005.codfw.wmnet with OS bullseye
  • 18:27 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 18:27 bearloga@deploy1002: Finished deploy [airflow-dags/analytics_product@5e40c6f]: (no justification provided) (duration: 00m 08s)
  • 18:26 bearloga@deploy1002: Started deploy [airflow-dags/analytics_product@5e40c6f]: (no justification provided)
  • 18:00 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbprov2006.codfw.wmnet with OS bullseye
  • 18:00 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 17:50 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudbackup2003.codfw.wmnet with OS bookworm
  • 17:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2116.codfw.wmnet onto db2216.codfw.wmnet
  • 17:22 jgiannelos@deploy1002: Finished deploy [restbase/deploy@897fc7e]: Deploy latest restbase commit to restbase1031 (duration: 01m 26s)
  • 17:21 jgiannelos@deploy1002: Started deploy [restbase/deploy@897fc7e]: Deploy latest restbase commit to restbase1031
  • 17:15 jgiannelos@deploy1002: Finished deploy [restbase/deploy@897fc7e]: Deploy latest restbase commit to restbase1024 (duration: 01m 22s)
  • 17:13 jgiannelos@deploy1002: Started deploy [restbase/deploy@897fc7e]: Deploy latest restbase commit to restbase1024
  • 17:11 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 17:09 urandom: restarting restbase service, restbase1031 — T360597
  • 17:08 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 12m 47s)
  • 17:06 urandom: restarting restbase service, restbase1024 — T360597
  • 17:01 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudbackup2003.codfw.wmnet with OS bookworm
  • 16:56 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov2006.codfw.wmnet with reason: host reimage
  • 16:55 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 13m 42s)
  • 16:53 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov2006.codfw.wmnet with reason: host reimage
  • 16:47 urandom: correction: depooling restbase10[31-33] — T360597
  • 16:47 urandom: pooling restbase10[31-33] — T360597
  • 16:39 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2006.codfw.wmnet with OS bullseye
  • 16:36 urandom: pooling restbase10[19-21] — T360597
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 100%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58914 and previous config saved to /var/cache/conftool/dbconfig/20240325-162627-arnaudb.json
  • 16:13 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 12m 15s)
  • 16:12 brouberol: restarting pybal on lvs1019.eqiad.wmnet - T358793
  • 16:11 urandom: depooling restbase10[34-42] — T360597
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 75%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58913 and previous config saved to /var/cache/conftool/dbconfig/20240325-161121-arnaudb.json
  • 16:09 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host dbprov2006.codfw.wmnet with OS bullseye
  • 16:05 brouberol: restarting pybal on lvs1020.eqiad.wmnet - T358793
  • 16:01 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 13m 00s)
  • 15:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 50%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58912 and previous config saved to /var/cache/conftool/dbconfig/20240325-155613-arnaudb.json
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 25%: Post clone (src)', diff saved to https://phabricator.wikimedia.org/P58911 and previous config saved to /var/cache/conftool/dbconfig/20240325-154107-arnaudb.json
  • 15:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudbackup2003.codfw.wmnet with OS bookworm
  • 15:31 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2116.codfw.wmnet onto db2216.codfw.wmnet
  • 15:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2116 in db2216 for T355422', diff saved to https://phabricator.wikimedia.org/P58910 and previous config saved to /var/cache/conftool/dbconfig/20240325-152958-arnaudb.json
  • 15:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 15:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 15:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: provisioning db2216.codfw.wmnet - T355422
  • 15:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: provisioning db2216.codfw.wmnet - T355422
  • 15:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: provisioning db2216.codfw.wmnet - T355422
  • 15:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: provisioning db2216.codfw.wmnet - T355422
  • 15:27 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 15:27 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 15:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 15:26 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 15:26 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db2214.codfw.wmnet
  • 15:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 15:26 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 15:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 15:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 15:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 15:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 15:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 15:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 15:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 15:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 15:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 15:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 15:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 15:24 sukhe: depool elastic2037: host is in insetup and in the process of being decommissioned
  • 15:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 15:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 15:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 15:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 15:24 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2038.codfw.wmnet
  • 15:23 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 15:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 15:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 15:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 15:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 15:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 15:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 15:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 15:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 15:21 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 15:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 15:21 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 15:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 15:21 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 15:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 15:21 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 15:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 15:20 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 15:20 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 15:20 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 15:20 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 15:14 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 13 hosts with reason: Maint T343718
  • 15:14 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 13 hosts with reason: Maint T343718
  • 15:13 brouberol: restarting pybal on lvs2013.codfw.wmnet - T358793
  • 15:02 eoghan@cumin1002: START - Cookbook sre.gitlab.failover Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 15:00 brouberol: restarting pybal on lvs2014.codfw.wmnet - T358793
  • 15:00 eoghan@cumin1002: END (FAIL) - Cookbook sre.gitlab.failover (exit_code=93) Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 15:00 eoghan@cumin1002: START - Cookbook sre.gitlab.failover Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 14:58 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2006.codfw.wmnet with OS bullseye
  • 14:57 elukey: increase tmpfs for /var/lib/nginx on registry100[3,4] and restart nginx - T360637
  • 14:52 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on registry1004.eqiad.wmnet with reason: Increase tmpfs for nginx
  • 14:52 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on registry1004.eqiad.wmnet with reason: Increase tmpfs for nginx
  • 14:51 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on registry1003.eqiad.wmnet with reason: Increase tmpfs for nginx
  • 14:51 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on registry1003.eqiad.wmnet with reason: Increase tmpfs for nginx
  • 14:41 eoghan@cumin1002: END (FAIL) - Cookbook sre.gitlab.failover (exit_code=93) Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 14:40 eoghan@cumin1002: START - Cookbook sre.gitlab.failover Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 14:25 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db2214.codfw.wmnet
  • 14:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2114 in db2214 for T355422', diff saved to https://phabricator.wikimedia.org/P58909 and previous config saved to /var/cache/conftool/dbconfig/20240325-142344-arnaudb.json
  • 14:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2214.codfw.wmnet with reason: provisioning db2214.codfw.wmnet - T355422
  • 14:22 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudbackup2003.codfw.wmnet with OS bookworm
  • 14:22 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2214.codfw.wmnet with reason: provisioning db2214.codfw.wmnet - T355422
  • 14:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: provisioning db2214.codfw.wmnet - T355422
  • 14:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: provisioning db2214.codfw.wmnet - T355422
  • 14:20 ayounsi@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2006.codfw.wmnet
  • 14:20 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2006.codfw.wmnet with OS bookworm
  • 14:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudbackup2003.codfw.wmnet with OS bookworm
  • 14:10 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudbackup2003.codfw.wmnet with OS bookworm
  • 13:58 hashar: UTC afternoon backport window completed
  • 13:56 hashar@deploy1002: Finished scap: (no justification provided) (duration: 12m 46s)
  • 13:55 godog: finish rolling out rsyslog-exporter to remaining hosts in codfw and eqiad - T357616
  • 13:53 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 16 hosts with reason: Maint T352010
  • 13:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 16 hosts with reason: Maint T352010
  • 13:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Maint T352010
  • 13:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 16 hosts with reason: Maint T352010
  • 13:45 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 13:44 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 13:43 hashar@deploy1002: Started scap: (no justification provided)
  • 13:39 hashar@deploy1002: Finished scap: Backport for Use more compact PHP7 syntax where possible (duration: 16m 15s)
  • 13:33 jgiannelos@deploy1002: Finished deploy [restbase/deploy@897fc7e]: (no justification provided) (duration: 01m 16s)
  • 13:32 jgiannelos@deploy1002: Started deploy [restbase/deploy@897fc7e]: (no justification provided)
  • 13:28 hashar@deploy1002: thiemowmde and hashar: Continuing with sync
  • 13:28 godog: bounce prometheus@k8s on prometheus2006 to diagnose OOM - T354399
  • 13:25 hashar@deploy1002: thiemowmde and hashar: Backport for Use more compact PHP7 syntax where possible synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:23 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 13:23 hashar@deploy1002: Started scap: Backport for Use more compact PHP7 syntax where possible
  • 13:18 hashar@deploy1002: Finished scap: Backport for throttle: Add throttle rule for editathon at Illinois Tech (T358494) (duration: 15m 28s)
  • 13:17 godog: bounce prometheus@k8s on prometheus2005 to diagnose OOM - T354399
  • 13:11 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbprov2005.codfw.wmnet with OS bullseye
  • 13:08 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 13:07 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 13:07 hashar@deploy1002: hashar and ammarpad: Continuing with sync
  • 13:06 hashar@deploy1002: hashar and ammarpad: Backport for throttle: Add throttle rule for editathon at Illinois Tech (T358494) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:04 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
  • 13:03 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/proton: apply
  • 13:02 hashar@deploy1002: Started scap: Backport for throttle: Add throttle rule for editathon at Illinois Tech (T358494)
  • 13:02 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
  • 13:00 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/proton: apply
  • 13:00 claime: doubling replicas for proton in eqiad again
  • 12:59 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
  • 12:58 klausman@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:57 klausman@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 12:51 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
  • 12:50 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/proton: apply
  • 12:50 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
  • 12:50 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/proton: apply
  • 12:49 claime: doubling replicas for proton in eqiad
  • 12:34 cgoubert@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2336.codfw.wmnet|mw2337.codfw.wmnet|mw2386.codfw.wmnet|mw2387.codfw.wmnet|mw2388.codfw.wmnet|mw2389.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 12:34 claime: Pooling and uncordoning mw2336.codfw.wmnet,mw2337.codfw.wmnet,mw2386.codfw.wmnet,mw2387.codfw.wmnet,mw2388.codfw.wmnet,mw2389.codfw.wmnet - T351074
  • 12:32 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset: apply
  • 12:32 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply
  • 12:26 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
  • 12:25 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
  • 12:25 claime: Running homer 'cr*codfw*' commit 'T351074'
  • 12:25 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
  • 12:24 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
  • 12:24 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
  • 12:24 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:23 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 12:23 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
  • 12:22 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2006.codfw.wmnet
  • 12:22 ayounsi@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 12:22 slyngs: Switch IDP/SSO-servers to Bookworm
  • 12:19 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 12:19 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
  • 12:18 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:17 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 12:15 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2006.codfw.wmnet
  • 12:15 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:14 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 12:10 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
  • 12:10 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host testvm2006.codfw.wmnet with OS bookworm
  • 12:01 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
  • 12:00 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2006.codfw.wmnet
  • 12:00 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host testvm2006.codfw.wmnet with OS bookworm
  • 11:53 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 11:52 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 11:52 claime: Migrating eqiad changeprop to mw-api-int - T360767
  • 11:24 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2336.codfw.wmnet with OS bullseye
  • 11:23 klausman@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:23 klausman: bumping concurrency of ORESFetchScoreJob up to help with removing backlog
  • 11:22 klausman@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:21 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2388.codfw.wmnet with OS bullseye
  • 11:20 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2386.codfw.wmnet with OS bullseye
  • 11:19 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4045.ulsfo.wmnet
  • 11:19 fabfur: *repooling* cp4045 with Benthos (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1013526) (T358109)
  • 11:12 cgoubert@deploy1002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 11:11 cgoubert@deploy1002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 11:11 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2337.codfw.wmnet with reason: host reimage
  • 11:11 claime: Migrating changeprop staging to mw-api-int - T360767
  • 11:10 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4045.ulsfo.wmnet
  • 11:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2389.codfw.wmnet with reason: host reimage
  • 11:08 fabfur: depooling cp4045 to install && test benthos (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1013526) (T358109)
  • 11:07 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2387.codfw.wmnet with reason: host reimage
  • 11:04 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2336.codfw.wmnet with reason: host reimage
  • 11:04 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 11:04 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 11:04 ladsgroup@deploy1002: Finished scap: Backport for Set four more wikis to read new in pagelinks migration (T351237) (duration: 13m 13s)
  • 11:03 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2388.codfw.wmnet with reason: host reimage
  • 11:01 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:01 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:01 brouberol@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:01 brouberol@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:00 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2386.codfw.wmnet with reason: host reimage
  • 10:59 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2389.codfw.wmnet with reason: host reimage
  • 10:58 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2388.codfw.wmnet with reason: host reimage
  • 10:58 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2387.codfw.wmnet with reason: host reimage
  • 10:57 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2337.codfw.wmnet with reason: host reimage
  • 10:57 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2336.codfw.wmnet with reason: host reimage
  • 10:57 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2386.codfw.wmnet with reason: host reimage
  • 10:52 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 10:52 ladsgroup@deploy1002: ladsgroup: Backport for Set four more wikis to read new in pagelinks migration (T351237) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:50 ladsgroup@deploy1002: Started scap: Backport for Set four more wikis to read new in pagelinks migration (T351237)
  • 10:45 Dreamy_Jazz: Restarting MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 10:44 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host mw2389.codfw.wmnet with OS bullseye
  • 10:43 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host mw2388.codfw.wmnet with OS bullseye
  • 10:43 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host mw2387.codfw.wmnet with OS bullseye
  • 10:42 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host mw2386.codfw.wmnet with OS bullseye
  • 10:42 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host mw2337.codfw.wmnet with OS bullseye
  • 10:41 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host mw2336.codfw.wmnet with OS bullseye
  • 10:25 claime: Depooling mw2336.codfw.wmnet,mw2337.codfw.wmnet,mw2386.codfw.wmnet,mw2387.codfw.wmnet,mw2388.codfw.wmnet,mw2389.codfw.wmnet - T351074
  • 10:02 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 10:02 brouberol@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:01 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 10:01 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 10:01 brouberol@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 10:01 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 10:01 brouberol@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:01 brouberol@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 10:00 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:00 brouberol@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 10:00 brouberol@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 10:00 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:57 brouberol@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:57 brouberol@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:57 brouberol@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:47 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
  • 09:44 brouberol@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:44 brouberol@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:44 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
  • 09:43 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:43 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:43 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:43 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 09:42 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 09:41 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:41 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:34 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 09:32 hashar: Cancelling the Gerrit 3.8 upgrade
  • 09:03 hashar@deploy1002: Finished scap: Backport for Set wgUploadNavigationUrl for is.wikibooks (T360431) (duration: 13m 29s)
  • 08:56 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 08:56 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 08:53 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
  • 08:52 hashar@deploy1002: ammarpad and hashar: Continuing with sync
  • 08:52 hashar@deploy1002: ammarpad and hashar: Backport for Set wgUploadNavigationUrl for is.wikibooks (T360431) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:49 hashar@deploy1002: Started scap: Backport for Set wgUploadNavigationUrl for is.wikibooks (T360431)
  • 08:42 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
  • 08:42 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
  • 08:41 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
  • 08:41 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
  • 08:41 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:41 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
  • 08:40 hashar@deploy1002: Finished scap: Backport for Remove Nearby extension and Minerva donate button for nowikimedia (T360782 T360783) (duration: 18m 38s)
  • 08:40 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
  • 08:38 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 08:38 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
  • 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit2002.wikimedia.org with reason: Gerrit update
  • 08:32 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit2002.wikimedia.org with reason: Gerrit update
  • 08:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1003.wikimedia.org with reason: Gerrit update
  • 08:31 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit1003.wikimedia.org with reason: Gerrit update
  • 08:29 hashar@deploy1002: jhsoby and hashar: Continuing with sync
  • 08:29 hashar@deploy1002: jhsoby and hashar: Backport for Remove Nearby extension and Minerva donate button for nowikimedia (T360782 T360783) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:22 hashar@deploy1002: Started scap: Backport for Remove Nearby extension and Minerva donate button for nowikimedia (T360782 T360783)
  • 07:42 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 07:42 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 03:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:06 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: Switch block schema to read-new/write-both mode T355034 (duration: 12m 53s)
  • 01:45 tstarling@deploy1002: Finished scap: Backport for block: Fix exception in ApiQueryBlocks when specified users are not blocked (T360088) (duration: 28m 51s)
  • 01:33 tstarling@deploy1002: tstarling: Continuing with sync
  • 01:33 tstarling@deploy1002: tstarling: Backport for block: Fix exception in ApiQueryBlocks when specified users are not blocked (T360088) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 01:16 tstarling@deploy1002: Started scap: Backport for block: Fix exception in ApiQueryBlocks when specified users are not blocked (T360088)

2024-03-24

  • 23:59 denisse: restarting apache2 on logstash1023 - T337818
  • 06:16 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:12 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:01 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:28 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:28 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:01 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:43 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:12 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-23

  • 21:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:40 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:16 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:13 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:13 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:11 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:48 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:15 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:15 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:13 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:13 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:22 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:22 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:11 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:07 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:27 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-22

  • 23:58 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudbackup2003.codfw.wmnet with OS bookworm
  • 23:30 mutante: Phabricator - added to group WMF-NDA for private tickets: @roti_WMDE, @Siko_WMDE, @Tobi_WMDE_SW, @thiemowmde, @WMDECyn, @WMDE-Fisch per T358578 and "NDA and MOU" spreadsheet
  • 23:20 mutante: Phabricator - added to group WMF-NDA for private tickets: @Ifrahkhanyaree_WMDE, @jon_amar-WMDE, @lilients_WMDE, @RickiJay-WMDE per T358578 and "NDA and MOU" spreadsheet
  • 23:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudbackup2003.codfw.wmnet with OS bookworm
  • 23:13 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=93) for host cloudbackup2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:10 mutante: Phabricator - added to group WMF-NDA for private tickets: @Dima_Koushha_WMDE, @elal, @danshick_wmde, @gabriel-wmde per T358578 and "NDA and MOU" spreadsheet
  • 23:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:54 mutante: Phabricator - added to group WMF-NDA for private tickets: @adee_wmde, @AbbanWMDE, @Andrew-WMDE per T358578 and "NDA and MOU" spreadsheet
  • 22:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudbackup2003.codfw.wmnet with OS bookworm
  • 18:35 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup2004.codfw.wmnet with OS bookworm
  • 18:35 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 18:30 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 18:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudbackup2004.codfw.wmnet with reason: host reimage
  • 18:13 dancy@deploy1002: Finished scap: Backport for Exclude night-mode lint from signature validation (T360796) (duration: 25m 55s)
  • 18:11 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudbackup2004.codfw.wmnet with reason: host reimage
  • 18:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:11 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:02 dancy@deploy1002: dancy and cscott: Continuing with sync
  • 18:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:56 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:51 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudbackup2004.codfw.wmnet with OS bookworm
  • 17:51 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudbackup2003.codfw.wmnet with OS bookworm
  • 17:50 dancy@deploy1002: dancy and cscott: Backport for Exclude night-mode lint from signature validation (T360796) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:49 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:49 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudbackup2003']
  • 17:48 dancy@deploy1002: Started scap: Backport for Exclude night-mode lint from signature validation (T360796)
  • 17:46 Emperor: depool ms-fe2010
  • 17:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:43 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudbackup2003']
  • 17:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['cloudbackup2003']
  • 17:42 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudbackup2003']
  • 17:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudbackup2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudbackup2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudbackup2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudbackup2004']
  • 17:31 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudbackup2004']
  • 17:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:25 eoghan@cumin1002: END (FAIL) - Cookbook sre.gitlab.failover (exit_code=99) Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 17:17 topranks: changing IPv6 anycast GW IP on codfw row A/B switches
  • 17:15 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:15 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update IPs for lsw irb interfaces codfw row a b private vlans - cmooney@cumin1002"
  • 17:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudbackup2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:14 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update IPs for lsw irb interfaces codfw row a b private vlans - cmooney@cumin1002"
  • 17:12 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 17:12 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts sretest2003.codfw.wmnet
  • 17:12 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:12 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cmooney@cumin1002"
  • 17:10 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cmooney@cumin1002"
  • 17:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:10 cmooney@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 17:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:08 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 17:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:07 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:41 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 16:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudbackup2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:36 cmooney@cumin1002: START - Cookbook sre.hosts.decommission for hosts sretest2003.codfw.wmnet
  • 16:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudbackup2003 to codfw - jhancock@cumin2002"
  • 16:32 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 16:32 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudbackup2003 to codfw - jhancock@cumin2002"
  • 16:29 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 16:01 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:01 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old asw-b-codfw entries - cmooney@cumin1002"
  • 16:00 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old asw-b-codfw entries - cmooney@cumin1002"
  • 15:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:15 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:15 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:05 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 15:04 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Remove asw-b-codfw from synced hiera data - cmooney@cumin1002 - T360776"
  • 15:04 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.23 refs T354441
  • 14:52 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Remove asw-b-codfw from synced hiera data - cmooney@cumin1002 - T360776"
  • 14:40 eoghan@cumin1002: START - Cookbook sre.gitlab.failover Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 14:37 eoghan@cumin1002: END (FAIL) - Cookbook sre.gitlab.failover (exit_code=93) Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 14:37 eoghan@cumin1002: START - Cookbook sre.gitlab.failover Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 14:35 eoghan@cumin1002: END (FAIL) - Cookbook sre.gitlab.failover (exit_code=93) Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 14:35 eoghan@cumin1002: START - Cookbook sre.gitlab.failover Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 14:35 eoghan@cumin1002: END (ERROR) - Cookbook sre.gitlab.failover (exit_code=93) Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 14:20 urandom: restarting Cassandra decommission of restbase1024-{b,c} — T360548
  • 14:11 topranks: disabling LAG from asw-b-codfw to ssw-aX-codfw T360776
  • 14:07 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on asw-b-codfw with reason: prepping to decom switch stack
  • 14:07 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on asw-b-codfw with reason: prepping to decom switch stack
  • 13:31 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:31 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:28 brouberol@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:28 brouberol@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 13:23 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:17 elukey: `elukey@cumin1002:~$ sudo cumin 'stat100[4,5,8,9]*' 'kill `pgrep -u kcv-wikimf`'` to unblock puppet on various stat nodes
  • 13:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:07 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:06 brouberol@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:06 brouberol@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 12:44 eoghan@cumin1002: START - Cookbook sre.gitlab.failover Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 12:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:03 reedy@deploy1002: Synchronized php-1.42.0-wmf.23/includes/htmlform/fields/HTMLHiddenField.php: T360717 (duration: 13m 06s)
  • 11:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:59 btullis@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts an-worker1168.eqiad.wmnet
  • 10:59 btullis@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts an-worker1168.eqiad.wmnet
  • 10:56 klausman@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
  • 10:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:39 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-worker1168.eqiad.wmnet with reason: Investigating disk errors
  • 10:38 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-worker1168.eqiad.wmnet with reason: Investigating disk errors
  • 10:36 btullis@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts an-worker1168.eqiad.wmnet
  • 10:36 btullis@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts an-worker1168.eqiad.wmnet
  • 10:34 btullis@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts an-worker1168.eqiad.wmnet
  • 10:34 btullis@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts an-worker1168.eqiad.wmnet
  • 10:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:23 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main'.
  • 10:17 moritzm: uploaded jenkins 2.440.2 to apt.wikimedia.org T360759
  • 10:16 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:16 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 10:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts apt1001.wikimedia.org
  • 10:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: apt1001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 10:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:02 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: apt1001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 09:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:56 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:53 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 09:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:49 brouberol@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts apt1001.wikimedia.org
  • 09:47 brouberol@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts apt2001.wikimedia.org
  • 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: apt2001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 09:38 jnuche@deploy1002: Installation of scap version "4.73.1" completed for 371 hosts
  • 09:37 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: apt2001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 09:37 jnuche@deploy1002: Installing scap version "4.73.1" for 371 hosts
  • 09:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 09:30 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts apt2001.wikimedia.org
  • 09:20 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 09:19 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 09:06 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 08:58 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 08:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:57 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 08:57 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 08:56 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 08:18 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 08:18 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 08:14 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 08:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:03 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 08:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:54 slyngs: Enable Bookworm IDP/CAS/SSO servers
  • 07:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:33 ryankemper: T358882 Also updated cross-cluster seeds for ports `9243` and `9443`. Everything should be as expected now.
  • 06:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:22 ryankemper: T358882 Updating cross-cluster seeds to bring into concordance with newly added masters: `ryankemper@mwmaint1002:~/elastic$ python push_cross_cluster_conf.py https://search.svc.codfw.wmnet:9643/_cluster/settings --ccc chi=chi_codfw_masters.lst psi=psi_codfw_masters.lst omega=omega_codfw_masters.lst`
  • 06:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:31 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Turn it off, and then back on again (schema agreement/reachability)? — T360548 - eevans@cumin1002
  • 00:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-21

  • 23:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:54 aqu@deploy1002: Finished deploy [airflow-dags/analytics@582ad55]: Add params to canary events pipeline [airflow-dags/analytics@582ad55c] (duration: 00m 25s)
  • 23:54 aqu@deploy1002: Started deploy [airflow-dags/analytics@582ad55]: Add params to canary events pipeline [airflow-dags/analytics@582ad55c]
  • 23:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:50 reedy@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.42.0-wmf.22 refs T354441
  • 23:49 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:49 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:46 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:17 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Turn it off, and then back on again (schema agreement/reachability)? — T360548 - eevans@cumin1002
  • 23:13 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:13 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:56 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase10[29,32,37-39,25-27,30,33,40-42].eqiad.wmnet: Turn it off, and then back on again (schema agreement/reachability)? — T360548 - eevans@cumin1002
  • 22:42 aqu@deploy1002: Finished deploy [airflow-dags/analytics@9607731]: Add canary events generation dag in Airflow [airflow-dags/analytics@9607731b] (duration: 00m 29s)
  • 22:41 aqu@deploy1002: Started deploy [airflow-dags/analytics@9607731]: Add canary events generation dag in Airflow [airflow-dags/analytics@9607731b]
  • 22:41 mutante: etherpad - switching cert provider to cfssl
  • 22:40 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:40 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old private1-a-codfw entries - cmooney@cumin1002"
  • 22:39 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old private1-a-codfw entries - cmooney@cumin1002"
  • 22:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: introduce new masters - bking@cumin2002 - T353878
  • 22:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbprov2005.codfw.wmnet with OS bullseye
  • 22:05 ladsgroup@deploy1002: rebuilt and synchronized wikiversions files: (no justification provided)
  • 21:44 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dns-rec and not P{dns1004*} and A:dnsbox
  • 21:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:39 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 21:29 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 21:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 21:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 21:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T356166)', diff saved to https://phabricator.wikimedia.org/P58894 and previous config saved to /var/cache/conftool/dbconfig/20240321-212811-marostegui.json
  • 21:23 topranks: deleting irb.2017 interface from ssw1-a1-codfw and ssw1-a8-codfw
  • 21:20 cjming: end of UTC late backport window
  • 21:20 cjming@deploy1002: Finished scap: Backport for ext-EventStreamConfig: Remove mediawiki.web_ui_scroll_migrated sampling config (T352342) (duration: 14m 24s)
  • 21:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P58893 and previous config saved to /var/cache/conftool/dbconfig/20240321-211303-marostegui.json
  • 21:08 cjming@deploy1002: cjming and phuedx: Continuing with sync
  • 21:08 cjming@deploy1002: cjming and phuedx: Backport for ext-EventStreamConfig: Remove mediawiki.web_ui_scroll_migrated sampling config (T352342) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:06 cjming@deploy1002: Started scap: Backport for ext-EventStreamConfig: Remove mediawiki.web_ui_scroll_migrated sampling config (T352342)
  • 21:06 topranks: deleting VRRP GW for 10.192.0.1 / private1-a-codfw from codfw core routers and adding to leaf switches row A T351532
  • 21:04 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: introduce new masters - bking@cumin2002 - T353878
  • 21:03 bking@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=elastic2089\.codfw\.wmnet
  • 21:03 cjming@deploy1002: Finished scap: Backport for Support legacy message box styles markup in JavaScript (T360633) (duration: 35m 07s)
  • 21:02 bking@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=elastic209[0-9]\.codfw\.wmnet
  • 21:02 bking@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=elastic20[89]\.codfw\.wmnet
  • 21:01 topranks: adding routes to codfw row a hosts towards spine switch IPs on private1-a-codfw T351532
  • 21:00 bking@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=elastic210[0-9]\.codfw\.wmnet
  • 21:00 bking@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=elastic20[89-99]\.codfw\.wmnet
  • 20:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P58892 and previous config saved to /var/cache/conftool/dbconfig/20240321-205756-marostegui.json
  • 20:50 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase10[29,32,37-39,25-27,30,33,40-42].eqiad.wmnet: Turn it off, and then back on again (schema agreement/reachability)? — T360548 - eevans@cumin1002
  • 20:47 cjming@deploy1002: cjming and jdlrobson: Continuing with sync
  • 20:45 cjming@deploy1002: cjming and jdlrobson: Backport for Support legacy message box styles markup in JavaScript (T360633) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:43 topranks: deleting irb.2001 and irb.2002 interfaces from codfw spine switches
  • 20:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T356166)', diff saved to https://phabricator.wikimedia.org/P58891 and previous config saved to /var/cache/conftool/dbconfig/20240321-204249-marostegui.json
  • 20:37 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: introduce new masters - bking@cumin2002 - T353878
  • 20:35 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:35 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old private1-b-codfw entries - cmooney@cumin1002"
  • 20:35 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: introduce new masters - bking@cumin2002 - T353878
  • 20:34 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old private1-b-codfw entries - cmooney@cumin1002"
  • 20:27 cjming@deploy1002: Started scap: Backport for Support legacy message box styles markup in JavaScript (T360633)
  • 20:27 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 20:16 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:14 topranks: deleting irb.2018 interfaces from codfw spine switches T351534
  • 20:11 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:11 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old private1-b-codfw entries - cmooney@cumin1002"
  • 20:11 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase10[28,32,34-36].eqiad.wmnet: Turn it off, and then back on again (schema agreement/reachability)? — T360548 - eevans@cumin1002
  • 20:10 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old private1-b-codfw entries - cmooney@cumin1002"
  • 19:59 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 19:22 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase10[28,32,34-36].eqiad.wmnet: Turn it off, and then back on again (schema agreement/reachability)? — T360548 - eevans@cumin1002
  • 19:17 topranks: remove VRRP GW IP for vlan 2018 from codfw core routers and add to EVPN switches irb.2018 interface T351534
  • 19:09 topranks: adding routes to codfw row b hosts towards spine switch IPs on private1-b-codfw T351534
  • 19:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1246 (T356166)', diff saved to https://phabricator.wikimedia.org/P58889 and previous config saved to /var/cache/conftool/dbconfig/20240321-190723-marostegui.json
  • 19:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1246.eqiad.wmnet with reason: Maintenance
  • 19:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1246.eqiad.wmnet with reason: Maintenance
  • 19:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 19:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 19:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T356166)', diff saved to https://phabricator.wikimedia.org/P58888 and previous config saved to /var/cache/conftool/dbconfig/20240321-190640-marostegui.json
  • 18:55 topranks: removing IPv6 VRRP config on codfw core routers for vlan 2018 private1-b-codfw T351534
  • 18:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P58887 and previous config saved to /var/cache/conftool/dbconfig/20240321-185132-marostegui.json
  • 18:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P58886 and previous config saved to /var/cache/conftool/dbconfig/20240321-183625-marostegui.json
  • 18:30 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbprov2005.codfw.wmnet with OS bullseye
  • 18:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T356166)', diff saved to https://phabricator.wikimedia.org/P58884 and previous config saved to /var/cache/conftool/dbconfig/20240321-182117-marostegui.json
  • 18:16 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.23 refs T354441
  • 18:13 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 18:00 reedy@deploy1002: Synchronized php-1.42.0-wmf.23/extensions/ConfirmEdit/maintenance/GenerateFancyCaptchas.php: T360653 (duration: 16m 00s)
  • 17:06 urandom: restarting decommissions (restbase1024-{b,c}) — T360548
  • 16:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T356166)', diff saved to https://phabricator.wikimedia.org/P58882 and previous config saved to /var/cache/conftool/dbconfig/20240321-165240-marostegui.json
  • 16:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 16:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 16:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T356166)', diff saved to https://phabricator.wikimedia.org/P58881 and previous config saved to /var/cache/conftool/dbconfig/20240321-165215-marostegui.json
  • 16:46 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=registry1004.eqiad.wmnet
  • 16:46 elukey@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM registry1004.eqiad.wmnet
  • 16:44 elukey: edit /etc/network/interfaces on registry1004 (ens5 => ens13) - T360637
  • 16:39 elukey@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM registry1004.eqiad.wmnet
  • 16:38 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=registry1004.eqiad.wmnet
  • 16:38 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=registry1003.eqiad.wmnet
  • 16:38 elukey@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM registry1003.eqiad.wmnet
  • 16:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P58880 and previous config saved to /var/cache/conftool/dbconfig/20240321-163708-marostegui.json
  • 16:36 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov2005.codfw.wmnet with OS bullseye
  • 16:36 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 16:35 elukey: edit /etc/network/interfaces on registry1003 (ens5 => ens13) - T360637
  • 16:34 sukhe@cumin1002: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dns-rec and not P{dns1004*} and A:dnsbox
  • 16:33 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbprov2005.codfw.wmnet with OS bullseye
  • 16:27 elukey@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM registry1003.eqiad.wmnet
  • 16:25 elukey: expand vram for registry100[3,4] from 4G to 6G - T360637
  • 16:25 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=registry1003.eqiad.wmnet
  • 16:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P58879 and previous config saved to /var/cache/conftool/dbconfig/20240321-162200-marostegui.json
  • 16:17 sukhe@cumin1002: END (ERROR) - Cookbook sre.dns.roll-reboot (exit_code=97) rolling reboot on A:dnsbox
  • 16:17 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 16:14 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbprov2005.codfw.wmnet with OS bullseye
  • 16:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:07 urandom: disabling read-repair (Cassandra) for restbase tables — T360548
  • 16:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T356166)', diff saved to https://phabricator.wikimedia.org/P58878 and previous config saved to /var/cache/conftool/dbconfig/20240321-160653-marostegui.json
  • 16:04 sukhe@cumin1002: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox
  • 15:50 claime: cgoubert@deploy1002:~$ sudo chown imagecatalog:imagecatalog /srv/deployment/imagecatalog/catalog.sqlite
  • 15:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudbackup2004']
  • 15:33 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 15:32 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dbprov2005.codfw.wmnet with OS bullseye
  • 15:27 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudbackup2004']
  • 15:27 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['cloudbackup2004']
  • 15:26 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudbackup2004']
  • 15:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudbackup2004.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T356166)', diff saved to https://phabricator.wikimedia.org/P58875 and previous config saved to /var/cache/conftool/dbconfig/20240321-152134-marostegui.json
  • 15:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 15:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 15:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 15:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 15:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T356166)', diff saved to https://phabricator.wikimedia.org/P58874 and previous config saved to /var/cache/conftool/dbconfig/20240321-152051-marostegui.json
  • 15:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudbackup2004.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P58873 and previous config saved to /var/cache/conftool/dbconfig/20240321-150544-marostegui.json
  • 15:04 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudbackup2004.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P58872 and previous config saved to /var/cache/conftool/dbconfig/20240321-145036-marostegui.json
  • 14:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 17 hosts with reason: Schema change T356166
  • 14:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 17 hosts with reason: Schema change T356166
  • 14:36 moritzm: installing glibc security updates on bullseye
  • 14:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T356166)', diff saved to https://phabricator.wikimedia.org/P58871 and previous config saved to /var/cache/conftool/dbconfig/20240321-143528-marostegui.json
  • 14:21 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudbackup2004.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:19 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:19 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudbackup2004 to codfw - jhancock@cumin2002"
  • 14:18 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudbackup2004 to codfw - jhancock@cumin2002"
  • 14:16 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:05 klausman@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 14:02 moritzm: installing squid security updates
  • 13:51 damilare: SmashPig upgraded from 47cd65d9 to 5d275ad6
  • 13:50 sukhe: upgrading pdns-rec to 4.8.7-1 on dns* and doh* hosts
  • 13:16 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 13:14 jiji@deploy1002: Finished scap: Check new deployment server (deploy1002) post switchover - March 2024 (duration: 35m 20s)
  • 13:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1222 (T356166)', diff saved to https://phabricator.wikimedia.org/P58869 and previous config saved to /var/cache/conftool/dbconfig/20240321-130213-marostegui.json
  • 13:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 13:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 13:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T356166)', diff saved to https://phabricator.wikimedia.org/P58868 and previous config saved to /var/cache/conftool/dbconfig/20240321-130151-marostegui.json
  • 12:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P58867 and previous config saved to /var/cache/conftool/dbconfig/20240321-124644-marostegui.json
  • 12:39 jiji@deploy1002: Started scap: Check new deployment server (deploy1002) post switchover - March 2024
  • 12:37 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov2005.codfw.wmnet with reason: host reimage
  • 12:34 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov2005.codfw.wmnet with reason: host reimage
  • 12:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P58866 and previous config saved to /var/cache/conftool/dbconfig/20240321-123135-marostegui.json
  • 12:19 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 12:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T356166)', diff saved to https://phabricator.wikimedia.org/P58865 and previous config saved to /var/cache/conftool/dbconfig/20240321-121628-marostegui.json
  • 12:15 akosiaris@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
  • 12:00 effie: disable puppet on deployment servers
  • 11:46 akosiaris@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
  • 11:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P58860 and previous config saved to /var/cache/conftool/dbconfig/20240321-112108-marostegui.json
  • 11:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P58857 and previous config saved to /var/cache/conftool/dbconfig/20240321-110600-marostegui.json
  • 10:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1153.eqiad.wmnet with OS bookworm
  • 10:55 akosiaris@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
  • 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T356166)', diff saved to https://phabricator.wikimedia.org/P58856 and previous config saved to /var/cache/conftool/dbconfig/20240321-105052-marostegui.json
  • 10:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1153.eqiad.wmnet with reason: host reimage
  • 10:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1153.eqiad.wmnet with reason: host reimage
  • 10:34 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host idp1003.wikimedia.org
  • 10:34 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host idp1003.wikimedia.org with OS bookworm
  • 10:32 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 10:29 akosiaris@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
  • 10:28 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1153.eqiad.wmnet with OS bookworm
  • 10:20 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on idp1003.wikimedia.org with reason: host reimage
  • 10:17 slyngshede@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on idp1003.wikimedia.org with reason: host reimage
  • 10:13 akosiaris@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T356166)', diff saved to https://phabricator.wikimedia.org/P58855 and previous config saved to /var/cache/conftool/dbconfig/20240321-101119-marostegui.json
  • 10:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 10:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T356166)', diff saved to https://phabricator.wikimedia.org/P58854 and previous config saved to /var/cache/conftool/dbconfig/20240321-101056-marostegui.json
  • 10:05 slyngshede@cumin1002: START - Cookbook sre.hosts.reimage for host idp1003.wikimedia.org with OS bookworm
  • 10:05 slyngshede@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM idp1003.wikimedia.org - slyngshede@cumin1002"
  • 10:04 slyngshede@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM idp1003.wikimedia.org - slyngshede@cumin1002"
  • 10:03 slyngshede@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) idp1003.wikimedia.org on all recursors
  • 10:03 slyngshede@cumin1002: START - Cookbook sre.dns.wipe-cache idp1003.wikimedia.org on all recursors
  • 10:03 slyngshede@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:03 slyngshede@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM idp1003.wikimedia.org - slyngshede@cumin1002"
  • 10:02 slyngshede@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM idp1003.wikimedia.org - slyngshede@cumin1002"
  • 10:01 Emperor: update ceph-reef packages to 18.2.2 on apt.wm.org
  • 10:00 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 10:00 fabfur: repooling cp4037 for ~30m (this is the last time I'll note it here; no need for this in the future) (T358109)
  • 09:59 slyngshede@cumin1002: START - Cookbook sre.dns.netbox
  • 09:59 slyngshede@cumin1002: START - Cookbook sre.ganeti.makevm for new host idp1003.wikimedia.org
  • 09:59 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host idp2003.wikimedia.org
  • 09:59 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host idp2003.wikimedia.org with OS bookworm
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P58853 and previous config saved to /var/cache/conftool/dbconfig/20240321-095548-marostegui.json
  • 09:46 akosiaris@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw
  • 09:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P58852 and previous config saved to /var/cache/conftool/dbconfig/20240321-094041-marostegui.json
  • 09:40 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on idp2003.wikimedia.org with reason: host reimage
  • 09:38 slyngshede@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on idp2003.wikimedia.org with reason: host reimage
  • 09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T356166)', diff saved to https://phabricator.wikimedia.org/P58851 and previous config saved to /var/cache/conftool/dbconfig/20240321-092533-marostegui.json
  • 09:22 slyngshede@cumin1002: START - Cookbook sre.hosts.reimage for host idp2003.wikimedia.org with OS bookworm
  • 09:18 slyngshede@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM idp2003.wikimedia.org - slyngshede@cumin1002"
  • 09:17 slyngshede@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM idp2003.wikimedia.org - slyngshede@cumin1002"
  • 09:17 slyngshede@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) idp2003.wikimedia.org on all recursors
  • 09:17 slyngshede@cumin1002: START - Cookbook sre.dns.wipe-cache idp2003.wikimedia.org on all recursors
  • 09:17 slyngshede@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:17 slyngshede@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM idp2003.wikimedia.org - slyngshede@cumin1002"
  • 09:16 slyngshede@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM idp2003.wikimedia.org - slyngshede@cumin1002"
  • 09:12 slyngshede@cumin1002: START - Cookbook sre.dns.netbox
  • 09:12 slyngshede@cumin1002: START - Cookbook sre.ganeti.makevm for new host idp2003.wikimedia.org
  • 09:10 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 08:40 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 08:40 fabfur: repooling cp4037 for ~30m (T358109)
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T356166)', diff saved to https://phabricator.wikimedia.org/P58850 and previous config saved to /var/cache/conftool/dbconfig/20240321-074032-marostegui.json
  • 07:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 07:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T356166)', diff saved to https://phabricator.wikimedia.org/P58849 and previous config saved to /var/cache/conftool/dbconfig/20240321-074009-marostegui.json
  • 07:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:28 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:28 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P58848 and previous config saved to /var/cache/conftool/dbconfig/20240321-072501-marostegui.json
  • 07:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:23 oblivian@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
  • 07:22 oblivian@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
  • 07:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P58847 and previous config saved to /var/cache/conftool/dbconfig/20240321-070954-marostegui.json
  • 07:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T356166)', diff saved to https://phabricator.wikimedia.org/P58846 and previous config saved to /var/cache/conftool/dbconfig/20240321-065446-marostegui.json
  • 06:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T356166)', diff saved to https://phabricator.wikimedia.org/P58845 and previous config saved to /var/cache/conftool/dbconfig/20240321-065232-marostegui.json
  • 06:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 06:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 06:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 06:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 17 hosts with reason: Schema change T356166
  • 06:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 17 hosts with reason: Schema change T356166
  • 06:29 marostegui: dbmaint deploy schema change s1 codfw T355609
  • 06:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on 12 hosts with reason: Schema change T356166
  • 06:28 marostegui: dbmaint deploy schema change s3 codfw T356166
  • 06:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 10:00:00 on 12 hosts with reason: Schema change T356166
  • 06:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on 17 hosts with reason: Schema change T356166
  • 06:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 10:00:00 on 17 hosts with reason: Schema change T356166
  • 06:25 marostegui: dbmaint deploy schema change s1 codfw T356166
  • 06:25 marostegui: dbmaint deploy schema change s2 codfw T356166
  • 06:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on es[2023-2025].codfw.wmnet with reason: Migrate to 10.6
  • 06:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on es[2023-2025].codfw.wmnet with reason: Migrate to 10.6
  • 05:36 kart_: Updated cxserver to 2024-03-20-072017-production (T352739)
  • 05:33 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:33 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:32 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:32 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:31 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:31 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 05:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:27 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:16 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:12 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:46 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-20

  • 23:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:08 dancy@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.23 refs T354441 (duration: 13m 06s)
  • 21:55 dancy@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.23 refs T354441
  • 21:46 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: sync
  • 21:45 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: sync
  • 21:39 catrope@deploy2002: Finished scap: Backport for Make night theme available on shwiki, exclude additional actions (T359183 T359152) (duration: 17m 10s)
  • 21:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:28 catrope@deploy2002: catrope and jdlrobson: Continuing with sync
  • 21:25 catrope@deploy2002: catrope and jdlrobson: Backport for Make night theme available on shwiki, exclude additional actions (T359183 T359152) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:23 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:22 catrope@deploy2002: Started scap: Backport for Make night theme available on shwiki, exclude additional actions (T359183 T359152)
  • 21:20 catrope@deploy2002: Finished scap: Backport for htmlform: Fix double escaping in Label div (T360381) (duration: 15m 54s)
  • 21:15 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:08 catrope@deploy2002: catrope: Continuing with sync
  • 21:07 catrope@deploy2002: catrope: Backport for htmlform: Fix double escaping in Label div (T360381) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:04 catrope@deploy2002: Started scap: Backport for htmlform: Fix double escaping in Label div (T360381)
  • 21:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 catrope@deploy2002: Finished scap: Backport for Add inline background color (T359205 T360565) (duration: 42m 57s)
  • 20:47 catrope@deploy2002: catrope and zabe: Continuing with sync
  • 20:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:45 catrope@deploy2002: catrope and zabe: Backport for Add inline background color (T359205 T360565) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:23 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:22 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:20 catrope@deploy2002: Started scap: Backport for Add inline background color (T359205 T360565)
  • 20:15 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:15 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:13 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:12 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:49 bearloga@deploy2002: Finished deploy [airflow-dags/analytics_product@49dac10]: (no justification provided) (duration: 00m 05s)
  • 19:49 bearloga@deploy2002: Started deploy [airflow-dags/analytics_product@49dac10]: (no justification provided)
  • 19:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:45 bearloga@deploy2002: Finished deploy [airflow-dags/analytics_product@49dac10]: (no justification provided) (duration: 00m 08s)
  • 19:45 bearloga@deploy2002: Started deploy [airflow-dags/analytics_product@49dac10]: (no justification provided)
  • 19:26 denisse: Starting the Prometheus service - T354399
  • 19:26 denisse: Moving the WAL directory to start with a fresh WAL - T354399
  • 19:22 denisse: stopping the Prometheus service on all Prometheus instances to remediate Thanos Sidecar issues - T354399
  • 19:21 denisse: stopping Prometheus on all instances to remediate Thanos Sidecar issues.
  • 19:16 marostegui@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P58844 and previous config saved to /var/cache/conftool/dbconfig/20240320-191649-root.json
  • 19:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P58843 and previous config saved to /var/cache/conftool/dbconfig/20240320-191412-root.json
  • 19:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P58842 and previous config saved to /var/cache/conftool/dbconfig/20240320-190143-root.json
  • 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P58841 and previous config saved to /var/cache/conftool/dbconfig/20240320-185906-root.json
  • 18:56 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.23 refs T354441
  • 18:54 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2108.codfw.wmnet with OS bullseye
  • 18:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P58840 and previous config saved to /var/cache/conftool/dbconfig/20240320-184637-root.json
  • 18:44 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P58839 and previous config saved to /var/cache/conftool/dbconfig/20240320-184400-root.json
  • 18:37 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2108.codfw.wmnet with reason: host reimage
  • 18:35 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2108.codfw.wmnet with reason: host reimage
  • 18:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P58838 and previous config saved to /var/cache/conftool/dbconfig/20240320-183131-root.json
  • 18:30 urbanecm@deploy2002: Finished scap: Backport for Revert "NewcomerTaskStore: update the task queue before finishing loading" (T360469 T359992) (duration: 16m 02s)
  • 18:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P58837 and previous config saved to /var/cache/conftool/dbconfig/20240320-182855-root.json
  • 18:19 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2108.codfw.wmnet with OS bullseye
  • 18:18 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 18:17 urbanecm@deploy2002: urbanecm: Backport for Revert "NewcomerTaskStore: update the task queue before finishing loading" (T360469 T359992) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P58836 and previous config saved to /var/cache/conftool/dbconfig/20240320-181626-root.json
  • 18:14 urbanecm@deploy2002: Started scap: Backport for Revert "NewcomerTaskStore: update the task queue before finishing loading" (T360469 T359992)
  • 18:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P58835 and previous config saved to /var/cache/conftool/dbconfig/20240320-181350-root.json
  • 18:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2096.codfw.wmnet
  • 18:06 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:06 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2096.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 18:05 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2096.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 18:03 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P58834 and previous config saved to /var/cache/conftool/dbconfig/20240320-180120-root.json
  • 17:59 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2096.codfw.wmnet
  • 17:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P58833 and previous config saved to /var/cache/conftool/dbconfig/20240320-175844-root.json
  • 17:57 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2096 from dbctl', diff saved to https://phabricator.wikimedia.org/P58832 and previous config saved to /var/cache/conftool/dbconfig/20240320-175702-marostegui.json
  • 17:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P58831 and previous config saved to /var/cache/conftool/dbconfig/20240320-174614-root.json
  • 17:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P58830 and previous config saved to /var/cache/conftool/dbconfig/20240320-174339-root.json
  • 17:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on 16 hosts with reason: Schema change T356166
  • 17:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 10:00:00 on 16 hosts with reason: Schema change T356166
  • 17:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2107.codfw.wmnet with OS bullseye
  • 17:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on 13 hosts with reason: Schema change T355609
  • 17:22 marostegui: dbmaint deploy schema change s5 codfw T356166
  • 17:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 10:00:00 on 13 hosts with reason: Schema change T355609
  • 17:21 marostegui: dbmaint deploy schema change s8 codfw T355609
  • 17:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on 16 hosts with reason: Schema change T355609
  • 17:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 10:00:00 on 16 hosts with reason: Schema change T355609
  • 17:16 dancy@deploy2002: Finished scap: Backport for mime: Register `.owl` as application/rdf+xml (T171807 T359643) (duration: 15m 24s)
  • 17:15 marostegui: dbmaint deploy schema change s2 codfw T356166
  • 17:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repool es2024 T358746', diff saved to https://phabricator.wikimedia.org/P58829 and previous config saved to /var/cache/conftool/dbconfig/20240320-171356-root.json
  • 17:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 17:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 17:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2107.codfw.wmnet with reason: host reimage
  • 17:04 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2107.codfw.wmnet with reason: host reimage
  • 17:04 dancy@deploy2002: dancy: Continuing with sync
  • 17:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2024 T358746', diff saved to https://phabricator.wikimedia.org/P58828 and previous config saved to /var/cache/conftool/dbconfig/20240320-170413-root.json
  • 17:03 dancy@deploy2002: dancy: Backport for mime: Register `.owl` as application/rdf+xml (T171807 T359643) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repool es2025 T358746', diff saved to https://phabricator.wikimedia.org/P58827 and previous config saved to /var/cache/conftool/dbconfig/20240320-170332-root.json
  • 17:01 dancy@deploy2002: Started scap: Backport for mime: Register `.owl` as application/rdf+xml (T171807 T359643)
  • 16:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2025 T358746', diff saved to https://phabricator.wikimedia.org/P58826 and previous config saved to /var/cache/conftool/dbconfig/20240320-165710-root.json
  • 16:48 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2107.codfw.wmnet with OS bullseye
  • 16:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 16:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 16:31 sukhe: rolling restart of confd on A:dnsbox to resolve state files issue
  • 16:22 jiji@deploy2002: Finished scap: Backport for debug.json: List primary DC servers first (switchover #5) (T357547) (duration: 24m 27s)
  • 16:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 27 hosts with reason: Schema change
  • 16:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 27 hosts with reason: Schema change
  • 16:17 marostegui: dbmaint deploy schema change s6 codfw T356166
  • 16:17 marostegui: dbmaint deploy schema change s8 codfw T356166
  • 16:10 jiji@deploy2002: jiji: Continuing with sync
  • 16:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on 37 hosts with reason: Remove circular replication in s1 T358200
  • 16:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:10:00 on 37 hosts with reason: Remove circular replication in s1 T358200
  • 16:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on 36 hosts with reason: Remove circular replication in s4 T358200
  • 16:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:10:00 on 36 hosts with reason: Remove circular replication in s4 T358200
  • 16:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on 34 hosts with reason: Remove circular replication in s8 T358200
  • 16:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:10:00 on 34 hosts with reason: Remove circular replication in s8 T358200
  • 16:03 jiji@deploy2002: jiji: Backport for debug.json: List primary DC servers first (switchover #5) (T357547) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on 27 hosts with reason: Remove circular replication in s5 T358200
  • 16:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:10:00 on 27 hosts with reason: Remove circular replication in s5 T358200
  • 15:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 31 hosts with reason: Remove circular replication in s7 T358200
  • 15:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 31 hosts with reason: Remove circular replication in s7 T358200
  • 15:58 jiji@deploy2002: Started scap: Backport for debug.json: List primary DC servers first (switchover #5) (T357547)
  • 15:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 26 hosts with reason: Remove circular replication in s3 T358200
  • 15:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 26 hosts with reason: Remove circular replication in s3 T358200
  • 15:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 29 hosts with reason: Remove circular replication in s2 T358200
  • 15:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 29 hosts with reason: Remove circular replication in s2 T358200
  • 15:53 moritzm: installing usb.ids bugfix updates from Bookworm point release
  • 15:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 27 hosts with reason: Remove circular replication in s6 T358200
  • 15:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 27 hosts with reason: Remove circular replication in s6 T358200
  • 15:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: Remove circular replication in es4 T358200
  • 15:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 6 hosts with reason: Remove circular replication in es4 T358200
  • 15:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: Remove circular replication in es5 T358200
  • 15:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 6 hosts with reason: Remove circular replication in es5 T358200
  • 15:48 moritzm: installing usbutils bugfix updates from Bookworm point release
  • 15:45 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 15:30 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 15:30 fabfur: repooling cp4037 for a little longer than last time (T358109)
  • 15:28 moritzm: installing squid security updates
  • 15:18 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 15:18 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 15:18 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 15:17 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 15:17 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 15:17 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 16 hosts with reason: Remove circular replication in x1 T358200
  • 15:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 16 hosts with reason: Remove circular replication in x1 T358200
  • 14:59 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki viwiki --current --all --touched-after=20230613000000 --start '["17099868"]' 2>&1 | tee ~/T315510-viwiki-4; date # in tmux; note the changed mwmaint host :)
  • 14:50 jiji@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter Switchover - T357547 (duration: 61m 28s)
  • 14:45 Dreamy_Jazz: Starting MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 14:36 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0)
  • 14:29 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters
  • 14:27 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0)
  • 14:27 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl
  • 14:26 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0)
  • 14:24 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance
  • 14:23 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0)
  • 14:23 root@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 14:21 root@deploy2002: helmfile [codfw] [canary] DONE helmfile.d/services/mw-jobrunner : sync
  • 14:21 root@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
  • 14:21 root@deploy2002: helmfile [codfw] [canary] START helmfile.d/services/mw-jobrunner : sync
  • 14:21 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner
  • 14:18 marostegui: Test write T357547
  • 14:18 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0)
  • 14:18 jiji@cumin1002: MediaWiki read-only period ends at: 2024-03-20 14:18:32.727570
  • 14:15 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.02-set-readonly
  • 14:13 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0)
  • 14:13 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
  • 14:03 jiji@cumin1002: END (FAIL) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=99)
  • 14:03 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
  • 14:02 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0)
  • 13:57 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
  • 13:56 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0)
  • 13:56 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks
  • 13:55 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0)
  • 13:55 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.00-disable-puppet
  • 13:48 jiji@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter Switchover - T357547
  • 13:43 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:42 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:42 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:42 akosiaris: update changeprop-jobqueue to include rdb101{1,2} IPv6 related netpols
  • 13:41 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:32 Lucas_WMDE: <moritzm> 13:16 UTC: installing libuv1 security updates on bullseye [re-log, original message wasn’t logged]
  • 13:20 jayme: manually scaled up changeprop replicas in eqiad from 12 to 15
  • 13:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:08 moritzm: installing imagemagick security updates
  • 13:02 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 12:59 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host idp-test2002.wikimedia.org with OS bookworm
  • 12:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset: apply
  • 12:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply
  • 12:57 moritzm: installing tiff security updates
  • 12:52 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 12:50 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 12:48 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 12:48 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 12:43 claime: Depooled swift-rw from codfw
  • 11:24 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 11:23 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 11:22 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 11:22 claime: deploying new namespace limits for changeprop
  • 11:22 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 11:10 godog: bounce apache2 on logstash1031 - T337818
  • 11:10 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 11:10 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 10:54 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 10:50 brouberol: superset.wikimedia.org is now migrated to the DSE k8s cluster, CAS errors have receded
  • 10:31 claime: rolling back changeprop to previous version
  • 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for idm-test1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002
  • 10:25 jmm@cumin2002: START - Cookbook sre.puppet.renew-cert for idm-test1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002
  • 10:22 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for idm-test1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002
  • 10:22 jmm@cumin2002: START - Cookbook sre.puppet.renew-cert for idm-test1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002
  • 10:22 brouberol: migrating superset to Kubernetes. Some CAS errors are expected during ~15 minutes
  • 10:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2003.codfw.wmnet with OS bookworm
  • 10:17 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
  • 10:16 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: sync
  • 10:16 claime: roll-restarting changeprop in eqiad
  • 10:11 cgoubert@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1368.eqiad.wmnet|mw1369.eqiad.wmnet|mw1370.eqiad.wmnet|mw1478.eqiad.wmnet|mw1479.eqiad.wmnet),cluster=kubernetes,service=kubesvc
  • 10:08 taavi: revoke labweb.discovery.wmnet cergen cert, migrated to cfssl
  • 10:07 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host idp-test1003.wikimedia.org with OS bullseye
  • 10:03 Lucas_WMDE: STOP persistRevisionThreadItems on viwiki for T315510, will restart after DC switch is done (resume at: --start '["17099868"]')
  • 10:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2003.codfw.wmnet with reason: host reimage
  • 09:59 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2003.codfw.wmnet with reason: host reimage
  • 09:57 claime: running homer 'cr*eqiad*' commit 'T351074'
  • 09:56 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1369.eqiad.wmnet with OS bullseye
  • 09:54 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on idp-test1003.wikimedia.org with reason: host reimage
  • 09:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 200132
  • 09:54 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 200132
  • 09:53 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1478.eqiad.wmnet with OS bullseye
  • 09:52 slyngshede@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on idp-test1003.wikimedia.org with reason: host reimage
  • 09:52 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1479.eqiad.wmnet with OS bullseye
  • 09:50 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1370.eqiad.wmnet with OS bullseye
  • 09:48 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1368.eqiad.wmnet with OS bullseye
  • 09:43 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2003.codfw.wmnet with OS bookworm
  • 09:39 slyngshede@cumin1002: START - Cookbook sre.hosts.reimage for host idp-test1003.wikimedia.org with OS bullseye
  • 09:37 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1369.eqiad.wmnet with reason: host reimage
  • 09:35 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1478.eqiad.wmnet with reason: host reimage
  • 09:32 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1479.eqiad.wmnet with reason: host reimage
  • 09:30 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1370.eqiad.wmnet with reason: host reimage
  • 09:28 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1368.eqiad.wmnet with reason: host reimage
  • 09:27 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1370.eqiad.wmnet with reason: host reimage
  • 09:26 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1479.eqiad.wmnet with reason: host reimage
  • 09:26 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1369.eqiad.wmnet with reason: host reimage
  • 09:26 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1478.eqiad.wmnet with reason: host reimage
  • 09:26 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1368.eqiad.wmnet with reason: host reimage
  • 09:19 Emperor: rolling-restart memcached on swift-fe-eqiad
  • 09:13 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1479.eqiad.wmnet with OS bullseye
  • 09:13 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1478.eqiad.wmnet with OS bullseye
  • 09:12 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1370.eqiad.wmnet with OS bullseye
  • 09:12 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1369.eqiad.wmnet with OS bullseye
  • 09:12 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1368.eqiad.wmnet with OS bullseye
  • 09:02 mvernon@cumin1002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
  • 08:59 kart_: Update MinT to 2024-03-20-072303-production (T353791, T340956)
  • 08:59 mvernon@cumin1002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
  • 08:58 claime: Depooling mw1368.eqiad.wmnet,mw1369.eqiad.wmnet,mw1370.eqiad.wmnet,mw1478.eqiad.wmnet,mw1479.eqiad.wmnet - T351074
  • 08:57 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
  • 08:51 moritzm: installing systemd updates from bookworm point release
  • 08:47 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
  • 08:29 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
  • 08:23 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
  • 08:21 kartik@deploy2002: Finished scap: Backport for Enable Content/Section translation on some Wikipedias (T353510) (duration: 17m 06s)
  • 08:09 kartik@deploy2002: kartik: Continuing with sync
  • 08:07 kartik@deploy2002: kartik: Backport for Enable Content/Section translation on some Wikipedias (T353510) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:04 kartik@deploy2002: Started scap: Backport for Enable Content/Section translation on some Wikipedias (T353510)
  • 08:00 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 07:56 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 07:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:12 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:56 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:48 aokoth@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 06:48 aokoth@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 06:48 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:48 aokoth@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 06:47 aokoth@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 06:41 aokoth@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 06:39 aokoth@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 06:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:23 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:08 kart_: Updated cxserver to 2024-03-18-111401-production (T353510)
  • 06:08 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 06:08 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 06:07 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 06:07 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 06:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:56 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:55 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:54 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 05:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:40 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:01 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:23 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-19

  • 23:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:12 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 22:09 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 20:59 jdrewniak@deploy2002: Finished scap: Backport for Enable night mode on pilot wikis in AMC mode (T359152) (duration: 16m 45s)
  • 20:47 jdrewniak@deploy2002: jdrewniak and jdlrobson: Continuing with sync
  • 20:46 jdrewniak@deploy2002: jdrewniak and jdlrobson: Backport for Enable night mode on pilot wikis in AMC mode (T359152) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:43 jdrewniak@deploy2002: Started scap: Backport for Enable night mode on pilot wikis in AMC mode (T359152)
  • 20:39 jdrewniak@deploy2002: Finished scap: Backport for The new class should be present alongside the old class for all page views (T359983) (duration: 15m 11s)
  • 20:26 jdrewniak@deploy2002: jdrewniak and jdlrobson: Continuing with sync
  • 20:26 jdrewniak@deploy2002: jdrewniak and jdlrobson: Backport for The new class should be present alongside the old class for all page views (T359983) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:23 jdrewniak@deploy2002: Started scap: Backport for The new class should be present alongside the old class for all page views (T359983)
  • 18:52 dancy@deploy2002: Finished scap: Backport for Remove /w/COPYING and /w/CREDITS symlinks (T359643) (duration: 14m 57s)
  • 18:40 dancy@deploy2002: dancy: Continuing with sync
  • 18:40 dancy@deploy2002: dancy: Backport for Remove /w/COPYING and /w/CREDITS symlinks (T359643) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:39 dzahn@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 18:37 dancy@deploy2002: Started scap: Backport for Remove /w/COPYING and /w/CREDITS symlinks (T359643)
  • 18:37 dzahn@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 18:36 dzahn@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 18:34 dzahn@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 18:33 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 18:31 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 18:21 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.23 refs T354441
  • 18:16 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns1004.wikimedia.org,service=authdns-update
  • 18:16 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns1004.wikimedia.org,service=authdns-update
  • 18:06 sukhe: running dummy authdns-update on dns1004 and dns6001
  • 17:22 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 17:19 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 17:19 fabfur: repooling cp4037 for brief time (T358109)
  • 17:19 sukhe: sudo cumin -b1 -s120 "A:dns-rec and not P{dns6001*}" "run-puppet-agent --enable 'merging CR 1009261'"
  • 17:16 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org,service=authdns-update
  • 17:15 sukhe: running dummy authdns-update
  • 17:14 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: cluster=dnsbox,name=dns6001.wikimedia.org,service=authdns-update
  • 17:10 sukhe: sudo cumin "A:dns-rec" "disable-puppet 'merging CR 1009261'"
  • 16:55 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 16:54 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 16:54 fabfur: repooling cp4037 for brief time (T358109)
  • 16:53 oblivian@deploy2002: Finished scap: null k8s-only deployment to test scap-master-sync (take 2) (duration: 09m 01s)
  • 16:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:45 herron: kafka-logging1001:~# kafka reassign-partitions -reassignment-json-file udp_localhost-warning.json --execute --throttle 50000000 T326419
  • 16:44 oblivian@deploy2002: Started scap: null k8s-only deployment to test scap-master-sync (take 2)
  • 16:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:37 dancy@deploy2002: Installation of scap version "4.73.0" completed for 373 hosts
  • 16:36 dancy@deploy2002: Installing scap version "4.73.0" for 373 hosts
  • 16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove asw-a-codfw mgmt DNS - pt1979@cumin2002"
  • 16:32 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove asw-a-codfw mgmt DNS - pt1979@cumin2002"
  • 16:30 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:28 oblivian@deploy2002: Finished scap: null k8s-only deployment to test scap-master-sync (duration: 08m 47s)
  • 16:19 oblivian@deploy2002: Started scap: null k8s-only deployment to test scap-master-sync
  • 16:10 moritzm: installing dpdk security updates
  • 16:02 moritzm: installing wireshark security updates
  • 15:54 herron: kafka-logging1001:~# kafka reassign-partitions -reassignment-json-file udp_localhost-info.json --execute --throttle 50000000 T326419
  • 15:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2098.codfw.wmnet
  • 15:39 herron: kafka-logging1001:~# kafka reassign-partitions -reassignment-json-file udp_localhost-err.json --execute --throttle 50000000 T326419
  • 15:37 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 15:37 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 15:36 effie: restart kartotherian on codfw
  • 15:36 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 15:36 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 15:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2098.codfw.wmnet
  • 15:35 herron: kafka-logging1001:~# kafka reassign-partitions -reassignment-json-file rsyslog-warning.json --execute --throttle 50000000 T326419
  • 15:32 brennen@deploy2002: Finished deploy [phabricator/deployment@9617e09]: deploy to phab1004 for T358610 (duration: 01m 19s)
  • 15:31 brennen@deploy2002: Started deploy [phabricator/deployment@9617e09]: deploy to phab1004 for T358610
  • 15:31 brennen@deploy2002: Finished deploy [phabricator/deployment@9617e09]: deploy to phab2002 for T358610 (duration: 00m 23s)
  • 15:30 brennen@deploy2002: Started deploy [phabricator/deployment@9617e09]: deploy to phab2002 for T358610
  • 15:30 brennen: starting phabricator/phorge update (T358610)
  • 15:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator/Phorge update
  • 15:29 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator/Phorge update
  • 15:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator/Phorge update
  • 15:29 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator/Phorge update
  • 15:26 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 15:25 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 15:23 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 15:22 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: mariadb::misc::db_inventory
  • 15:20 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 15:19 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 15:19 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 15:18 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 15:17 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 15:17 claime: Raising mw-web and mw-api-ext replicas for additional read-only traffic - T357547
  • 15:15 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 15:14 fabfur: repooling cp4037 for brief time (T358109)
  • 15:12 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 15:12 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 15:12 herron: kafka-logging1001:~# kafka reassign-partitions -reassignment-json-file rsyslog-notice.json --execute --throttle 50000000 T326419
  • 15:12 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 15:12 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:05 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: mariadb::misc::db_inventory
  • 15:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: mariadb::misc::phabricator
  • 15:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:00 effie: restart kartotherian on eqiad
  • 14:58 effie: pooling kartotherian on codfw back
  • 14:57 jiji@cumin1002: conftool action : set/pooled=true; selector: dnsdisc=kartotherian,name=codfw
  • 14:57 jiji@cumin1002: conftool action : set/pooled=true; selector: dnsdisc=kartotherian,name=eqiad
  • 14:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: mariadb::misc::phabricator
  • 14:42 effie: Traffic+Services switchover complete, codfw is depooled - T357547
  • 14:40 jiji@cumin1002: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all services in codfw: Northward DC Switchover, March 2024 - T357547
  • 14:22 effie: depooling services from codfw - T357547
  • 14:16 jiji@cumin1002: START - Cookbook sre.discovery.datacenter depool all services in codfw: Northward DC Switchover, March 2024 - T357547
  • 14:07 effie: Completely depool codfw from user traffic - T357547
  • 13:55 herron: kafka-logging1001:~# kafka reassign-partitions -reassignment-json-file rsyslog-info.json --execute --throttle 50000000 T326419
  • 13:50 herron: kafka-logging1001:~# kafka reassign-partitions -reassignment-json-file mediawiki.httpd.accesslog-sampled.json --execute --throttle 50000000 T326419
  • 13:45 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
  • 13:44 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop: apply
  • 13:43 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 13:43 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 13:42 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 13:42 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 13:42 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 13:41 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 13:41 claime: Deploying changeprop and changeprop-jobqueue - T353876
  • 13:38 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 13:38 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 13:19 hashar: Restarting CI Jenkins
  • 13:08 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1024.eqiad.wmnet with reason: Decommissioning — T354561
  • 13:08 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1024.eqiad.wmnet with reason: Decommissioning — T354561
  • 13:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 13:07 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 13:07 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 13:07 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 13:06 claime: manually adding 20 replicas to mw-parsoid to help with big reparse
  • 12:59 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 12:59 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 12:59 cmooney@cumin1002: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox-canary
  • 12:49 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 12:49 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 12:48 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 12:48 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 12:48 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 12:11 slyngs: Switch idp-test to upgraded Bookworm host
  • 11:45 cgoubert@deploy2002: Finished scap: mediawiki: Route /w/CREDITS and /w/COPYING to /w/static.php - gerrit:1012627 - T359643 (duration: 06m 59s)
  • 11:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:19 oblivian@deploy2002: dancy and oblivian: Continuing with sync
  • 11:18 oblivian@deploy2002: dancy and oblivian: Backport for static.php: Handle COPYING and CREDITS files (T359643) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:16 oblivian@deploy2002: Started scap: Backport for static.php: Handle COPYING and CREDITS files (T359643)
  • 11:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:34 hashar@deploy2002: Finished scap: Backport for wikitech: close parenthesis in log message (T307558) (duration: 15m 19s)
  • 09:22 hashar@deploy2002: hashar: Continuing with sync
  • 09:21 hashar@deploy2002: hashar: Backport for wikitech: close parenthesis in log message (T307558) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:19 hashar@deploy2002: Started scap: Backport for wikitech: close parenthesis in log message (T307558)
  • 08:58 hashar@deploy2002: Finished scap: Backport for wikitech: fix curl_exec a falsey value (T307558) (duration: 14m 36s)
  • 08:46 hashar@deploy2002: hashar: Continuing with sync
  • 08:45 hashar@deploy2002: hashar: Backport for wikitech: fix curl_exec a falsey value (T307558) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:43 hashar@deploy2002: Started scap: Backport for wikitech: fix curl_exec a falsey value (T307558)
  • 08:42 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host idp-test1003.wikimedia.org with OS bookworm
  • 08:33 hashar@deploy2002: Finished scap: Backport for wikitech: fix handling of Gerrit status code (T307558) (duration: 15m 31s)
  • 08:24 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on idp-test1003.wikimedia.org with reason: host reimage
  • 08:22 slyngshede@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on idp-test1003.wikimedia.org with reason: host reimage
  • 08:21 hashar@deploy2002: hashar: Continuing with sync
  • 08:20 hashar@deploy2002: hashar: Backport for wikitech: fix handling of Gerrit status code (T307558) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:17 hashar@deploy2002: Started scap: Backport for wikitech: fix handling of Gerrit status code (T307558)
  • 08:09 slyngshede@cumin1002: START - Cookbook sre.hosts.reimage for host idp-test1003.wikimedia.org with OS bookworm
  • 08:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:56 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:48 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:43 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:43 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 100%: After HW issues', diff saved to https://phabricator.wikimedia.org/P58816 and previous config saved to /var/cache/conftool/dbconfig/20240319-073655-root.json
  • 07:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:22 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:22 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 75%: After HW issues', diff saved to https://phabricator.wikimedia.org/P58815 and previous config saved to /var/cache/conftool/dbconfig/20240319-072149-root.json
  • 07:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 50%: After HW issues', diff saved to https://phabricator.wikimedia.org/P58814 and previous config saved to /var/cache/conftool/dbconfig/20240319-070643-root.json
  • 07:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 25%: After HW issues', diff saved to https://phabricator.wikimedia.org/P58813 and previous config saved to /var/cache/conftool/dbconfig/20240319-065137-root.json
  • 06:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 10%: After HW issues', diff saved to https://phabricator.wikimedia.org/P58812 and previous config saved to /var/cache/conftool/dbconfig/20240319-063632-root.json
  • 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 5%: After HW issues', diff saved to https://phabricator.wikimedia.org/P58811 and previous config saved to /var/cache/conftool/dbconfig/20240319-062126-root.json
  • 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 1%: After HW issues', diff saved to https://phabricator.wikimedia.org/P58810 and previous config saved to /var/cache/conftool/dbconfig/20240319-060620-root.json
  • 05:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:56 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:49 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:49 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:03 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.23 refs T354441 (duration: 57m 18s)
  • 03:15 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:15 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:05 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.23 refs T354441
  • 03:03 mwpresync@deploy2002: Pruned MediaWiki: 1.42.0-wmf.20 (duration: 03m 18s)
  • 01:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:07 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:25 mutante: moscovium - systemctl start logrotate T360391
  • 00:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-18

  • 23:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:56 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:49 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:17 urbanecm@deploy2002: Synchronized private/PrivateSettings.php: Update T250887 mitigations (duration: 13m 27s)
  • 22:15 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:15 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:43 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:43 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:17 damilare: SmashPig upgraded from cc80c042 to 47cd65d9
  • 20:22 urbanecm@deploy2002: Finished scap: Backport for Throttle: add event (T360357), [phase 4] Projects with < 50 user scripts no longer share skin scripts (T301212) (duration: 16m 26s)
  • 20:10 urbanecm@deploy2002: rhinosf1 and urbanecm and jdlrobson: Continuing with sync
  • 20:08 urbanecm@deploy2002: rhinosf1 and urbanecm and jdlrobson: Backport for Throttle: add event (T360357), [phase 4] Projects with < 50 user scripts no longer share skin scripts (T301212) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:05 urbanecm@deploy2002: Started scap: Backport for Throttle: add event (T360357), [phase 4] Projects with < 50 user scripts no longer share skin scripts (T301212)
  • 19:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:16 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 18:15 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 17:56 herron: kafka-logging1001:~# kafka reassign-partitions -reassignment-json-file mediawiki.httpd.accesslog.json --execute --throttle 50000000 T326419
  • 17:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:46 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:33 damilare: civicrm upgraded from 87d72bd7 to b8a84b22
  • 16:30 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 16:28 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 16:27 fabfur: repooling cp4037 for very short time to collect HAProxy logs with Benthos (T358109)
  • 16:25 ejegg: donorwiki upgraded from 6ea55e72 to 27d326b7
  • 16:22 hashar@deploy2002: Finished deploy [integration/docroot@b2c74b7]: doc: add Blubber - T352262 (duration: 00m 06s)
  • 16:22 hashar@deploy2002: Started deploy [integration/docroot@b2c74b7]: doc: add Blubber - T352262
  • 16:11 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-restbase (exit_code=0) rolling restart_daemons on P{restbase104*} and A:restbase
  • 16:10 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-restbase rolling restart_daemons on P{restbase104*} and A:restbase
  • 16:06 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-restbase (exit_code=0) rolling restart_daemons on P{restbase103*} and A:restbase
  • 16:00 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-restbase rolling restart_daemons on P{restbase103*} and A:restbase
  • 16:00 moritzm: installing squid security updates
  • 15:57 moritzm: uploaded cas 6.6.12+wmf12u4 (rebuild with/for tomcat9) T357748
  • 15:10 sukhe: sudo cumin -b1 -s120 "A:dns-rec and not P{dns6001*}" "run-puppet-agent --enable 'merging CR 1009316'"
  • 15:07 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org
  • 15:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:00 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns6001.wikimedia.org
  • 15:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:59 sukhe: disable puppet on A:dns-rec to merge CR 1009316
  • 14:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:48 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:31 urbanecm@deploy2002: Finished scap: Backport for [Growth] frwiki: Enable personalized-praise module (T360152), skwiki: Create autopatrolled and patroller groups (T353980), skwiki: Enable RC patrol (T353980) (duration: 14m 36s)
  • 14:24 taavi: re-enable puppet for 1004082 rollout after testing on cp3066, cp3080
  • 14:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:19 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 14:18 urbanecm@deploy2002: urbanecm: Backport for [Growth] frwiki: Enable personalized-praise module (T360152), skwiki: Create autopatrolled and patroller groups (T353980), skwiki: Enable RC patrol (T353980) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:17 taavi: disable puppet on A:cp to rollout 1004082
  • 14:16 urbanecm@deploy2002: Started scap: Backport for [Growth] frwiki: Enable personalized-praise module (T360152), skwiki: Create autopatrolled and patroller groups (T353980), skwiki: Enable RC patrol (T353980)
  • 14:15 urbanecm@deploy2002: Finished scap: Backport for [Growth] frwiki: Enable personalized praise backend (T360152) (duration: 17m 50s)
  • 14:13 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-restbase (exit_code=0) rolling restart_daemons on P{restbase102[6-9]*} and A:restbase
  • 14:11 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-restbase rolling restart_daemons on P{restbase102[6-9]*} and A:restbase
  • 14:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:05 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-restbase (exit_code=0) rolling restart_daemons on P{restbase102[4-5]*} and A:restbase
  • 14:04 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-restbase rolling restart_daemons on P{restbase102[4-5]*} and A:restbase
  • 14:03 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 14:00 urbanecm@deploy2002: urbanecm: Backport for [Growth] frwiki: Enable personalized praise backend (T360152) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:00 jmm@cumin2002: END (FAIL) - Cookbook sre.misc-clusters.roll-restart-restbase (exit_code=1) rolling restart_daemons on A:restbase-eqiad
  • 13:58 urbanecm@deploy2002: Started scap: Backport for [Growth] frwiki: Enable personalized praise backend (T360152)
  • 13:54 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@b3ccf85] (releasing): (no justification provided) (duration: 00m 54s)
  • 13:54 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-restbase rolling restart_daemons on A:restbase-eqiad
  • 13:53 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@b3ccf85] (releasing): (no justification provided)
  • 13:45 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-restbase (exit_code=0) rolling restart_daemons on A:restbase-codfw
  • 13:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:42 Dreamy_Jazz: Restarted MediaModeration scanning script for commonswiki - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 13:41 moritzm: installing tar security updates
  • 13:40 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host idp-test1003.wikimedia.org with OS bookworm
  • 13:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:36 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-restbase rolling restart_daemons on A:restbase-codfw
  • 13:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:24 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on idp-test1003.wikimedia.org with reason: host reimage
  • 13:23 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:22 slyngshede@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on idp-test1003.wikimedia.org with reason: host reimage
  • 13:12 slyngshede@cumin1002: START - Cookbook sre.hosts.reimage for host idp-test1003.wikimedia.org with OS bookworm
  • 13:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:49 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp-test1003.wikimedia.org
  • 12:46 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idp-test1003.wikimedia.org
  • 12:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:12 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:08 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 12:08 fabfur: disabling puppet and depooling cp4037 to gradually apply new HAProxy/Benthos configuration (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1011453) T358109
  • 12:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:49 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:40 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:13 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:13 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:52 jmm@cumin2002: END (FAIL) - Cookbook sre.aqs.roll-restart-reboot (exit_code=1) rolling restart_daemons on A:aqs-codfw
  • 10:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:43 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:43 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:38 jmm@cumin2002: START - Cookbook sre.aqs.roll-restart-reboot rolling restart_daemons on A:aqs-codfw
  • 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot-master (exit_code=0) rolling restart_daemons on A:maps-master
  • 10:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:20 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot-master rolling restart_daemons on A:maps-master
  • 10:18 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
  • 10:13 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
  • 10:05 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
  • 10:00 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
  • 09:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:43 moritzm: installing libuv1 security updates
  • 09:30 kostajh: UTC morning deploys done
  • 09:29 kharlan@deploy2002: Finished scap: Backport for throttle: Allow for overriding temp account creation limits (T357777), throttle: Add throttle rule for editathon (T360145) (duration: 42m 23s)
  • 09:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:17 kharlan@deploy2002: ammarpad and kharlan: Continuing with sync
  • 09:16 kharlan@deploy2002: ammarpad and kharlan: Backport for throttle: Allow for overriding temp account creation limits (T357777), throttle: Add throttle rule for editathon (T360145) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:47 kharlan@deploy2002: Started scap: Backport for throttle: Allow for overriding temp account creation limits (T357777), throttle: Add throttle rule for editathon (T360145)
  • 08:37 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging PEarley (WMF) out of all services on: 2210 hosts
  • 08:37 root@cumin2002: START - Cookbook sre.idm.logout Logging PEarley (WMF) out of all services on: 2210 hosts
  • 08:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:49 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:49 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:25 kart_: Updated cxserver to 2024-03-18-053939-production (T350773)
  • 06:24 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 06:23 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 06:23 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 06:22 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 06:07 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 06:07 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 05:46 _joe_: restarted rsyslog on mw1374
  • 05:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:40 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:27 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:27 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:12 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:46 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:23 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:13 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:13 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-17

  • 23:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:46 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:40 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:11 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:49 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:49 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:15 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:15 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:07 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:13 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:13 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:04 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1023.eqiad.wmnet with reason: Decommissioning — T354561
  • 19:04 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1023.eqiad.wmnet with reason: Decommissioning — T354561
  • 19:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:43 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:43 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:40 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:49 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:15 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:15 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:22 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:22 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:15 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:15 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-16

  • 23:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:13 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:13 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:11 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:16 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:43 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:40 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:28 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:28 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:56 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:49 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:49 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:48 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:43 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:43 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:23 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:56 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:46 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:16 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:43 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:43 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:07 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-15

  • 23:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:22 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:22 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:11 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:28 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:28 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:22 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:22 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:48 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:48 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:42 damilare: localsettings revision changed from 07e3839c to b3dbab1d
  • 19:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:40 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:25 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 17:10 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@611b85b] (releasing): (no justification provided) (duration: 00m 39s)
  • 17:09 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@611b85b] (releasing): (no justification provided)
  • 17:01 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@78fbb55] (releasing): (no justification provided) (duration: 00m 40s)
  • 17:01 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@78fbb55] (releasing): (no justification provided)
  • 15:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:09 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@7a56c9a] (releasing): (no justification provided) (duration: 00m 41s)
  • 15:08 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@7a56c9a] (releasing): (no justification provided)
  • 14:58 jnuche@deploy2002: Installation of scap version "4.72.0" completed for 373 hosts
  • 14:57 jnuche@deploy2002: Installing scap version "4.72.0" for 373 hosts
  • 12:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:13 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:13 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:45 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1022.eqiad.wmnet with reason: Decommissioning — T354561
  • 09:45 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1022.eqiad.wmnet with reason: Decommissioning — T354561
  • 09:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:22 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:22 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:12 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-14

  • 22:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:45 ejegg: fundraising civicrm upgraded from e40ebb2a to 87d72bd7
  • 22:44 ejegg: donorwiki upgraded from 32789f89 to 6ea55e72
  • 22:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:22 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:22 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:17 cstone: SmashPig upgraded from e8dec926 to cc80c042
  • 22:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:34 krinkle@deploy2002: Synchronized src/XWikimediaDebug.php: Support cookies in XWikimediaDebug, I5e33e90fd, T350094 (duration: 12m 08s)
  • 21:21 mutante: LDAP - removed kcv-wikimf from group wmf (T358658)
  • 21:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:18 cstone: civicrm upgraded from ee785ecd to e40ebb2a
  • 21:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:14 reedy@deploy2002: Synchronized wmf-config/CommonSettings.php: T303135 (duration: 12m 24s)
  • 17:15 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:42 hashar@deploy2002: Finished scap: Backport for wikitech: allow unblocking inactive accounts (T307558) (duration: 13m 04s)
  • 16:38 bearloga@deploy2002: Finished deploy [airflow-dags/analytics_product@bae55a9]: (no justification provided) (duration: 00m 08s)
  • 16:38 bearloga@deploy2002: Started deploy [airflow-dags/analytics_product@bae55a9]: (no justification provided)
  • 16:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:31 hashar@deploy2002: hashar: Continuing with sync
  • 16:31 hashar@deploy2002: hashar: Backport for wikitech: allow unblocking inactive accounts (T307558) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:29 hashar@deploy2002: Started scap: Backport for wikitech: allow unblocking inactive accounts (T307558)
  • 16:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:29 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki viwiki --current --all --touched-after=20230613000000 --start '["14615874"]' 2>&1 | tee ~/T315510-viwiki-3; date
  • 15:28 Lucas_WMDE: STOP persistRevisionThreadItems on viwiki for T315510, restarting to pick up wmf.22
  • 15:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:07 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:12 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:11 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for cswiki, commonswiki: lift IP cap (T360103) (duration: 12m 36s)
  • 14:01 logmsgbot: lucaswerkmeister-wmde@deploy2002 samtar and lucaswerkmeister-wmde: Continuing with sync
  • 14:01 logmsgbot: lucaswerkmeister-wmde@deploy2002 samtar and lucaswerkmeister-wmde: Backport for cswiki, commonswiki: lift IP cap (T360103) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:59 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for cswiki, commonswiki: lift IP cap (T360103)
  • 13:58 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for REST: ignore request body on GET requests (T359509) (duration: 22m 06s)
  • 13:48 logmsgbot: lucaswerkmeister-wmde@deploy2002 jgiannelos and lucaswerkmeister-wmde: Continuing with sync
  • 13:38 logmsgbot: lucaswerkmeister-wmde@deploy2002 jgiannelos and lucaswerkmeister-wmde: Backport for REST: ignore request body on GET requests (T359509) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for REST: ignore request body on GET requests (T359509)
  • 13:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:06 Dreamy_Jazz: Stopped MediaModeration scanning script on group2 wikis
  • 11:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:10 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 11:09 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 11:09 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 11:08 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 11:07 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 11:07 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 10:01 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 37 hosts with reason: Enabling circular replication
  • 09:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 37 hosts with reason: Enabling circular replication
  • 09:35 marostegui: enable eqiad -> codfw replication on s1 T358199
  • 09:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 29 hosts with reason: Enabling circular replication
  • 09:26 marostegui: enable eqiad -> codfw replication on s2 T358199
  • 09:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 29 hosts with reason: Enabling circular replication
  • 09:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:23 hashar@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.22 refs T354440
  • 09:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 26 hosts with reason: Enabling circular replication
  • 09:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 26 hosts with reason: Enabling circular replication
  • 09:21 marostegui: enable eqiad -> codfw replication on s3 T358199
  • 09:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 36 hosts with reason: Enabling circular replication
  • 09:16 marostegui: enable eqiad -> codfw replication on s4 T358199
  • 09:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 36 hosts with reason: Enabling circular replication
  • 09:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 27 hosts with reason: Enabling circular replication
  • 09:09 marostegui: enable eqiad -> codfw replication on s5 T358199
  • 09:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 27 hosts with reason: Enabling circular replication
  • 09:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 27 hosts with reason: Enabling circular replication
  • 09:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 27 hosts with reason: Enabling circular replication
  • 09:06 marostegui: enable eqiad -> codfw replication on s6 T358199
  • 09:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2115 (re)pooling @ 100%: Post reimage', diff saved to https://phabricator.wikimedia.org/P58794 and previous config saved to /var/cache/conftool/dbconfig/20240314-085530-arnaudb.json
  • 08:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 31 hosts with reason: Enabling circular replication
  • 08:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 31 hosts with reason: Enabling circular replication
  • 08:54 marostegui: enable eqiad -> codfw replication on s7 T358199
  • 08:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 34 hosts with reason: Enabling circular replication
  • 08:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 34 hosts with reason: Enabling circular replication
  • 08:50 marostegui: enable eqiad -> codfw replication on s8 T358199
  • 08:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2115 (re)pooling @ 75%: Post reimage', diff saved to https://phabricator.wikimedia.org/P58793 and previous config saved to /var/cache/conftool/dbconfig/20240314-084024-arnaudb.json
  • 08:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 16 hosts with reason: Enabling circular replication
  • 08:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 16 hosts with reason: Enabling circular replication
  • 08:37 marostegui: enable eqiad -> codfw replication on x1 T358199
  • 08:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: Enabling circular replication
  • 08:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 6 hosts with reason: Enabling circular replication
  • 07:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2115.codfw.wmnet with OS bookworm
  • 07:38 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2115.codfw.wmnet with reason: host reimage
  • 07:35 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2115.codfw.wmnet with reason: host reimage
  • 07:20 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2115.codfw.wmnet with OS bookworm
  • 07:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2115.codfw.wmnet with reason: Silence for reimaging
  • 07:17 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2115.codfw.wmnet with reason: Silence for reimaging
  • 07:15 kart_: Updated cxserver to 2024-03-14-065833-production (T350773)
  • 07:14 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 07:13 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 07:13 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 07:12 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 07:06 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 07:05 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 06:48 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 06:48 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2196 to x1 primary and set section read-write T359919', diff saved to https://phabricator.wikimedia.org/P58789 and previous config saved to /var/cache/conftool/dbconfig/20240314-064513-root.json
  • 06:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:23 arnaudb: Starting x1 codfw failover from db2115 to db2196 - T359919
  • 06:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2196 with weight 0 T359919', diff saved to https://phabricator.wikimedia.org/P58788 and previous config saved to /var/cache/conftool/dbconfig/20240314-060644-arnaudb.json
  • 06:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 T359919
  • 06:06 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 16 hosts with reason: Primary switchover x1 T359919
  • 05:52 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:52 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 04:23 tstarling@deploy2002: Synchronized wmf-config/CommonSettings.php: reverting for now due to slow query T355034 (duration: 12m 28s)
  • 02:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:23 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:07 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:40 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-13

  • 23:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:12 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:07 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:41 dancy@deploy2002: backport Cancelled
  • 20:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:39 jdrewniak@deploy2002: Finished scap: Backport for Fix Issue with localization of special page titles in exclusion logic (T359958) (duration: 16m 49s)
  • 20:28 jdrewniak@deploy2002: jdlrobson and jdrewniak: Continuing with sync
  • 20:24 jdrewniak@deploy2002: jdlrobson and jdrewniak: Backport for Fix Issue with localization of special page titles in exclusion logic (T359958) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:22 jdrewniak@deploy2002: Started scap: Backport for Fix Issue with localization of special page titles in exclusion logic (T359958)
  • 19:49 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:49 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:49 cstone: civicrm upgraded from 3cab2177 to ee785ecd
  • 18:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 100%: After recloning db1246', diff saved to https://phabricator.wikimedia.org/P58786 and previous config saved to /var/cache/conftool/dbconfig/20240313-183256-root.json
  • 18:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 75%: After recloning db1246', diff saved to https://phabricator.wikimedia.org/P58785 and previous config saved to /var/cache/conftool/dbconfig/20240313-181751-root.json
  • 18:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 50%: After recloning db1246', diff saved to https://phabricator.wikimedia.org/P58784 and previous config saved to /var/cache/conftool/dbconfig/20240313-180245-root.json
  • 17:52 ejegg: donorwiki upgraded from 5755ea82 to 32789f89
  • 17:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 25%: After recloning db1246', diff saved to https://phabricator.wikimedia.org/P58783 and previous config saved to /var/cache/conftool/dbconfig/20240313-174740-root.json
  • 17:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:43 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 10%: After recloning db1246', diff saved to https://phabricator.wikimedia.org/P58782 and previous config saved to /var/cache/conftool/dbconfig/20240313-173234-root.json
  • 17:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 5%: After recloning db1246', diff saved to https://phabricator.wikimedia.org/P58781 and previous config saved to /var/cache/conftool/dbconfig/20240313-171728-root.json
  • 17:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 1%: After recloning db1246', diff saved to https://phabricator.wikimedia.org/P58779 and previous config saved to /var/cache/conftool/dbconfig/20240313-170222-root.json
  • 17:00 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 16:52 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1182.eqiad.wmnet onto db1246.eqiad.wmnet
  • 16:48 root@cumin1002: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) status all services in all: None - None
  • 16:48 root@cumin1002: START - Cookbook sre.discovery.datacenter status all services in all: None - None
  • 16:33 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1021.eqiad.wmnet with reason: Decommissioning — T354561
  • 16:33 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1021.eqiad.wmnet with reason: Decommissioning — T354561
  • 16:32 root@cumin1002: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) status all services in all: None - None
  • 16:32 root@cumin1002: START - Cookbook sre.discovery.datacenter status all services in all: None - None
  • 16:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:07 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:01 hashar@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.22 refs T354440 (duration: 11m 46s)
  • 15:49 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.22 refs T354440
  • 15:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for WikiModule: Fix pages merging (T360014) (duration: 13m 27s)
  • 15:27 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db1182.eqiad.wmnet onto db1246.eqiad.wmnet
  • 15:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1246.eqiad.wmnet with OS bookworm
  • 15:25 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - marostegui@cumin1002"
  • 15:24 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - marostegui@cumin1002"
  • 15:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
  • 15:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for WikiModule: Fix pages merging (T360014) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for WikiModule: Fix pages merging (T360014)
  • 15:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1246.eqiad.wmnet with reason: host reimage
  • 15:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1246.eqiad.wmnet with reason: host reimage
  • 14:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Cloning db1246
  • 14:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Cloning db1246
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1182 to clone db1246', diff saved to https://phabricator.wikimedia.org/P58778 and previous config saved to /var/cache/conftool/dbconfig/20240313-145603-marostegui.json
  • 14:49 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm
  • 14:49 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1246.eqiad.wmnet with OS bookworm
  • 14:33 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm
  • 14:17 Dreamy_Jazz: Starting MediaModeration scanning scripts on group2 wikis and commonswiki - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 14:16 Dreamy_Jazz: Afternoon UTC backport window done
  • 14:16 dreamyjazz@deploy2002: Finished scap: Backport for Use wgCanonicalServer instead of wgSitename in intro text of email (T359979), Use wgCanonicalServer instead of wgSitename in intro text of email (T359979) (duration: 14m 56s)
  • 14:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: HW issues
  • 14:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: HW issues
  • 14:05 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 14:03 dreamyjazz@deploy2002: dreamyjazz: Backport for Use wgCanonicalServer instead of wgSitename in intro text of email (T359979), Use wgCanonicalServer instead of wgSitename in intro text of email (T359979) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:01 dreamyjazz@deploy2002: Started scap: Backport for Use wgCanonicalServer instead of wgSitename in intro text of email (T359979), Use wgCanonicalServer instead of wgSitename in intro text of email (T359979)
  • 13:57 dreamyjazz@deploy2002: Finished scap: Backport for Set ShowRollbackConfirmationDefaultUserOptions on arwiki to false (T355213) (duration: 15m 33s)
  • 13:52 Dreamy_Jazz: Stopping MediaModeration scanning scripts for backport
  • 13:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:46 dreamyjazz@deploy2002: dreamyjazz and gergesshamon: Continuing with sync
  • 13:46 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1246.eqiad.wmnet
  • 13:46 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1246.eqiad.wmnet
  • 13:46 jclark@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts db1246.eqiad.wmnet
  • 13:46 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1246.eqiad.wmnet
  • 13:45 jclark@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts db1246.eqiad.wmnet
  • 13:45 marostegui: Disable GTID on eqiad s1 master T358199
  • 13:45 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1246.eqiad.wmnet
  • 13:44 dreamyjazz@deploy2002: dreamyjazz and gergesshamon: Backport for Set ShowRollbackConfirmationDefaultUserOptions on arwiki to false (T355213) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:43 marostegui: Disable GTID on eqiad s2 master T358199
  • 13:41 dreamyjazz@deploy2002: Started scap: Backport for Set ShowRollbackConfirmationDefaultUserOptions on arwiki to false (T355213)
  • 13:40 marostegui: Disable GTID on eqiad s3 master T358199
  • 13:39 marostegui: Disable GTID on eqiad s4 master T358199
  • 13:38 dreamyjazz@deploy2002: Finished scap: Backport for Use iterator_to_array when calling ::assertCount (T360017), Use iterator_to_array when calling ::assertCount (T360017) (duration: 13m 01s)
  • 13:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:36 marostegui: Disable GTID on eqiad s5 master T358199
  • 13:35 marostegui: Disable GTID on eqiad s6 master T358199
  • 13:31 marostegui: Disable GTID on eqiad s7 master T358199
  • 13:29 marostegui: Disable GTID on eqiad s8 master T358199
  • 13:27 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 13:27 dreamyjazz@deploy2002: dreamyjazz: Backport for Use iterator_to_array when calling ::assertCount (T360017), Use iterator_to_array when calling ::assertCount (T360017) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:25 dreamyjazz@deploy2002: Started scap: Backport for Use iterator_to_array when calling ::assertCount (T360017), Use iterator_to_array when calling ::assertCount (T360017)
  • 13:21 mabualruz@deploy2002: Finished scap: Backport for Disable night mode on history pages (T359183) (duration: 15m 48s)
  • 13:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:12 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:11 mabualruz@deploy2002: jdlrobson and mabualruz: Continuing with sync
  • 13:08 mabualruz@deploy2002: jdlrobson and mabualruz: Backport for Disable night mode on history pages (T359183) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:06 mabualruz@deploy2002: Started scap: Backport for Disable night mode on history pages (T359183)
  • 13:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:27 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:22 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:22 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:15 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:15 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:12 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:43 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:43 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:26 hashar@deploy2002: rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.42.0-wmf.22" - T354440 T360014
  • 10:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:42 hashar@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.22 refs T354440 (duration: 12m 18s)
  • 09:30 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.22 refs T354440
  • 09:17 isaranto@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'ores-legacy' for release 'main' .
  • 09:16 isaranto@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
  • 09:01 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:47 marostegui: Disable GTID on eqiad es4 master T358199
  • 08:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:45 marostegui: Disable GTID on eqiad es5 master T358199
  • 08:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:03 hashar: Moved "UTC morning backport window" and "MediaWiki train" deployment windows from PST to UTC, effectively shifting them one hour later. That is due to daylight saving time kicking in at different times. With the change, the windows are now at their usual time relative to CET
  • 07:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:49 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:49 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:46 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:12 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:50 kart_: Updated cxserver to 2024-03-12-113634-production (T350773, T359525)
  • 04:49 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 04:48 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 04:45 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 04:44 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 04:38 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 04:37 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 03:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:10 TimStarling: on mwmaint2002: running migrateBlocks.php on all wikis
  • 02:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:07 tstarling@deploy2002: Finished scap: Backport for migrateBlocks.php: Fix infinite loop, migrateBlocks.php: Fix infinite loop (duration: 16m 42s)
  • 01:55 tstarling@deploy2002: tstarling: Continuing with sync
  • 01:52 tstarling@deploy2002: tstarling: Backport for migrateBlocks.php: Fix infinite loop, migrateBlocks.php: Fix infinite loop synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 01:50 tstarling@deploy2002: Started scap: Backport for migrateBlocks.php: Fix infinite loop, migrateBlocks.php: Fix infinite loop

2024-03-12

  • 23:56 ejegg: increased timeout for dlocal API calls to 30 sec
  • 23:48 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:40 tstarling@deploy2002: Finished scap: Backport for migrateBlocks.php: Skip existing IDs (T355034) (duration: 42m 36s)
  • 23:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:29 tstarling@deploy2002: tstarling: Continuing with sync
  • 23:11 tzatziki: removing 1 file for legal compliance
  • 23:05 tzatziki: removing 3 files for legal compliance
  • 23:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:00 tstarling@deploy2002: tstarling: Backport for migrateBlocks.php: Skip existing IDs (T355034) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:57 tstarling@deploy2002: Started scap: Backport for migrateBlocks.php: Skip existing IDs (T355034)
  • 22:50 tzatziki: removing 2 files for legal compliance
  • 19:58 hashar@deploy2002: Finished scap: Backport for WikiModule: Fix data structure when preloading title info (T359939) (duration: 13m 14s)
  • 19:48 hashar@deploy2002: hashar and jforrester: Continuing with sync
  • 19:47 hashar@deploy2002: hashar and jforrester: Backport for WikiModule: Fix data structure when preloading title info (T359939) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:46 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:45 hashar@deploy2002: Started scap: Backport for WikiModule: Fix data structure when preloading title info (T359939)
  • 15:32 eoghan@cumin1002: END (FAIL) - Cookbook sre.gitlab.failover (exit_code=93) Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 14:58 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbprov2005.codfw.wmnet with OS bullseye
  • 14:58 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbprov2006.codfw.wmnet with OS bullseye
  • 14:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P58774 and previous config saved to /var/cache/conftool/dbconfig/20240312-144307-ladsgroup.json
  • 14:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P58773 and previous config saved to /var/cache/conftool/dbconfig/20240312-142801-ladsgroup.json
  • 14:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:22 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 14:22 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 14:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:20 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 14:20 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 14:18 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 14:18 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 14:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: db1246 depooled
  • 14:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: db1246 depooled
  • 14:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P58772 and previous config saved to /var/cache/conftool/dbconfig/20240312-141255-ladsgroup.json
  • 14:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1246.eqiad.wmnet with reason: db1246 depooled
  • 14:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1246.eqiad.wmnet with reason: db1246 depooled
  • 13:58 Lucas_WMDE: UTC afternoon backport+config window done
  • 13:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P58770 and previous config saved to /var/cache/conftool/dbconfig/20240312-135750-ladsgroup.json
  • 13:57 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Enable native math rendering options by default (T358803) (duration: 15m 05s)
  • 13:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Down', diff saved to https://phabricator.wikimedia.org/P58769 and previous config saved to /var/cache/conftool/dbconfig/20240312-135715-ladsgroup.json
  • 13:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1246', diff saved to https://phabricator.wikimedia.org/P58768 and previous config saved to /var/cache/conftool/dbconfig/20240312-135537-arnaudb.json
  • 13:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:47 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and physikerwelt: Continuing with sync
  • 13:45 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and physikerwelt: Backport for Enable native math rendering options by default (T358803) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:42 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Enable native math rendering options by default (T358803)
  • 13:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:38 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Add `suppressredirect` right to pagemover and filemover user groups in azwiki (T359614) (duration: 14m 13s)
  • 13:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and mdsshakil: Continuing with sync
  • 13:26 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and mdsshakil: Backport for Add `suppressredirect` right to pagemover and filemover user groups in azwiki (T359614) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:24 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Add `suppressredirect` right to pagemover and filemover user groups in azwiki (T359614)
  • 13:23 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 13:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:26 eoghan@cumin1002: START - Cookbook sre.gitlab.failover Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org
  • 12:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:07 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:16 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 100%: Post clone', diff saved to https://phabricator.wikimedia.org/P58765 and previous config saved to /var/cache/conftool/dbconfig/20240312-104056-arnaudb.json
  • 10:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 100%: Post clone', diff saved to https://phabricator.wikimedia.org/P58764 and previous config saved to /var/cache/conftool/dbconfig/20240312-104056-arnaudb.json
  • 10:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 100%: Post clone', diff saved to https://phabricator.wikimedia.org/P58763 and previous config saved to /var/cache/conftool/dbconfig/20240312-104055-arnaudb.json
  • 10:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 75%: Post clone', diff saved to https://phabricator.wikimedia.org/P58762 and previous config saved to /var/cache/conftool/dbconfig/20240312-102551-arnaudb.json
  • 10:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 75%: Post clone', diff saved to https://phabricator.wikimedia.org/P58761 and previous config saved to /var/cache/conftool/dbconfig/20240312-102551-arnaudb.json
  • 10:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 75%: Post clone', diff saved to https://phabricator.wikimedia.org/P58760 and previous config saved to /var/cache/conftool/dbconfig/20240312-102550-arnaudb.json
  • 10:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2206 (re)pooling @ 100%: Post clone', diff saved to https://phabricator.wikimedia.org/P58759 and previous config saved to /var/cache/conftool/dbconfig/20240312-102245-arnaudb.json
  • 10:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 100%: Post clone', diff saved to https://phabricator.wikimedia.org/P58758 and previous config saved to /var/cache/conftool/dbconfig/20240312-102245-arnaudb.json
  • 10:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 50%: Post clone', diff saved to https://phabricator.wikimedia.org/P58757 and previous config saved to /var/cache/conftool/dbconfig/20240312-101046-arnaudb.json
  • 10:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 50%: Post clone', diff saved to https://phabricator.wikimedia.org/P58756 and previous config saved to /var/cache/conftool/dbconfig/20240312-101045-arnaudb.json
  • 10:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 75%: Post clone', diff saved to https://phabricator.wikimedia.org/P58755 and previous config saved to /var/cache/conftool/dbconfig/20240312-100741-arnaudb.json
  • 10:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2206 (re)pooling @ 75%: Post clone', diff saved to https://phabricator.wikimedia.org/P58754 and previous config saved to /var/cache/conftool/dbconfig/20240312-100740-arnaudb.json
  • 10:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 75%: Post clone', diff saved to https://phabricator.wikimedia.org/P58753 and previous config saved to /var/cache/conftool/dbconfig/20240312-100740-arnaudb.json
  • 10:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 32%: Post clone', diff saved to https://phabricator.wikimedia.org/P58752 and previous config saved to /var/cache/conftool/dbconfig/20240312-095541-arnaudb.json
  • 09:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 32%: Post clone', diff saved to https://phabricator.wikimedia.org/P58751 and previous config saved to /var/cache/conftool/dbconfig/20240312-095541-arnaudb.json
  • 09:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 32%: Post clone', diff saved to https://phabricator.wikimedia.org/P58750 and previous config saved to /var/cache/conftool/dbconfig/20240312-095540-arnaudb.json
  • 09:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 50%: Post clone', diff saved to https://phabricator.wikimedia.org/P58749 and previous config saved to /var/cache/conftool/dbconfig/20240312-095236-arnaudb.json
  • 09:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2206 (re)pooling @ 50%: Post clone', diff saved to https://phabricator.wikimedia.org/P58748 and previous config saved to /var/cache/conftool/dbconfig/20240312-095235-arnaudb.json
  • 09:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 50%: Post clone', diff saved to https://phabricator.wikimedia.org/P58747 and previous config saved to /var/cache/conftool/dbconfig/20240312-095235-arnaudb.json
  • 09:47 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.22 refs T354440
  • 09:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 16%: Post clone', diff saved to https://phabricator.wikimedia.org/P58746 and previous config saved to /var/cache/conftool/dbconfig/20240312-094036-arnaudb.json
  • 09:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 16%: Post clone', diff saved to https://phabricator.wikimedia.org/P58745 and previous config saved to /var/cache/conftool/dbconfig/20240312-094036-arnaudb.json
  • 09:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 16%: Post clone', diff saved to https://phabricator.wikimedia.org/P58744 and previous config saved to /var/cache/conftool/dbconfig/20240312-094035-arnaudb.json
  • 09:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 32%: Post clone', diff saved to https://phabricator.wikimedia.org/P58743 and previous config saved to /var/cache/conftool/dbconfig/20240312-093729-arnaudb.json
  • 09:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 32%: Post clone', diff saved to https://phabricator.wikimedia.org/P58742 and previous config saved to /var/cache/conftool/dbconfig/20240312-093730-arnaudb.json
  • 09:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:27 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 8%: Post clone', diff saved to https://phabricator.wikimedia.org/P58741 and previous config saved to /var/cache/conftool/dbconfig/20240312-092531-arnaudb.json
  • 09:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 8%: Post clone', diff saved to https://phabricator.wikimedia.org/P58740 and previous config saved to /var/cache/conftool/dbconfig/20240312-092531-arnaudb.json
  • 09:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 8%: Post clone', diff saved to https://phabricator.wikimedia.org/P58739 and previous config saved to /var/cache/conftool/dbconfig/20240312-092530-arnaudb.json
  • 09:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 16%: Post clone', diff saved to https://phabricator.wikimedia.org/P58738 and previous config saved to /var/cache/conftool/dbconfig/20240312-092225-arnaudb.json
  • 09:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 16%: Post clone', diff saved to https://phabricator.wikimedia.org/P58737 and previous config saved to /var/cache/conftool/dbconfig/20240312-092224-arnaudb.json
  • 09:11 eoghan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on gitlab1004.wikimedia.org with reason: Silencing alerts for switchover prep
  • 09:10 eoghan@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on gitlab1004.wikimedia.org with reason: Silencing alerts for switchover prep
  • 09:10 eoghan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on gitlab1004.wikimedia.org with reason: Silencing alerts for switchover prep
  • 09:10 eoghan@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on gitlab1004.wikimedia.org with reason: Silencing alerts for switchover prep
  • 09:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 4%: Post clone', diff saved to https://phabricator.wikimedia.org/P58735 and previous config saved to /var/cache/conftool/dbconfig/20240312-091026-arnaudb.json
  • 09:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 4%: Post clone', diff saved to https://phabricator.wikimedia.org/P58734 and previous config saved to /var/cache/conftool/dbconfig/20240312-091025-arnaudb.json
  • 09:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2206 (re)pooling @ 8%: Post clone', diff saved to https://phabricator.wikimedia.org/P58733 and previous config saved to /var/cache/conftool/dbconfig/20240312-090720-arnaudb.json
  • 09:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 8%: Post clone', diff saved to https://phabricator.wikimedia.org/P58732 and previous config saved to /var/cache/conftool/dbconfig/20240312-090719-arnaudb.json
  • 08:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 2%: Post clone', diff saved to https://phabricator.wikimedia.org/P58731 and previous config saved to /var/cache/conftool/dbconfig/20240312-085522-arnaudb.json
  • 08:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 2%: Post clone', diff saved to https://phabricator.wikimedia.org/P58730 and previous config saved to /var/cache/conftool/dbconfig/20240312-085521-arnaudb.json
  • 08:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 2%: Post clone', diff saved to https://phabricator.wikimedia.org/P58729 and previous config saved to /var/cache/conftool/dbconfig/20240312-085520-arnaudb.json
  • 08:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 4%: Post clone', diff saved to https://phabricator.wikimedia.org/P58728 and previous config saved to /var/cache/conftool/dbconfig/20240312-085215-arnaudb.json
  • 08:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2206 (re)pooling @ 4%: Post clone', diff saved to https://phabricator.wikimedia.org/P58727 and previous config saved to /var/cache/conftool/dbconfig/20240312-085214-arnaudb.json
  • 08:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 4%: Post clone', diff saved to https://phabricator.wikimedia.org/P58726 and previous config saved to /var/cache/conftool/dbconfig/20240312-085214-arnaudb.json
  • 08:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 1%: Post clone', diff saved to https://phabricator.wikimedia.org/P58725 and previous config saved to /var/cache/conftool/dbconfig/20240312-084016-arnaudb.json
  • 08:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 1%: Post clone', diff saved to https://phabricator.wikimedia.org/P58724 and previous config saved to /var/cache/conftool/dbconfig/20240312-084016-arnaudb.json
  • 08:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 1%: Post clone', diff saved to https://phabricator.wikimedia.org/P58723 and previous config saved to /var/cache/conftool/dbconfig/20240312-084015-arnaudb.json
  • 08:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 2%: Post clone', diff saved to https://phabricator.wikimedia.org/P58722 and previous config saved to /var/cache/conftool/dbconfig/20240312-083710-arnaudb.json
  • 08:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2206 (re)pooling @ 2%: Post clone', diff saved to https://phabricator.wikimedia.org/P58721 and previous config saved to /var/cache/conftool/dbconfig/20240312-083710-arnaudb.json
  • 08:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 2%: Post clone', diff saved to https://phabricator.wikimedia.org/P58720 and previous config saved to /var/cache/conftool/dbconfig/20240312-083709-arnaudb.json
  • 07:18 eileen: civicrm upgraded from 1cc73ba9 to 3cab2177
  • 07:06 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1020.eqiad.wmnet with reason: Decommissioning — T354561
  • 07:05 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1020.eqiad.wmnet with reason: Decommissioning — T354561
  • 07:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:35 hashar@deploy2002: Finished deploy [integration/docroot@a13474b]: Link to Cite docs - T358641 (duration: 00m 06s)
  • 06:35 hashar@deploy2002: Started deploy [integration/docroot@a13474b]: Link to Cite docs - T358641
  • 05:11 kart_: Updated cxserver to 2024-03-11-120258-production (T350773)
  • 05:10 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:09 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:07 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:06 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:01 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:01 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 04:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:02 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.22 refs T354440 (duration: 56m 47s)
  • 03:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:06 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.22 refs T354440
  • 03:03 mwpresync@deploy2002: Pruned MediaWiki: 1.42.0-wmf.19 (duration: 03m 24s)
  • 01:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-11

  • 23:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:07 eileen: config revision changed from ec5690b9 to e91d00a6
  • 21:55 eileen: config revision changed from 92e91ab7 to ec5690b9
  • 21:22 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:22 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:12 ejegg: donorwiki upgraded from 9b31d4fe to 5755ea82
  • 20:51 rzl@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0)
  • 20:51 rzl@cumin2002: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
  • 20:49 ejegg: Fundraising civicrm upgraded from 50278dbc to 1cc73ba9
  • 20:34 urbanecm@deploy2002: Finished scap: Backport for Interaction to Next Paint (INP) Core Web Vital Improvement (T358380) (duration: 13m 57s)
  • 20:23 urbanecm@deploy2002: urbanecm and jdlrobson: Continuing with sync
  • 20:22 urbanecm@deploy2002: urbanecm and jdlrobson: Backport for Interaction to Next Paint (INP) Core Web Vital Improvement (T358380) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:20 urbanecm@deploy2002: Started scap: Backport for Interaction to Next Paint (INP) Core Web Vital Improvement (T358380)
  • 20:18 urbanecm@deploy2002: Finished scap: Backport for Disable special pages on a per name basis (T359183) (duration: 13m 43s)
  • 20:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:07 urbanecm@deploy2002: jdlrobson and urbanecm: Continuing with sync
  • 20:06 urbanecm@deploy2002: jdlrobson and urbanecm: Backport for Disable special pages on a per name basis (T359183) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:04 urbanecm@deploy2002: Started scap: Backport for Disable special pages on a per name basis (T359183)
  • 19:58 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:56 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:25 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2110.codfw.wmnet onto db2210.codfw.wmnet
  • 16:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2109.codfw.wmnet onto db2209.codfw.wmnet
  • 16:21 arnaudb@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58715 and previous config saved to /var/cache/conftool/dbconfig/20240311-162145-arnaudb.json
  • 16:06 arnaudb@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 75%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58714 and previous config saved to /var/cache/conftool/dbconfig/20240311-160639-arnaudb.json
  • 15:51 arnaudb@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58713 and previous config saved to /var/cache/conftool/dbconfig/20240311-155134-arnaudb.json
  • 15:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2111.codfw.wmnet onto db2211.codfw.wmnet
  • 15:36 jnuche@deploy2002: Installation of scap version "4.71.0" completed for 376 hosts
  • 15:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 32%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58712 and previous config saved to /var/cache/conftool/dbconfig/20240311-153628-arnaudb.json
  • 15:35 jnuche@deploy2002: Installing scap version "4.71.0" for 376 hosts
  • 15:33 Daimona: T357007 Running mwscript CampaignEvents:GenerateInvitationList --wiki=metawiki --listfile=/home/daimona/list.txt
  • 15:21 arnaudb@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 16%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58711 and previous config saved to /var/cache/conftool/dbconfig/20240311-152123-arnaudb.json
  • 15:06 arnaudb@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 8%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58710 and previous config saved to /var/cache/conftool/dbconfig/20240311-150617-arnaudb.json
  • 15:01 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2111.codfw.wmnet onto db2211.codfw.wmnet
  • 15:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2111 in db2211 for T355422', diff saved to https://phabricator.wikimedia.org/P58709 and previous config saved to /var/cache/conftool/dbconfig/20240311-150025-arnaudb.json
  • 14:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: provisionning db2211.codfw.wmnet - T355422
  • 14:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: provisionning db2211.codfw.wmnet - T355422
  • 14:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: provisionning db2211.codfw.wmnet - T355422
  • 14:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: provisionning db2211.codfw.wmnet - T355422
  • 14:57 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2110.codfw.wmnet onto db2210.codfw.wmnet
  • 14:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2110 in db2210 for T355422', diff saved to https://phabricator.wikimedia.org/P58708 and previous config saved to /var/cache/conftool/dbconfig/20240311-145604-arnaudb.json
  • 14:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: provisionning db2210.codfw.wmnet - T355422
  • 14:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: provisionning db2210.codfw.wmnet - T355422
  • 14:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: provisionning db2210.codfw.wmnet - T355422
  • 14:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: provisionning db2210.codfw.wmnet - T355422
  • 14:52 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2109.codfw.wmnet onto db2209.codfw.wmnet
  • 14:51 arnaudb@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 4%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58707 and previous config saved to /var/cache/conftool/dbconfig/20240311-145111-arnaudb.json
  • 14:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2109 in db2209 for T355422', diff saved to https://phabricator.wikimedia.org/P58706 and previous config saved to /var/cache/conftool/dbconfig/20240311-145102-arnaudb.json
  • 14:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: provisionning db2209.codfw.wmnet - T355422
  • 14:49 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: provisionning db2209.codfw.wmnet - T355422
  • 14:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: provisionning db2209.codfw.wmnet - T355422
  • 14:49 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: provisionning db2209.codfw.wmnet - T355422
  • 14:35 hashar@deploy2002: Finished deploy [gerrit/gerrit@2150230]: Gerrit to 3.7.8 on gerrit1003 - T359819 (duration: 00m 10s)
  • 14:35 hashar@deploy2002: Started deploy [gerrit/gerrit@2150230]: Gerrit to 3.7.8 on gerrit1003 - T359819
  • 14:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 2%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58705 and previous config saved to /var/cache/conftool/dbconfig/20240311-143451-arnaudb.json
  • 14:31 hashar@deploy2002: Finished deploy [gerrit/gerrit@2150230]: Gerrit to 3.7.8 on gerrit2002 - T359819 (duration: 00m 07s)
  • 14:31 hashar@deploy2002: Started deploy [gerrit/gerrit@2150230]: Gerrit to 3.7.8 on gerrit2002 - T359819
  • 14:27 hashar@deploy2002: Finished deploy [gerrit/gerrit@737c475]: Gerrit to 3.7.8 on gerrit2002 (duration: 00m 03s)
  • 14:27 hashar@deploy2002: Started deploy [gerrit/gerrit@737c475]: Gerrit to 3.7.8 on gerrit2002
  • 14:20 mvolz@deploy2002: Finished scap: Backport for editcheckreferenceurl: don't error when aborting the lookupPromise (T359601) (duration: 19m 20s)
  • 14:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 1%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58704 and previous config saved to /var/cache/conftool/dbconfig/20240311-141945-arnaudb.json
  • 14:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1179.eqiad.wmnet with OS bookworm
  • 14:10 mvolz@deploy2002: mvolz: Continuing with sync
  • 14:02 mvolz@deploy2002: mvolz: Backport for editcheckreferenceurl: don't error when aborting the lookupPromise (T359601) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:00 mvolz@deploy2002: Started scap: Backport for editcheckreferenceurl: don't error when aborting the lookupPromise (T359601)
  • 13:56 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
  • 13:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
  • 13:50 samtar@deploy2002: Finished scap: Backport for nnwiki: Enable sandbox link (T359788) (duration: 12m 18s)
  • 13:49 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:49 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:42 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1179.eqiad.wmnet with OS bookworm
  • 13:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:41 samtar@deploy2002: jhsoby and samtar: Continuing with sync
  • 13:40 samtar@deploy2002: jhsoby and samtar: Backport for nnwiki: Enable sandbox link (T359788) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1179.eqiad.wmnet with reason: Silence for upgrade
  • 13:40 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1179.eqiad.wmnet with reason: Silence for upgrade
  • 13:38 samtar@deploy2002: Started scap: Backport for nnwiki: Enable sandbox link (T359788)
  • 13:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1179 T359790', diff saved to https://phabricator.wikimedia.org/P58703 and previous config saved to /var/cache/conftool/dbconfig/20240311-133631-arnaudb.json
  • 13:36 Dreamy_Jazz: Running `foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-group2-sleep-30-no-render-now.txt` on a tmux session
  • 13:35 samtar@deploy2002: Finished scap: Backport for [itwiki] Set 'wgBlockAllowsUTEdit' to true (duration: 13m 13s)
  • 13:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db1220 to x1 primary T359790', diff saved to https://phabricator.wikimedia.org/P58702 and previous config saved to /var/cache/conftool/dbconfig/20240311-133405-arnaudb.json
  • 13:32 arnaudb: Starting x1 eqiad failover from db1179 to db1220 - T359790
  • 13:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:27 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:25 samtar@deploy2002: superpes and samtar: Continuing with sync
  • 13:24 samtar@deploy2002: superpes and samtar: Backport for [itwiki] Set 'wgBlockAllowsUTEdit' to true synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db1220 with weight 0 T359790', diff saved to https://phabricator.wikimedia.org/P58701 and previous config saved to /var/cache/conftool/dbconfig/20240311-132259-arnaudb.json
  • 13:22 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 T359790
  • 13:22 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:22 samtar@deploy2002: Started scap: Backport for [itwiki] Set 'wgBlockAllowsUTEdit' to true
  • 13:22 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 16 hosts with reason: Primary switchover x1 T359790
  • 12:59 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 12:59 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 12:59 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 12:57 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 12:57 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 12:56 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 12:52 Dreamy_Jazz: Re-starting MediaModeration scanning script
  • 12:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:26 mutante: DNS - added new project language 'kus' - Kusaal is a Gur language spoken primarily in northeastern Ghana and Burkina Faso. It is spoken by about 121,000 people. T359757
  • 09:59 kostajh: UTC morning deploys done
  • 09:52 kharlan@deploy2002: Finished scap: Backport for Exclude non-functional pages from night mode (T359183) (duration: 16m 42s)
  • 09:41 kharlan@deploy2002: jdlrobson and kharlan: Continuing with sync
  • 09:38 kharlan@deploy2002: jdlrobson and kharlan: Backport for Exclude non-functional pages from night mode (T359183) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:35 kharlan@deploy2002: Started scap: Backport for Exclude non-functional pages from night mode (T359183)
  • 08:57 jnuche@deploy2002: Installation of scap version "4.70.1" completed for 376 hosts
  • 08:56 jnuche@deploy2002: Installing scap version "4.70.1" for 376 hosts
  • 08:55 jnuche@deploy2002: Installation of scap version "4.70.1" completed for 376 hosts
  • 08:55 jnuche@deploy2002: Installing scap version "4.70.1" for 376 hosts
  • 08:29 godog: bounce prometheus@aux-k8s - T343529
  • 08:04 kharlan@deploy2002: Sync cancelled.
  • 07:56 kharlan@deploy2002: kharlan and jdlrobson: Backport for Exclude non-functional pages from night mode (T359183) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:44 kharlan@deploy2002: Started scap: Backport for Exclude non-functional pages from night mode (T359183)
  • 07:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:52 kart_: Updated cxserver to 2024-03-11-035839-production (T350773)
  • 04:48 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 04:47 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 04:47 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 04:46 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 04:39 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 04:38 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 04:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-10

  • 22:28 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:28 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:43 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:43 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:19 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1019.eqiad.wmnet with reason: Decommissioning — T354561
  • 14:19 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1019.eqiad.wmnet with reason: Decommissioning — T354561
  • 14:03 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 9 hosts
  • 14:03 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for 9 hosts
  • 13:51 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=eqiad,name=restbase1042.eqiad.wmnet
  • 13:51 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=eqiad,name=restbase1041.eqiad.wmnet
  • 13:51 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=eqiad,name=restbase1040.eqiad.wmnet
  • 13:51 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=eqiad,name=restbase1039.eqiad.wmnet
  • 13:51 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=eqiad,name=restbase1038.eqiad.wmnet
  • 13:51 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=eqiad,name=restbase1037.eqiad.wmnet
  • 13:51 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=eqiad,name=restbase1036.eqiad.wmnet
  • 13:51 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=eqiad,name=restbase1035.eqiad.wmnet
  • 13:51 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=eqiad,name=restbase1034.eqiad.wmnet
  • 13:50 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase1042.eqiad.wmnet
  • 13:50 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase1041.eqiad.wmnet
  • 13:50 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase1040.eqiad.wmnet
  • 13:50 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase1039.eqiad.wmnet
  • 13:50 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase1038.eqiad.wmnet
  • 13:50 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase1037.eqiad.wmnet
  • 13:49 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase1036.eqiad.wmnet
  • 13:49 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase1035.eqiad.wmnet
  • 13:49 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase1034.eqiad.wmnet
  • 03:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:28 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:27 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:13 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:13 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-09

  • 23:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:07 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:07 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:12 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:11 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:26 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1042.eqiad.wmnet with reason: Bootstrapping — T354560
  • 17:26 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1042.eqiad.wmnet with reason: Bootstrapping — T354560
  • 17:22 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@484d5e8]: Updated target list — T354560 (duration: 00m 37s)
  • 17:21 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@484d5e8]: Updated target list — T354560
  • 17:20 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@910b77d]: Updated target list — T354560 (duration: 00m 34s)
  • 17:20 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@910b77d]: Updated target list — T354560
  • 17:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:27 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-08

  • 23:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:28 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:28 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:23 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:16 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:01 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:46 mutante: planet1003/2003: apt-get remove prometheus-apache-exporter - T359596
  • 20:28 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:28 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:51 taavi: arm cumin_master keyholder key on cumin1002 after ganeti1033 froze and rebooted
  • 18:38 cdanis: ✔ cdanis@ganeti1027.eqiad.wmnet ~ 🕜☕ sudo gnt-node migrate -f ganeti1033.eqiad.wmnet
  • 18:20 cdanis: ❌cdanis@ganeti1027.eqiad.wmnet ~ 🕜☕ sudo gnt-node failover -f ganeti1033.eqiad.wmnet
  • 18:17 cdanis: forcibly rebooting ganeti1033
  • 18:13 cdanis: ✔ cdanis@ganeti1027.eqiad.wmnet ~ 🕐☕ sudo gnt-node migrate -f ganeti1033.eqiad.wmnet
  • 18:04 Dreamy_Jazz: Stopped scan on group 2 wiki (test complete)
  • 17:55 Dreamy_Jazz: Running `foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep 1 --verbose 2>&1 | tee ~/scan-files-in-scan-table-group2-sleep-1-no-render-now.txt` on a tmux session
  • 15:32 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 15:32 fabfur: repooling cp4037 for this weekend, all log-format changes are reverted (T351117)
  • 15:28 fabfur@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp4037.ulsfo.wmnet
  • 15:28 fabfur@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp4037.ulsfo.wmnet
  • 14:33 isaranto@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
  • 14:20 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@910b77d]: Deploying to updated target list — T354560 (duration: 00m 35s)
  • 14:20 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@910b77d]: Deploying to updated target list — T354560
  • 14:17 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@c200e79]: Deploying to updated target list — T354560 (duration: 00m 36s)
  • 14:16 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@c200e79]: Deploying to updated target list — T354560
  • 14:11 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1041.eqiad.wmnet with reason: Bootstrapping — T354560
  • 14:11 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1041.eqiad.wmnet with reason: Bootstrapping — T354560
  • 14:01 arturo: update deb packages on bookworm thirdparty/kubeadm-k8s-1-24 for T359619 (apt1002)
  • 10:30 fabfur@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cp4037.ulsfo.wmnet with reason: T358109
  • 10:30 jnuche@deploy2002: Installation of scap version "4.70.1" completed for 374 hosts
  • 10:30 fabfur@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on cp4037.ulsfo.wmnet with reason: T358109
  • 10:29 jnuche@deploy2002: Installing scap version "4.70.1" for 374 hosts
  • 10:08 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bookworm
  • 09:49 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 09:49 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 09:40 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2001-dev.codfw.wmnet with reason: host reimage
  • 09:39 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@9bf7445] (releasing): (no justification provided) (duration: 00m 40s)
  • 09:38 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@9bf7445] (releasing): (no justification provided)
  • 09:38 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2001-dev.codfw.wmnet with reason: host reimage
  • 09:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db2108 (re)pooling @ 100%: Temporary repool for the weekend', diff saved to https://phabricator.wikimedia.org/P58687 and previous config saved to /var/cache/conftool/dbconfig/20240308-092705-arnaudb.json
  • 09:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2106 (re)pooling @ 100%: Temporary repool for the weekend', diff saved to https://phabricator.wikimedia.org/P58686 and previous config saved to /var/cache/conftool/dbconfig/20240308-092621-arnaudb.json
  • 09:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Temporary repool for the weekend', diff saved to https://phabricator.wikimedia.org/P58685 and previous config saved to /var/cache/conftool/dbconfig/20240308-092546-arnaudb.json
  • 09:17 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bookworm
  • 09:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2108 (re)pooling @ 75%: Temporary repool for the weekend', diff saved to https://phabricator.wikimedia.org/P58684 and previous config saved to /var/cache/conftool/dbconfig/20240308-091159-arnaudb.json
  • 09:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2106 (re)pooling @ 75%: Temporary repool for the weekend', diff saved to https://phabricator.wikimedia.org/P58683 and previous config saved to /var/cache/conftool/dbconfig/20240308-091115-arnaudb.json
  • 09:10 kart_: Updated cxserver to 2024-03-08-084626-production (T359525)
  • 09:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: Temporary repool for the weekend', diff saved to https://phabricator.wikimedia.org/P58682 and previous config saved to /var/cache/conftool/dbconfig/20240308-091041-arnaudb.json
  • 09:09 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 09:09 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 09:08 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 09:07 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 08:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2108 (re)pooling @ 50%: Temporary repool for the weekend', diff saved to https://phabricator.wikimedia.org/P58681 and previous config saved to /var/cache/conftool/dbconfig/20240308-085654-arnaudb.json
  • 08:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2106 (re)pooling @ 50%: Temporary repool for the weekend', diff saved to https://phabricator.wikimedia.org/P58680 and previous config saved to /var/cache/conftool/dbconfig/20240308-085610-arnaudb.json
  • 08:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 50%: Temporary repool for the weekend', diff saved to https://phabricator.wikimedia.org/P58679 and previous config saved to /var/cache/conftool/dbconfig/20240308-085536-arnaudb.json
  • 08:53 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 08:53 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 08:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P58678 and previous config saved to /var/cache/conftool/dbconfig/20240308-084944-root.json
  • 08:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2108 (re)pooling @ 25%: Temporary repool for the weekend', diff saved to https://phabricator.wikimedia.org/P58677 and previous config saved to /var/cache/conftool/dbconfig/20240308-084149-arnaudb.json
  • 08:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2106 (re)pooling @ 25%: Temporary repool for the weekend', diff saved to https://phabricator.wikimedia.org/P58676 and previous config saved to /var/cache/conftool/dbconfig/20240308-084105-arnaudb.json
  • 08:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: Temporary repool for the weekend', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240308-084026-arnaudb.json
  • 08:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P58675 and previous config saved to /var/cache/conftool/dbconfig/20240308-083449-root.json
  • 08:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P58674 and previous config saved to /var/cache/conftool/dbconfig/20240308-083439-root.json
  • 08:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1216.eqiad.wmnet with reason: Silence for upgrade
  • 08:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on db1216.eqiad.wmnet with reason: Silence for upgrade
  • 08:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P58671 and previous config saved to /var/cache/conftool/dbconfig/20240308-080439-root.json
  • 08:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P58670 and previous config saved to /var/cache/conftool/dbconfig/20240308-080429-root.json
  • 07:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P58669 and previous config saved to /var/cache/conftool/dbconfig/20240308-074934-root.json
  • 07:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P58668 and previous config saved to /var/cache/conftool/dbconfig/20240308-074924-root.json
  • 07:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P58667 and previous config saved to /var/cache/conftool/dbconfig/20240308-073429-root.json
  • 07:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 5%: After reimage', diff saved to https://phabricator.wikimedia.org/P58666 and previous config saved to /var/cache/conftool/dbconfig/20240308-073419-root.json
  • 07:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 5%: After reimage', diff saved to https://phabricator.wikimedia.org/P58665 and previous config saved to /var/cache/conftool/dbconfig/20240308-071924-root.json
  • 07:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 1%: After reimage', diff saved to https://phabricator.wikimedia.org/P58664 and previous config saved to /var/cache/conftool/dbconfig/20240308-071913-root.json
  • 07:16 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2151.codfw.wmnet onto db2124.codfw.wmnet
  • 06:28 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db2151.codfw.wmnet onto db2124.codfw.wmnet
  • 06:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2151', diff saved to https://phabricator.wikimedia.org/P58663 and previous config saved to /var/cache/conftool/dbconfig/20240308-062741-root.json
  • 00:55 eileen: civicrm upgraded from 867fc977 to 50278dbc
  • 00:35 eileen: config revision changed from 77ff8877 to 58829954

2024-03-07

  • 23:21 htriedman@deploy2002: Finished deploy [airflow-dags/platform_eng@00efab7]: (no justification provided) (duration: 00m 27s)
  • 23:21 htriedman@deploy2002: Started deploy [airflow-dags/platform_eng@00efab7]: (no justification provided)
  • 22:49 ejegg: donorwiki upgraded from bc49e5a6 to 9b31d4fe
  • 22:47 inflatador: bking@pcc-worker1006 deleted all dirs older than 22 Jan to free up space
  • 22:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P58661 and previous config saved to /var/cache/conftool/dbconfig/20240307-222330-ladsgroup.json
  • 22:17 rzl@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on db2124.codfw.wmnet with reason: index corruption
  • 22:16 rzl@cumin2002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on db2124.codfw.wmnet with reason: index corruption
  • 22:10 rzl@cumin2002: dbctl commit (dc=all): 'Depool db2124', diff saved to https://phabricator.wikimedia.org/P58659 and previous config saved to /var/cache/conftool/dbconfig/20240307-221056-rzl.json
  • 22:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P58658 and previous config saved to /var/cache/conftool/dbconfig/20240307-220824-ladsgroup.json
  • 21:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P58657 and previous config saved to /var/cache/conftool/dbconfig/20240307-215319-ladsgroup.json
  • 21:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P58656 and previous config saved to /var/cache/conftool/dbconfig/20240307-213814-ladsgroup.json
  • 21:19 brennen@deploy2002: Finished scap: Backport for Fixes: Less_Exception_Compiler (T359414 T357740) (duration: 14m 41s)
  • 21:09 brennen@deploy2002: brennen and jdlrobson: Continuing with sync
  • 21:07 brennen@deploy2002: brennen and jdlrobson: Backport for Fixes: Less_Exception_Compiler (T359414 T357740) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:04 brennen@deploy2002: Started scap: Backport for Fixes: Less_Exception_Compiler (T359414 T357740)
  • 20:50 dancy@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@c200e79]: (no justification provided) (duration: 00m 35s)
  • 20:50 dancy@deploy2002: Started deploy [cassandra/logstash-logback-encoder@c200e79]: (no justification provided)
  • 20:49 dancy@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@162f72f]: (no justification provided) (duration: 00m 56s)
  • 20:49 dancy@deploy2002: Started deploy [cassandra/logstash-logback-encoder@162f72f]: (no justification provided)
  • 18:49 btullis: running a wikidata dump manually on snapshot1009 for partitions 25,27
  • 18:22 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 60 days, 0:00:00 on wdqs[1022-1025].eqiad.wmnet with reason: T337013
  • 18:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 60 days, 0:00:00 on wdqs[1022-1025].eqiad.wmnet with reason: T337013
  • 18:19 bearloga@deploy2002: Finished deploy [airflow-dags/analytics_product@15edf4a]: (no justification provided) (duration: 00m 08s)
  • 18:19 bearloga@deploy2002: Started deploy [airflow-dags/analytics_product@15edf4a]: (no justification provided)
  • 17:43 cwhite: set aside WAL for prometheus@k8s in codfw and restart - T354399
  • 17:28 cwhite: set aside WAL for prometheus@k8s in eqiad and restart - T354399
  • 17:25 dancy@deploy2002: Finished scap: testing T358117 (duration: 11m 15s)
  • 17:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P58654 and previous config saved to /var/cache/conftool/dbconfig/20240307-172227-ladsgroup.json
  • 17:14 dancy@deploy2002: Started scap: testing T358117
  • 17:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P58653 and previous config saved to /var/cache/conftool/dbconfig/20240307-170720-ladsgroup.json
  • 16:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P58652 and previous config saved to /var/cache/conftool/dbconfig/20240307-165213-ladsgroup.json
  • 16:48 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:47 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:47 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:47 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:44 dancy@deploy2002: Installation of scap version "4.70.0" completed for 373 hosts
  • 16:43 dancy@deploy2002: Installing scap version "4.70.0" for 373 hosts
  • 16:38 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2006.codfw.wmnet with OS bullseye
  • 16:38 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2005.codfw.wmnet with OS bullseye
  • 16:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P58651 and previous config saved to /var/cache/conftool/dbconfig/20240307-163706-ladsgroup.json
  • 16:29 cdanis: T343529 ✔ cdanis@prometheus2005.codfw.wmnet ~ 🕦☕sudo systemctl restart thanos-sidecar@k8s.service
  • 16:20 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.21 refs T354439
  • 16:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2112.codfw.wmnet with reason: Maintenance
  • 16:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2112.codfw.wmnet with reason: Maintenance
  • 16:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 16:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 16:18 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 16:18 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 16:18 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 16:18 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 16:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T357189)', diff saved to https://phabricator.wikimedia.org/P58650 and previous config saved to /var/cache/conftool/dbconfig/20240307-161720-arnaudb.json
  • 16:06 claime: bouncing prometheus@k8s.service - T343529
  • 16:04 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts etherpad2001.codfw.wmnet
  • 16:04 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:02 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 16:02 mutante: deleting etherpad2001 VM - replaced by etherpad2002 - T357159
  • 16:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P58649 and previous config saved to /var/cache/conftool/dbconfig/20240307-160213-arnaudb.json
  • 15:59 jnuche@deploy2002: Finished scap: Backport for REST: ignore request body on GET requests (T359509) (duration: 11m 06s)
  • 15:58 dzahn@cumin2002: START - Cookbook sre.hosts.decommission for hosts etherpad2001.codfw.wmnet
  • 15:50 jnuche@deploy2002: jnuche: Continuing with sync
  • 15:50 jnuche@deploy2002: jnuche: Backport for REST: ignore request body on GET requests (T359509) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:48 jnuche@deploy2002: Started scap: Backport for REST: ignore request body on GET requests (T359509)
  • 15:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P58648 and previous config saved to /var/cache/conftool/dbconfig/20240307-154706-arnaudb.json
  • 15:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T357189)', diff saved to https://phabricator.wikimedia.org/P58647 and previous config saved to /var/cache/conftool/dbconfig/20240307-153200-arnaudb.json
  • 15:05 ladsgroup@deploy2002: Finished scap: Backport for editcheckreferenceurl: Validate URL returned from Citoid, not input (T359527) (duration: 13m 23s)
  • 15:04 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbprov2006']
  • 15:04 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbprov2005']
  • 14:58 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov2006']
  • 14:58 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov2005']
  • 14:58 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 14:58 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['dbprov2006']
  • 14:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['dbprov2005']
  • 14:57 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov2006']
  • 14:57 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 14:57 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov2005']
  • 14:56 fabfur: repool cp4037 for very short time to process and collect logs from HAProxy/Benthos (T358109)
  • 14:55 ladsgroup@deploy2002: kemayo and ladsgroup: Continuing with sync
  • 14:53 ladsgroup@deploy2002: kemayo and ladsgroup: Backport for editcheckreferenceurl: Validate URL returned from Citoid, not input (T359527) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:51 ladsgroup@deploy2002: Started scap: Backport for editcheckreferenceurl: Validate URL returned from Citoid, not input (T359527)
  • 14:51 urbanecm@deploy2002: Finished scap: Backport for wikimaniawiki: Update logos to 2024 (T358379), knwiki: Add importupload userright to administrator usergroup (T359545) (duration: 12m 14s)
  • 14:46 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:46 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:41 urbanecm@deploy2002: urbanecm and anzx: Continuing with sync
  • 14:40 urbanecm@deploy2002: urbanecm and anzx: Backport for wikimaniawiki: Update logos to 2024 (T358379), knwiki: Add importupload userright to administrator usergroup (T359545) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:38 moritzm: import tomcat9 9.0.43-2~deb11u9+wmf12u1 to apt.wikimedia.org T359333
  • 14:38 urbanecm@deploy2002: Started scap: Backport for wikimaniawiki: Update logos to 2024 (T358379), knwiki: Add importupload userright to administrator usergroup (T359545)
  • 14:35 urbanecm@deploy2002: Finished scap: Backport for itwikivoyage: update wgNamespacesToBeSearchedDefault (T358456), kowikisource: add NamespaceAliases for User and Usertalk namespaces (T358508) (duration: 13m 34s)
  • 14:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T357189)', diff saved to https://phabricator.wikimedia.org/P58644 and previous config saved to /var/cache/conftool/dbconfig/20240307-143336-arnaudb.json
  • 14:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 14:33 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 14:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 14:33 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 14:32 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1040.eqiad.wmnet with reason: Bootstrapping — T354560
  • 14:32 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1040.eqiad.wmnet with reason: Bootstrapping — T354560
  • 14:31 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 14:31 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 14:31 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 14:30 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 14:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2107.codfw.wmnet with reason: Maintenance
  • 14:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2107.codfw.wmnet with reason: Maintenance
  • 14:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 14:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 14:26 btullis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
  • 14:25 btullis@deploy2002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
  • 14:25 btullis@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
  • 14:25 btullis@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
  • 14:25 btullis@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 14:25 urbanecm@deploy2002: urbanecm and anzx: Continuing with sync
  • 14:24 btullis@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 14:24 btullis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 14:24 btullis@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 14:24 btullis@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 14:23 btullis@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 14:23 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:23 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:22 urbanecm@deploy2002: urbanecm and anzx: Backport for itwikivoyage: update wgNamespacesToBeSearchedDefault (T358456), kowikisource: add NamespaceAliases for User and Usertalk namespaces (T358508) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:22 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:20 urbanecm@deploy2002: Started scap: Backport for itwikivoyage: update wgNamespacesToBeSearchedDefault (T358456), kowikisource: add NamespaceAliases for User and Usertalk namespaces (T358508)
  • 14:20 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:20 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:14 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync hiera as instructed by failed reimage cookbook - bking@cumin2002 - T358727"
  • 14:13 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync hiera as instructed by failed reimage cookbook - bking@cumin2002 - T358727"
  • 14:11 jnuche@deploy2002: Finished scap: Backport for REST: allow lower-case method names (duration: 11m 40s)
  • 14:11 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 14:11 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 14:10 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 14:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 14:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 14:06 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 14:01 jnuche@deploy2002: jnuche: Continuing with sync
  • 14:01 btullis@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 14:01 jnuche@deploy2002: jnuche: Backport for REST: allow lower-case method names synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:00 btullis@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 13:59 jnuche@deploy2002: Started scap: Backport for REST: allow lower-case method names
  • 13:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db1220 (re)pooling @ 100%: Reimaging + upgrade done', diff saved to https://phabricator.wikimedia.org/P58643 and previous config saved to /var/cache/conftool/dbconfig/20240307-134226-arnaudb.json
  • 13:36 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 13:36 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 13:36 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 13:36 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 13:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 13:33 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 13:32 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 13:32 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 13:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 13:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 13:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 13:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 13:27 jnuche@deploy2002: Finished deploy [zuul/deploy@efce3ee]: test deployment for new host (duration: 00m 15s)
  • 13:27 jnuche@deploy2002: Started deploy [zuul/deploy@efce3ee]: test deployment for new host
  • 13:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db1220 (re)pooling @ 75%: Reimaging + upgrade done', diff saved to https://phabricator.wikimedia.org/P58642 and previous config saved to /var/cache/conftool/dbconfig/20240307-132721-arnaudb.json
  • 13:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 13:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 13:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P58641 and previous config saved to /var/cache/conftool/dbconfig/20240307-131520-ladsgroup.json
  • 13:15 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 13:15 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 13:15 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 13:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T352010)', diff saved to https://phabricator.wikimedia.org/P58640 and previous config saved to /var/cache/conftool/dbconfig/20240307-131509-ladsgroup.json
  • 13:14 jiji@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 13:14 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 13:13 jiji@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 13:12 jiji@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db1220 (re)pooling @ 50%: Reimaging + upgrade done', diff saved to https://phabricator.wikimedia.org/P58639 and previous config saved to /var/cache/conftool/dbconfig/20240307-131216-arnaudb.json
  • 13:12 jiji@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 13:11 jiji@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:10 jiji@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:01 claime: trafficserver: move 60% of traffic to mw on k8s - T357508
  • 13:01 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 13:01 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 13:01 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 13:00 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 13:00 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 13:00 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 13:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P58638 and previous config saved to /var/cache/conftool/dbconfig/20240307-130002-ladsgroup.json
  • 12:59 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 12:59 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 12:57 arnaudb@cumin1002: dbctl commit (dc=all): 'db1220 (re)pooling @ 40%: Reimaging + upgrade done', diff saved to https://phabricator.wikimedia.org/P58637 and previous config saved to /var/cache/conftool/dbconfig/20240307-125711-arnaudb.json
  • 12:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P58636 and previous config saved to /var/cache/conftool/dbconfig/20240307-124456-ladsgroup.json
  • 12:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db1220 (re)pooling @ 30%: Reimaging + upgrade done', diff saved to https://phabricator.wikimedia.org/P58635 and previous config saved to /var/cache/conftool/dbconfig/20240307-124206-arnaudb.json
  • 12:34 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 12:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T352010)', diff saved to https://phabricator.wikimedia.org/P58634 and previous config saved to /var/cache/conftool/dbconfig/20240307-122949-ladsgroup.json
  • 12:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db1220 (re)pooling @ 25%: Reimaging + upgrade done', diff saved to https://phabricator.wikimedia.org/P58633 and previous config saved to /var/cache/conftool/dbconfig/20240307-122701-arnaudb.json
  • 12:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db1220 (re)pooling @ 20%: Reimaging + upgrade done', diff saved to https://phabricator.wikimedia.org/P58632 and previous config saved to /var/cache/conftool/dbconfig/20240307-121155-arnaudb.json
  • 12:04 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add OVS codfw1dev test prefixes - taavi@cumin1002"
  • 12:00 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add OVS codfw1dev test prefixes - taavi@cumin1002"
  • 11:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db1220 (re)pooling @ 15%: Reimaging + upgrade done', diff saved to https://phabricator.wikimedia.org/P58631 and previous config saved to /var/cache/conftool/dbconfig/20240307-115650-arnaudb.json
  • 11:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2001.codfw.wmnet
  • 11:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1220 (re)pooling @ 10%: Reimaging + upgrade done', diff saved to https://phabricator.wikimedia.org/P58630 and previous config saved to /var/cache/conftool/dbconfig/20240307-114145-arnaudb.json
  • 11:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2001.codfw.wmnet
  • 11:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db1220 (re)pooling @ 5%: Reimaging + upgrade done', diff saved to https://phabricator.wikimedia.org/P58629 and previous config saved to /var/cache/conftool/dbconfig/20240307-112640-arnaudb.json
  • 11:23 jnuche@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.21 refs T354439 (duration: 10m 13s)
  • 11:17 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 11:16 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 11:15 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
  • 11:15 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/zotero: apply
  • 11:13 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 11:13 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 11:13 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.21 refs T354439
  • 11:12 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 11:12 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 11:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db1220 (re)pooling @ 2%: Reimaging + upgrade done', diff saved to https://phabricator.wikimedia.org/P58628 and previous config saved to /var/cache/conftool/dbconfig/20240307-111134-arnaudb.json
  • 11:07 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 11:06 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 11:04 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 11:03 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 11:03 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 11:03 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 10:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db1220 (re)pooling @ 1%: Reimaging + upgrade done', diff saved to https://phabricator.wikimedia.org/P58627 and previous config saved to /var/cache/conftool/dbconfig/20240307-105629-arnaudb.json
  • 10:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1220.eqiad.wmnet with OS bookworm
  • 10:48 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:37 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 10:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 100%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58626 and previous config saved to /var/cache/conftool/dbconfig/20240307-103630-arnaudb.json
  • 10:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
  • 10:26 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
  • 10:21 jnuche: restarting Jenkins CI to update plugins
  • 10:21 volans: updated spicerack on cumin[12]002 to v8.4.1
  • 10:21 arnaudb@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 75%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58625 and previous config saved to /var/cache/conftool/dbconfig/20240307-102125-arnaudb.json
  • 10:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:14 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1220.eqiad.wmnet with OS bookworm
  • 10:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1220.eqiad.wmnet with reason: T358642
  • 10:11 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1220.eqiad.wmnet with reason: T358642
  • 10:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool to upgrade T358642', diff saved to https://phabricator.wikimedia.org/P58624 and previous config saved to /var/cache/conftool/dbconfig/20240307-101004-arnaudb.json
  • 10:06 arnaudb@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 50%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58623 and previous config saved to /var/cache/conftool/dbconfig/20240307-100620-arnaudb.json
  • 10:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T352010)', diff saved to https://phabricator.wikimedia.org/P58622 and previous config saved to /var/cache/conftool/dbconfig/20240307-100611-ladsgroup.json
  • 10:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 10:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 10:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T352010)', diff saved to https://phabricator.wikimedia.org/P58621 and previous config saved to /var/cache/conftool/dbconfig/20240307-100549-ladsgroup.json
  • 09:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:52 root@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 09:52 root@deploy2002: helmfile [eqiad] [canary] DONE helmfile.d/services/mw-jobrunner : sync
  • 09:52 root@deploy2002: helmfile [eqiad] [canary] START helmfile.d/services/mw-jobrunner : sync
  • 09:52 root@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 09:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:51 arnaudb@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 25%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P58620 and previous config saved to /var/cache/conftool/dbconfig/20240307-095108-arnaudb.json
  • 09:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P58619 and previous config saved to /var/cache/conftool/dbconfig/20240307-095043-ladsgroup.json
  • 09:48 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:39 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P58618 and previous config saved to /var/cache/conftool/dbconfig/20240307-093536-ladsgroup.json
  • 09:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T352010)', diff saved to https://phabricator.wikimedia.org/P58616 and previous config saved to /var/cache/conftool/dbconfig/20240307-092029-ladsgroup.json
  • 09:18 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.21 refs T354439
  • 09:11 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 09:05 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 09:04 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 08:58 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 08:27 moritzm: installing nftables bugfix updates from bullseye point release
  • 08:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:06 moritzm: revoke Kerberos host principals for apt1001/apt2001 T331613
  • 06:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T352010)', diff saved to https://phabricator.wikimedia.org/P58615 and previous config saved to /var/cache/conftool/dbconfig/20240307-064112-ladsgroup.json
  • 06:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 06:40 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 06:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T352010)', diff saved to https://phabricator.wikimedia.org/P58614 and previous config saved to /var/cache/conftool/dbconfig/20240307-064050-ladsgroup.json
  • 06:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P58613 and previous config saved to /var/cache/conftool/dbconfig/20240307-062541-ladsgroup.json
  • 06:22 _joe_: updated php-luasandbox everywhere T353414
  • 06:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P58612 and previous config saved to /var/cache/conftool/dbconfig/20240307-061034-ladsgroup.json
  • 05:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T352010)', diff saved to https://phabricator.wikimedia.org/P58611 and previous config saved to /var/cache/conftool/dbconfig/20240307-055528-ladsgroup.json
  • 04:55 kart_: Updated cxserver to 2024-03-05-082211-production (T353136, T353259, T350773)
  • 04:50 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 04:50 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 04:49 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 04:48 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 04:43 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 04:41 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 04:15 cstone: civicrm upgraded from 2dd94b1e to ef6ebc35
  • 04:13 cstone: payments-wiki upgraded from 99d8e9f6 to 6a3ff7e5
  • 03:16 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2169 (T352010)', diff saved to https://phabricator.wikimedia.org/P58610 and previous config saved to /var/cache/conftool/dbconfig/20240307-021652-ladsgroup.json
  • 02:16 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 02:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 02:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T352010)', diff saved to https://phabricator.wikimedia.org/P58609 and previous config saved to /var/cache/conftool/dbconfig/20240307-021631-ladsgroup.json
  • 02:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P58608 and previous config saved to /var/cache/conftool/dbconfig/20240307-020124-ladsgroup.json
  • 01:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:50 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 01:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P58607 and previous config saved to /var/cache/conftool/dbconfig/20240307-014618-ladsgroup.json
  • 01:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:36 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:35 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 01:31 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 01:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T352010)', diff saved to https://phabricator.wikimedia.org/P58606 and previous config saved to /var/cache/conftool/dbconfig/20240307-013111-ladsgroup.json
  • 01:29 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:29 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1025.eqiad.wmnet with reason: host reimage
  • 01:28 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 01:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:24 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:23 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:23 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:07 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1025.mgmt.eqiad.wmnet with reboot policy FORCED
  • 01:04 jclark@cumin1002: START - Cookbook sre.hosts.provision for host wdqs1025.mgmt.eqiad.wmnet with reboot policy FORCED
  • 00:46 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 00:37 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1025.eqiad.wmnet with OS bullseye

2024-03-06

  • 23:16 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 21:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T352010)', diff saved to https://phabricator.wikimedia.org/P58605 and previous config saved to /var/cache/conftool/dbconfig/20240306-213603-ladsgroup.json
  • 21:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 21:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 21:35 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 21:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 21:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T352010)', diff saved to https://phabricator.wikimedia.org/P58604 and previous config saved to /var/cache/conftool/dbconfig/20240306-213525-ladsgroup.json
  • 21:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P58601 and previous config saved to /var/cache/conftool/dbconfig/20240306-212019-ladsgroup.json
  • 21:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P58600 and previous config saved to /var/cache/conftool/dbconfig/20240306-210512-ladsgroup.json
  • 21:04 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1025
  • 21:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1025
  • 21:01 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:56 ejegg: changed wmf_cli logger to point to stderr instead of stdout
  • 20:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T352010)', diff saved to https://phabricator.wikimedia.org/P58599 and previous config saved to /var/cache/conftool/dbconfig/20240306-205006-ladsgroup.json
  • 20:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:20 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 20:19 taavi@deploy2002: Finished scap: Backport for Set wgFlaggedRevsHandleIncludes to FR_INCLUDES_CURRENT on ruwiki (duration: 12m 01s)
  • 20:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:09 taavi@deploy2002: taavi: Continuing with sync
  • 20:08 taavi@deploy2002: taavi: Backport for Set wgFlaggedRevsHandleIncludes to FR_INCLUDES_CURRENT on ruwiki synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:07 taavi@deploy2002: Started scap: Backport for Set wgFlaggedRevsHandleIncludes to FR_INCLUDES_CURRENT on ruwiki
  • 20:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:00 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wdqs1025']
  • 18:59 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wdqs1025']
  • 18:59 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 18:46 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@af71f6e] (releasing): (no justification provided) (duration: 00m 41s)
  • 18:45 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@af71f6e] (releasing): (no justification provided)
  • 18:21 urbanecm@deploy2002: Finished scap: Backport for JS REST: make POST default to empty object (T359216) (duration: 14m 19s)
  • 18:12 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 18:11 urbanecm@deploy2002: urbanecm: Backport for JS REST: make POST default to empty object (T359216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:07 urbanecm@deploy2002: Started scap: Backport for JS REST: make POST default to empty object (T359216)
  • 17:53 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 17:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db2196 (re)pooling @ 100%: Clone source repooling', diff saved to https://phabricator.wikimedia.org/P58598 and previous config saved to /var/cache/conftool/dbconfig/20240306-173954-arnaudb.json
  • 17:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2196 (re)pooling @ 75%: Clone source repooling', diff saved to https://phabricator.wikimedia.org/P58597 and previous config saved to /var/cache/conftool/dbconfig/20240306-172449-arnaudb.json
  • 17:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:18 denisse: failing over from alert2001 to alert1001
  • 17:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:15 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host alert1001.wikimedia.org with OS bookworm
  • 17:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db2196 (re)pooling @ 50%: Clone source repooling', diff saved to https://phabricator.wikimedia.org/P58596 and previous config saved to /var/cache/conftool/dbconfig/20240306-170944-arnaudb.json
  • 17:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:02 claime: restart rsyslog on mw2436
  • 17:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T352010)', diff saved to https://phabricator.wikimedia.org/P58595 and previous config saved to /var/cache/conftool/dbconfig/20240306-170125-ladsgroup.json
  • 17:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 17:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 17:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T352010)', diff saved to https://phabricator.wikimedia.org/P58594 and previous config saved to /var/cache/conftool/dbconfig/20240306-170106-ladsgroup.json
  • 16:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:56 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db2196 (re)pooling @ 25%: Clone source repooling', diff saved to https://phabricator.wikimedia.org/P58593 and previous config saved to /var/cache/conftool/dbconfig/20240306-165439-arnaudb.json
  • 16:52 cgoubert@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1441.eqiad.wmnet|mw1442.eqiad.wmnet|mw1451.eqiad.wmnet|mw1452.eqiad.wmnet|mw1454.eqiad.wmnet|mw1455.eqiad.wmnet),cluster=kubernetes,service=kubesvc
  • 16:52 claime: Pooling and uncordoning mw1441.eqiad.wmnet,mw1442.eqiad.wmnet,mw1451.eqiad.wmnet,mw1452.eqiad.wmnet,mw1454.eqiad.wmnet,mw1455.eqiad.wmnet - T351074
  • 16:52 denisse@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on alert1001.wikimedia.org with reason: host reimage
  • 16:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2196.codfw.wmnet onto db2131.codfw.wmnet
  • 16:49 volans: uploaded spicerack_8.4.1 to apt.wikimedia.org bullseye-wikimedia
  • 16:48 denisse@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on alert1001.wikimedia.org with reason: host reimage
  • 16:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P58592 and previous config saved to /var/cache/conftool/dbconfig/20240306-164559-ladsgroup.json
  • 16:44 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:44 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:36 claime: Running homer 'cr*eqiad*' commit 'T351074'
  • 16:36 denisse@cumin2002: START - Cookbook sre.hosts.reimage for host alert1001.wikimedia.org with OS bookworm
  • 16:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P58591 and previous config saved to /var/cache/conftool/dbconfig/20240306-163053-ladsgroup.json
  • 16:26 denisse: Disable meta-monitoring for alert1001 - T333615
  • 16:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T352010)', diff saved to https://phabricator.wikimedia.org/P58590 and previous config saved to /var/cache/conftool/dbconfig/20240306-161546-ladsgroup.json
  • 16:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:00 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters
  • 16:00 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1451.eqiad.wmnet with OS bullseye
  • 15:59 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0)
  • 15:59 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl
  • 15:57 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1455.eqiad.wmnet with OS bullseye
  • 15:57 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0)
  • 15:55 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:55 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:55 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1442.eqiad.wmnet with OS bullseye
  • 15:54 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance
  • 15:51 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1452.eqiad.wmnet with OS bullseye
  • 15:50 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:50 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1454.eqiad.wmnet with OS bullseye
  • 15:50 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:49 jiji@cumin1002: END (FAIL) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=99)
  • 15:48 root@deploy2002: helmfile [eqiad] [canary] FAIL helmfile.d/services/mw-jobrunner : sync
  • 15:48 root@deploy2002: helmfile [eqiad] [main] FAIL helmfile.d/services/mw-jobrunner : sync
  • 15:48 root@deploy2002: helmfile [eqiad] [canary] START helmfile.d/services/mw-jobrunner : sync
  • 15:48 root@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 15:48 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner
  • 15:48 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0)
  • 15:48 jiji@cumin1002: [DRY-RUN] MediaWiki read-only period ends at: 2024-03-06 15:48:02.718097
  • 15:43 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.02-set-readonly
  • 15:42 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1451.eqiad.wmnet with reason: host reimage
  • 15:39 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1455.eqiad.wmnet with reason: host reimage
  • 15:36 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1442.eqiad.wmnet with reason: host reimage
  • 15:34 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0)
  • 15:34 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
  • 15:34 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1452.eqiad.wmnet with reason: host reimage
  • 15:31 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1454.eqiad.wmnet with reason: host reimage
  • 15:31 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 15:29 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1441.eqiad.wmnet with reason: host reimage
  • 15:28 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0)
  • 15:28 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1455.eqiad.wmnet with reason: host reimage
  • 15:28 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks
  • 15:27 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1452.eqiad.wmnet with reason: host reimage
  • 15:27 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1442.eqiad.wmnet with reason: host reimage
  • 15:27 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1454.eqiad.wmnet with reason: host reimage
  • 15:27 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1451.eqiad.wmnet with reason: host reimage
  • 15:27 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1441.eqiad.wmnet with reason: host reimage
  • 15:25 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2196.codfw.wmnet onto db2131.codfw.wmnet
  • 15:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2131.codfw.wmnet with reason: provisioning db2131.codfw.wmnet - T355422
  • 15:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2131.codfw.wmnet with reason: provisioning db2131.codfw.wmnet - T355422
  • 15:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: provisioning db2131.codfw.wmnet - T355422
  • 15:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: provisioning db2131.codfw.wmnet - T355422
  • 15:23 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki viwiki --current --all --touched-after=20230613000000 --start '["8661638"]' 2>&1 | tee ~/T315510-viwiki-2 # in tmux
  • 15:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool to clone on db2131 T358642', diff saved to https://phabricator.wikimedia.org/P58589 and previous config saved to /var/cache/conftool/dbconfig/20240306-152130-arnaudb.json
  • 15:17 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0)
  • 15:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2131.codfw.wmnet with OS bookworm
  • 15:13 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-codfw
  • 15:13 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1455.eqiad.wmnet with OS bullseye
  • 15:13 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1454.eqiad.wmnet with OS bullseye
  • 15:13 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1452.eqiad.wmnet with OS bullseye
  • 15:13 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1451.eqiad.wmnet with OS bullseye
  • 15:13 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1442.eqiad.wmnet with OS bullseye
  • 15:12 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1441.eqiad.wmnet with OS bullseye
  • 15:12 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
  • 15:11 jiji@cumin1002: END (FAIL) - Cookbook sre.switchdc.mediawiki.00-optional-warmup-caches (exit_code=99)
  • 15:11 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.00-optional-warmup-caches
  • 14:56 jiji@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0)
  • 14:56 jiji@cumin1002: START - Cookbook sre.switchdc.mediawiki.00-disable-puppet
  • 14:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2131.codfw.wmnet with reason: host reimage
  • 14:52 claime: Depooling mw1441.eqiad.wmnet,mw1442.eqiad.wmnet,mw1451.eqiad.wmnet,mw1452.eqiad.wmnet,mw1454.eqiad.wmnet,mw1455.eqiad.wmnet for reimage to kubernetes - T351074
  • 14:51 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2131.codfw.wmnet with reason: host reimage
  • 14:51 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-codfw
  • 14:45 moritzm: installing postgres 13 security updates
  • 14:44 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 14:42 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 14:42 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 14:42 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 14:41 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 14:41 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 14:40 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 14:40 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 14:40 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 14:40 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 14:34 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2131.codfw.wmnet with OS bookworm
  • 14:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2131.codfw.wmnet with reason: Silence for reimaging
  • 14:33 moritzm: installing nftables bugfix updates from bullseye point release
  • 14:33 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2131.codfw.wmnet with reason: Silence for reimaging
  • 14:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool to reimage T358642', diff saved to https://phabricator.wikimedia.org/P58588 and previous config saved to /var/cache/conftool/dbconfig/20240306-143204-arnaudb.json
  • 14:20 akosiaris@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(parse2002.codfw.wmnet|parse2003.codfw.wmnet|parse2004.codfw.wmnet|parse2005.codfw.wmnet|parse2006.codfw.wmnet|parse2007.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 14:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 14:11 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 14:11 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 14:11 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 14:11 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 13:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T357189)', diff saved to https://phabricator.wikimedia.org/P58587 and previous config saved to /var/cache/conftool/dbconfig/20240306-135102-arnaudb.json
  • 13:35 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P58586 and previous config saved to /var/cache/conftool/dbconfig/20240306-133555-arnaudb.json
  • 13:30 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2004.codfw.wmnet with OS bullseye
  • 13:27 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.21 refs T354439
  • 13:27 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2007.codfw.wmnet with OS bullseye
  • 13:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2006.codfw.wmnet with OS bullseye
  • 13:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2003.codfw.wmnet with OS bullseye
  • 13:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P58585 and previous config saved to /var/cache/conftool/dbconfig/20240306-132048-arnaudb.json
  • 13:20 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2005.codfw.wmnet with OS bullseye
  • 13:17 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2002.codfw.wmnet with OS bullseye
  • 13:16 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 13:16 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 13:16 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 13:16 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 13:13 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:13 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:11 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2004.codfw.wmnet with reason: host reimage
  • 13:08 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2007.codfw.wmnet with reason: host reimage
  • 13:06 cgoubert@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2371.codfw.wmnet|mw2372.codfw.wmnet|mw2373.codfw.wmnet|mw2374.codfw.wmnet|mw2375.codfw.wmnet|mw2376.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 13:06 claime: Pooling and uncordoning mw2372.codfw.wmnet mw2373.codfw.wmnet mw2374.codfw.wmnet mw2375.codfw.wmnet mw2376.codfw.wmnet - T351074
  • 13:06 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2006.codfw.wmnet with reason: host reimage
  • 13:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T357189)', diff saved to https://phabricator.wikimedia.org/P58583 and previous config saved to /var/cache/conftool/dbconfig/20240306-130542-arnaudb.json
  • 13:03 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2003.codfw.wmnet with reason: host reimage
  • 13:01 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 13:01 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 13:01 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2005.codfw.wmnet with reason: host reimage
  • 13:01 robh@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:01 robh@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fixing incorrect asset tags - robh@cumin1002"
  • 13:00 robh@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fixing incorrect asset tags - robh@cumin1002"
  • 12:59 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2007.codfw.wmnet with reason: host reimage
  • 12:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:59 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2002.codfw.wmnet with reason: host reimage
  • 12:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:58 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2006.codfw.wmnet with reason: host reimage
  • 12:58 robh@cumin1002: START - Cookbook sre.dns.netbox
  • 12:57 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2005.codfw.wmnet with reason: host reimage
  • 12:57 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2004.codfw.wmnet with reason: host reimage
  • 12:57 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2003.codfw.wmnet with reason: host reimage
  • 12:56 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2002.codfw.wmnet with reason: host reimage
  • 12:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2218 (T357189)', diff saved to https://phabricator.wikimedia.org/P58582 and previous config saved to /var/cache/conftool/dbconfig/20240306-125529-arnaudb.json
  • 12:55 jnuche@deploy2002: Finished scap: Backport for Set two more wikis to read new for pagelinks migration (T351237) (duration: 13m 20s)
  • 12:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 12:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 12:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:54 claime: Running homer 'cr*codfw*' commit 'T351074'
  • 12:52 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2374.codfw.wmnet with OS bullseye
  • 12:49 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2373.codfw.wmnet with OS bullseye
  • 12:46 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 12:46 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2376.codfw.wmnet with OS bullseye
  • 12:46 jnuche@deploy2002: jnuche and ladsgroup: Continuing with sync
  • 12:45 jnuche@deploy2002: jnuche and ladsgroup: Backport for Set two more wikis to read new for pagelinks migration (T351237) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:43 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2007.codfw.wmnet with OS bullseye
  • 12:43 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2371.codfw.wmnet with OS bullseye
  • 12:42 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2006.codfw.wmnet with OS bullseye
  • 12:42 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2005.codfw.wmnet with OS bullseye
  • 12:42 jnuche@deploy2002: Started scap: Backport for Set two more wikis to read new for pagelinks migration (T351237)
  • 12:41 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2004.codfw.wmnet with OS bullseye
  • 12:41 volans@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 12:41 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2372.codfw.wmnet with OS bullseye
  • 12:41 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2003.codfw.wmnet with OS bullseye
  • 12:40 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2002.codfw.wmnet with OS bullseye
  • 12:39 jnuche@deploy2002: Finished scap: Backport for Add missing function argument to titleWithoutPrefix call (T359290) (duration: 11m 10s)
  • 12:39 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2375.codfw.wmnet with OS bullseye
  • 12:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:35 volans@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 12:33 volans@cumin2002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 12:33 volans@cumin2002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 12:33 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2374.codfw.wmnet with reason: host reimage
  • 12:30 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2373.codfw.wmnet with reason: host reimage
  • 12:30 jnuche@deploy2002: jnuche: Continuing with sync
  • 12:30 jnuche@deploy2002: jnuche: Backport for Add missing function argument to titleWithoutPrefix call (T359290) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:28 jnuche@deploy2002: Started scap: Backport for Add missing function argument to titleWithoutPrefix call (T359290)
  • 12:27 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2376.codfw.wmnet with reason: host reimage
  • 12:26 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 12:24 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2371.codfw.wmnet with reason: host reimage
  • 12:22 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2372.codfw.wmnet with reason: host reimage
  • 12:20 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2375.codfw.wmnet with reason: host reimage
  • 12:18 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2372.codfw.wmnet with reason: host reimage
  • 12:18 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2376.codfw.wmnet with reason: host reimage
  • 12:18 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2374.codfw.wmnet with reason: host reimage
  • 12:18 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2373.codfw.wmnet with reason: host reimage
  • 12:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T352010)', diff saved to https://phabricator.wikimedia.org/P58581 and previous config saved to /var/cache/conftool/dbconfig/20240306-121800-ladsgroup.json
  • 12:18 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2371.codfw.wmnet with reason: host reimage
  • 12:17 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2375.codfw.wmnet with reason: host reimage
  • 12:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 12:17 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 12:10 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:02 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2376.codfw.wmnet with OS bullseye
  • 12:02 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2375.codfw.wmnet with OS bullseye
  • 12:02 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2374.codfw.wmnet with OS bullseye
  • 12:02 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2373.codfw.wmnet with OS bullseye
  • 12:02 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2372.codfw.wmnet with OS bullseye
  • 12:02 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2371.codfw.wmnet with OS bullseye
  • 12:01 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:01 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:53 moritzm: restarting Exim on the MXes to pick up new GNU TLS
  • 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling restart_daemons on A:ldap-replicas-eqiad
  • 11:51 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling restart_daemons on A:ldap-replicas-eqiad
  • 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling restart_daemons on A:ldap-replicas-codfw
  • 11:47 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling restart_daemons on A:ldap-replicas-codfw
  • 11:43 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:43 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:42 claime: Depooling mw2371.codfw.wmnet,mw2372.codfw.wmnet,mw2373.codfw.wmnet,mw2374.codfw.wmnet,mw2375.codfw.wmnet,mw2376.codfw.wmnet for reimage to kubernetes - T351074
  • 11:41 cgoubert@cumin2002: conftool action : set/weight=30; selector: cluster=api_appserver,service=canary,dc=codfw
  • 11:41 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: cluster=api_appserver,service=canary,dc=codfw
  • 11:40 claime: pooling new canaries - T351074
  • 11:37 claime: Enabling and running puppet on deployment servers - T351074
  • 11:33 claime: Enabling and running puppet on new canaries mw2283.codfw.wmnet,mw2284.codfw.wmnet - T351074
  • 11:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:31 claime: Disabling puppet on mw2374.codfw.wmnet,mw2376.codfw.wmnet,mw2283.codfw.wmnet,mw2284.codfw.wmnet,mw2371.codfw.wmnet,mw2372.codfw.wmnet,mw2373.codfw.wmnet,mw2375.codfw.wmnet for canary api_appserver move - T351074
  • 11:28 claime: Disabling puppet on deployment servers for canary api_appserver move - T351074
  • 11:21 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 11:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2106.codfw.wmnet onto db2206.codfw.wmnet
  • 11:21 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 11:20 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 11:19 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 11:17 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 11:17 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 11:10 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 11:10 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 11:08 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 11:08 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 11:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2108.codfw.wmnet onto db2208.codfw.wmnet
  • 10:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Host has already been cloned, there were 2 candidate masters', diff saved to https://phabricator.wikimedia.org/P58580 and previous config saved to /var/cache/conftool/dbconfig/20240306-105007-arnaudb.json
  • 10:49 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: sync
  • 10:49 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: sync
  • 10:48 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 10:47 jiji@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 10:46 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 10:46 jiji@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 10:38 akosiaris@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(parse2008.codfw.wmnet|parse2009.codfw.wmnet|parse2010.codfw.wmnet|parse2011.codfw.wmnet|parse2012.codfw.wmnet|parse2013.codfw.wmnet|parse2014.codfw.wmnet|parse2015.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 10:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2105.codfw.wmnet onto db2205.codfw.wmnet
  • 10:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db2204 (re)pooling @ 100%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58579 and previous config saved to /var/cache/conftool/dbconfig/20240306-102404-arnaudb.json
  • 10:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 100%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58578 and previous config saved to /var/cache/conftool/dbconfig/20240306-102402-arnaudb.json
  • 10:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2196 (re)pooling @ 100%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58577 and previous config saved to /var/cache/conftool/dbconfig/20240306-102357-arnaudb.json
  • 10:11 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 10:11 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 10:10 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 10:09 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 10:09 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 10:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db2204 (re)pooling @ 75%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58576 and previous config saved to /var/cache/conftool/dbconfig/20240306-100859-arnaudb.json
  • 10:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 75%: Cloning done', diff not saved (unable to send diff to phaste); previous config saved to /var/cache/conftool/dbconfig/20240306-100853-arnaudb.json
  • 10:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2196 (re)pooling @ 75%: Cloning done', diff not saved (unable to send diff to phaste); previous config saved to /var/cache/conftool/dbconfig/20240306-100847-arnaudb.json
  • 10:08 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 10:08 jnuche@deploy2002: Finished scap: Backport for Rename `--color-link--visited` to `--color-visited` (T356928) (duration: 15m 35s)
  • 09:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2217 (re)pooling @ 100%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58574 and previous config saved to /var/cache/conftool/dbconfig/20240306-095820-arnaudb.json
  • 09:57 jnuche@deploy2002: jnuche and toyofuku: Continuing with sync
  • 09:56 jnuche@deploy2002: jnuche and toyofuku: Backport for Rename `--color-link--visited` to `--color-visited` (T356928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2204 (re)pooling @ 50%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58573 and previous config saved to /var/cache/conftool/dbconfig/20240306-095354-arnaudb.json
  • 09:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 50%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58572 and previous config saved to /var/cache/conftool/dbconfig/20240306-095347-arnaudb.json
  • 09:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2196 (re)pooling @ 50%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58571 and previous config saved to /var/cache/conftool/dbconfig/20240306-095342-arnaudb.json
  • 09:52 jnuche@deploy2002: Started scap: Backport for Rename `--color-link--visited` to `--color-visited` (T356928)
  • 09:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2217 (re)pooling @ 75%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58570 and previous config saved to /var/cache/conftool/dbconfig/20240306-094314-arnaudb.json
  • 09:39 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2011.codfw.wmnet with OS bullseye
  • 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2204 (re)pooling @ 25%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58569 and previous config saved to /var/cache/conftool/dbconfig/20240306-093849-arnaudb.json
  • 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 25%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58568 and previous config saved to /var/cache/conftool/dbconfig/20240306-093842-arnaudb.json
  • 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2196 (re)pooling @ 25%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58567 and previous config saved to /var/cache/conftool/dbconfig/20240306-093837-arnaudb.json
  • 09:35 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2010.codfw.wmnet with OS bullseye
  • 09:32 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2014.codfw.wmnet with OS bullseye
  • 09:29 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2012.codfw.wmnet with OS bullseye
  • 09:28 arnaudb@cumin1002: dbctl commit (dc=all): 'db2217 (re)pooling @ 50%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58566 and previous config saved to /var/cache/conftool/dbconfig/20240306-092809-arnaudb.json
  • 09:27 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2015.codfw.wmnet with OS bullseye
  • 09:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2013.codfw.wmnet with OS bullseye
  • 09:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2204 (re)pooling @ 20%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58565 and previous config saved to /var/cache/conftool/dbconfig/20240306-092343-arnaudb.json
  • 09:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 20%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58564 and previous config saved to /var/cache/conftool/dbconfig/20240306-092337-arnaudb.json
  • 09:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2196 (re)pooling @ 20%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58563 and previous config saved to /var/cache/conftool/dbconfig/20240306-092332-arnaudb.json
  • 09:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2008.codfw.wmnet with OS bullseye
  • 09:20 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2009.codfw.wmnet with OS bullseye
  • 09:20 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2011.codfw.wmnet with reason: host reimage
  • 09:16 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2010.codfw.wmnet with reason: host reimage
  • 09:13 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2014.codfw.wmnet with reason: host reimage
  • 09:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2217 (re)pooling @ 25%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58562 and previous config saved to /var/cache/conftool/dbconfig/20240306-091304-arnaudb.json
  • 09:11 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2012.codfw.wmnet with reason: host reimage
  • 09:08 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2015.codfw.wmnet with reason: host reimage
  • 09:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2204 (re)pooling @ 15%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58561 and previous config saved to /var/cache/conftool/dbconfig/20240306-090839-arnaudb.json
  • 09:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 15%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58560 and previous config saved to /var/cache/conftool/dbconfig/20240306-090833-arnaudb.json
  • 09:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2196 (re)pooling @ 15%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58559 and previous config saved to /var/cache/conftool/dbconfig/20240306-090827-arnaudb.json
  • 09:06 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2108.codfw.wmnet onto db2208.codfw.wmnet
  • 09:06 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2013.codfw.wmnet with reason: host reimage
  • 09:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2108 in db2208 for T355422', diff saved to https://phabricator.wikimedia.org/P58558 and previous config saved to /var/cache/conftool/dbconfig/20240306-090524-arnaudb.json
  • 09:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: provisioning db2208.codfw.wmnet - T355422
  • 09:04 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: provisioning db2208.codfw.wmnet - T355422
  • 09:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: provisioning db2208.codfw.wmnet - T355422
  • 09:04 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2008.codfw.wmnet with reason: host reimage
  • 09:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: provisioning db2208.codfw.wmnet - T355422
  • 09:03 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2015.codfw.wmnet with reason: host reimage
  • 09:02 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2014.codfw.wmnet with reason: host reimage
  • 09:01 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2009.codfw.wmnet with reason: host reimage
  • 09:01 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2013.codfw.wmnet with reason: host reimage
  • 09:00 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2012.codfw.wmnet with reason: host reimage
  • 09:00 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2106.codfw.wmnet onto db2206.codfw.wmnet
  • 09:00 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2011.codfw.wmnet with reason: host reimage
  • 08:59 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2010.codfw.wmnet with reason: host reimage
  • 08:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2106 in db2206 for T355422', diff saved to https://phabricator.wikimedia.org/P58557 and previous config saved to /var/cache/conftool/dbconfig/20240306-085924-arnaudb.json
  • 08:59 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2008.codfw.wmnet with reason: host reimage
  • 08:58 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2009.codfw.wmnet with reason: host reimage
  • 08:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: provisioning db2206.codfw.wmnet - T355422
  • 08:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: provisioning db2206.codfw.wmnet - T355422
  • 08:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: provisioning db2206.codfw.wmnet - T355422
  • 08:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2217 (re)pooling @ 20%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58556 and previous config saved to /var/cache/conftool/dbconfig/20240306-085759-arnaudb.json
  • 08:57 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: provisioning db2206.codfw.wmnet - T355422
  • 08:56 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:56 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:54 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2105.codfw.wmnet onto db2205.codfw.wmnet
  • 08:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2204 (re)pooling @ 10%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58555 and previous config saved to /var/cache/conftool/dbconfig/20240306-085334-arnaudb.json
  • 08:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 10%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58554 and previous config saved to /var/cache/conftool/dbconfig/20240306-085327-arnaudb.json
  • 08:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2196 (re)pooling @ 10%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58553 and previous config saved to /var/cache/conftool/dbconfig/20240306-085322-arnaudb.json
  • 08:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2105 in db2205 for T355422', diff saved to https://phabricator.wikimedia.org/P58552 and previous config saved to /var/cache/conftool/dbconfig/20240306-085318-arnaudb.json
  • 08:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2205.codfw.wmnet with reason: provisioning db2205.codfw.wmnet - T355422
  • 08:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2205.codfw.wmnet with reason: provisioning db2205.codfw.wmnet - T355422
  • 08:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: provisioning db2205.codfw.wmnet - T355422
  • 08:51 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: provisioning db2205.codfw.wmnet - T355422
  • 08:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2105.codfw.wmnet with reason: Silence for cloning
  • 08:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:51 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2105.codfw.wmnet with reason: Silence for cloning
  • 08:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2205.codfw.wmnet with reason: Silence for cloning
  • 08:50 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2205.codfw.wmnet with reason: Silence for cloning
  • 08:47 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2015.codfw.wmnet with OS bullseye
  • 08:46 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2014.codfw.wmnet with OS bullseye
  • 08:45 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2013.codfw.wmnet with OS bullseye
  • 08:45 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2012.codfw.wmnet with OS bullseye
  • 08:44 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2011.codfw.wmnet with OS bullseye
  • 08:43 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2010.codfw.wmnet with OS bullseye
  • 08:43 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2009.codfw.wmnet with OS bullseye
  • 08:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db2217 (re)pooling @ 15%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58551 and previous config saved to /var/cache/conftool/dbconfig/20240306-084254-arnaudb.json
  • 08:42 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2008.codfw.wmnet with OS bullseye
  • 08:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2204 (re)pooling @ 5%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58550 and previous config saved to /var/cache/conftool/dbconfig/20240306-083829-arnaudb.json
  • 08:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 5%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58549 and previous config saved to /var/cache/conftool/dbconfig/20240306-083822-arnaudb.json
  • 08:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2196 (re)pooling @ 5%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58548 and previous config saved to /var/cache/conftool/dbconfig/20240306-083804-arnaudb.json
  • 08:37 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1014.eqiad.wmnet with OS bullseye
  • 08:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db2217 (re)pooling @ 10%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58547 and previous config saved to /var/cache/conftool/dbconfig/20240306-082749-arnaudb.json
  • 07:59 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58545 and previous config saved to /var/cache/conftool/dbconfig/20240306-075950-root.json
  • 07:58 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1014.eqiad.wmnet with reason: host reimage
  • 07:55 akosiaris@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1356.eqiad.wmnet|mw1357.eqiad.wmnet|parse1002.eqiad.wmnet|parse1003.eqiad.wmnet|parse1004.eqiad.wmnet|parse1005.eqiad.wmnet|parse1006.eqiad.wmnet|parse1007.eqiad.wmnet|parse1008.eqiad.wmnet|parse1009.eqiad.wmnet|parse1010.eqiad.wmnet|parse1011.eqiad.wmnet|parse1012.eqiad.wmnet|parse1013.eqiad.wmnet|parse1014.eqiad.wmnet|parse1015.eqiad.
  • 07:55 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1014.eqiad.wmnet with reason: host reimage
  • 07:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:44 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58544 and previous config saved to /var/cache/conftool/dbconfig/20240306-074445-root.json
  • 07:41 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1014.eqiad.wmnet with OS bullseye
  • 07:29 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58543 and previous config saved to /var/cache/conftool/dbconfig/20240306-072940-root.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P58542 and previous config saved to /var/cache/conftool/dbconfig/20240306-071804-root.json
  • 07:14 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58541 and previous config saved to /var/cache/conftool/dbconfig/20240306-071435-root.json
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P58540 and previous config saved to /var/cache/conftool/dbconfig/20240306-070259-root.json
  • 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58539 and previous config saved to /var/cache/conftool/dbconfig/20240306-065929-root.json
  • 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P58538 and previous config saved to /var/cache/conftool/dbconfig/20240306-064754-root.json
  • 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58537 and previous config saved to /var/cache/conftool/dbconfig/20240306-064424-root.json
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P58536 and previous config saved to /var/cache/conftool/dbconfig/20240306-063249-root.json
  • 06:29 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 1%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58535 and previous config saved to /var/cache/conftool/dbconfig/20240306-062919-root.json
  • 06:22 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1025', diff saved to https://phabricator.wikimedia.org/P58534 and previous config saved to /var/cache/conftool/dbconfig/20240306-062221-root.json
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P58533 and previous config saved to /var/cache/conftool/dbconfig/20240306-061744-root.json
  • 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 5%: After schema change', diff saved to https://phabricator.wikimedia.org/P58532 and previous config saved to /var/cache/conftool/dbconfig/20240306-060239-root.json
  • 05:47 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:47 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:43 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:43 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:28 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:28 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:13 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:13 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:10 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:27 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:05 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:05 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:34 krinkle@deploy2002: Synchronized src/Profiler.php: I101a80a (duration: 10m 48s)
  • 00:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:40 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-05

  • 23:53 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 23:48 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:22 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:22 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:01 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host parse1014.eqiad.wmnet with OS bullseye
  • 22:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:37 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 60 days, 0:00:00 on wdqs[1022-1024].eqiad.wmnet with reason: T337013
  • 22:37 bking@cumin2002: START - Cookbook sre.hosts.downtime for 60 days, 0:00:00 on wdqs[1022-1024].eqiad.wmnet with reason: T337013
  • 22:34 brett@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 22:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 22:27 brett@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 22:26 brett@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ncredir4037.ulsfo.wmnet
  • 22:21 brett@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ncredir4037.ulsfo.wmnet
  • 22:18 brett@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp5025.eqsin.wmnet
  • 22:09 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wdqs1025']
  • 22:07 brett: upload fifo-log-demux 0.6.5+deb11u1 to bullseye-wikimedia
  • 22:03 brett: upload fifo-log-demux 0.6.5+deb12u1 to bookworm-wikimedia
  • 21:49 brett: Remove fifo-log-demux from bookworm-wikimedia (dist version needs revision)
  • 21:47 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wdqs1025']
  • 21:47 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 21:42 urbanecm@deploy2002: Finished scap: Backport for Set background/color to inherit for common templates in dark mode (T358164), HandleSectionLinks: Fix handling headings with raw `>` in attributes (T358810) (duration: 15m 50s)
  • 21:41 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1014.eqiad.wmnet with OS bullseye
  • 21:32 urbanecm@deploy2002: matmarex and jdlrobson and urbanecm: Continuing with sync
  • 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 21:28 urbanecm@deploy2002: matmarex and jdlrobson and urbanecm: Backport for Set background/color to inherit for common templates in dark mode (T358164), HandleSectionLinks: Fix handling headings with raw `>` in attributes (T358810) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:27 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wdqs1025']
  • 21:26 brett@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ncredir4001.ulsfo.wmnet
  • 21:26 urbanecm@deploy2002: Started scap: Backport for Set background/color to inherit for common templates in dark mode (T358164), HandleSectionLinks: Fix handling headings with raw `>` in attributes (T358810)
  • 21:25 brett@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ncredir5001.eqsin.wmnet
  • 21:22 urbanecm@deploy2002: Finished scap: Backport for Move account vanishing contact form to Meta wiki. (T343536), Stop sharing vector and vector-2022 scripts on wikis where no users are impacted (T331679) (duration: 14m 46s)
  • 21:20 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['elastic2107']
  • 21:20 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2107']
  • 21:20 brett@puppetmaster1001: conftool action : set/pooled=no; selector: name=ncredir5001.eqsin.wmnet
  • 21:17 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wdqs1025']
  • 21:17 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 21:12 urbanecm@deploy2002: jdlrobson and urbanecm and dbrant: Continuing with sync
  • 21:10 urbanecm@deploy2002: jdlrobson and urbanecm and dbrant: Backport for Move account vanishing contact form to Meta wiki. (T343536), Stop sharing vector and vector-2022 scripts on wikis where no users are impacted (T331679) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:07 urbanecm@deploy2002: Started scap: Backport for Move account vanishing contact form to Meta wiki. (T343536), Stop sharing vector and vector-2022 scripts on wikis where no users are impacted (T331679)
  • 21:03 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 21:02 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 20:54 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) es2035.codfw.wmnet es2036.codfw.wmnet es2037.codfw.wmnet es2038.codfw.wmnet es2039.codfw.wmnet es2040.codfw.wmnet on all recursors
  • 20:54 volans@cumin1002: START - Cookbook sre.dns.wipe-cache es2035.codfw.wmnet es2036.codfw.wmnet es2037.codfw.wmnet es2038.codfw.wmnet es2039.codfw.wmnet es2040.codfw.wmnet on all recursors
  • 20:53 volans@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:53 volans@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Deleted AAAA records from new DBs - volans@cumin1002"
  • 20:52 brett: upload fifo-log-demux 0.6.5 to bookworm-wikimedia
  • 20:52 volans@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Deleted AAAA records from new DBs - volans@cumin1002"
  • 20:50 volans@cumin1002: START - Cookbook sre.dns.netbox
  • 20:46 brett: Disable puppet on A:cp and A:ncredir - T355905
  • 20:46 brett: Start rolling out updated fifo-log-demux and configuration to A:cp and A:ncredir - T355905
  • 20:08 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 20:07 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 20:07 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 20:06 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 20:05 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 20:04 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 20:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:58 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: flink-zk reboots T356239
  • 19:58 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: flink-zk reboots T356239
  • 19:57 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on 6 hosts with reason: flink-zk reboots
  • 19:57 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: flink-zk reboots
  • 19:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:47 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 19:46 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['wdqs1025']
  • 19:46 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wdqs1025']
  • 19:44 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1025.eqiad.wmnet with OS bullseye
  • 19:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:56 Daimona: T357007 Running mwscript CampaignEvents:GenerateInvitationList --wiki=metawiki --listfile=/home/daimona/list.txt
  • 18:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2104.codfw.wmnet onto db2204.codfw.wmnet
  • 18:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:46 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:40 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:37 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2103.codfw.wmnet onto db2203.codfw.wmnet
  • 18:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:30 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2011.codfw.wmnet with OS bullseye
  • 18:28 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1025.eqiad.wmnet with OS bullseye
  • 18:28 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1025
  • 18:28 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1025
  • 18:27 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1017.eqiad.wmnet with OS bullseye
  • 18:26 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:25 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 18:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1018.eqiad.wmnet with OS bullseye
  • 18:22 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1019.eqiad.wmnet with OS bullseye
  • 18:19 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1016.eqiad.wmnet with OS bullseye
  • 18:17 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1015.eqiad.wmnet with OS bullseye
  • 18:13 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: host reimage
  • 18:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2096 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58528 and previous config saved to /var/cache/conftool/dbconfig/20240305-181349-arnaudb.json
  • 18:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:11 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:10 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2011.codfw.wmnet with reason: host reimage
  • 18:09 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1017.eqiad.wmnet with reason: host reimage
  • 18:06 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1018.eqiad.wmnet with reason: host reimage
  • 18:06 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host parse1014.eqiad.wmnet with OS bullseye
  • 18:04 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1019.eqiad.wmnet with reason: host reimage
  • 18:02 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1016.eqiad.wmnet with reason: host reimage
  • 18:00 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1019.eqiad.wmnet with reason: host reimage
  • 18:00 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1015.eqiad.wmnet with reason: host reimage
  • 17:59 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1018.eqiad.wmnet with reason: host reimage
  • 17:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2096 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58527 and previous config saved to /var/cache/conftool/dbconfig/20240305-175844-arnaudb.json
  • 17:58 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1017.eqiad.wmnet with reason: host reimage
  • 17:58 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1016.eqiad.wmnet with reason: host reimage
  • 17:57 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1015.eqiad.wmnet with reason: host reimage
  • 17:47 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1019.eqiad.wmnet with OS bullseye
  • 17:46 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1018.eqiad.wmnet with OS bullseye
  • 17:46 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host lvs2011.codfw.wmnet with OS bullseye
  • 17:46 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1017.eqiad.wmnet with OS bullseye
  • 17:45 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1016.eqiad.wmnet with OS bullseye
  • 17:44 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 17:44 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1015.eqiad.wmnet with OS bullseye
  • 17:44 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 17:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2096 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58526 and previous config saved to /var/cache/conftool/dbconfig/20240305-174339-arnaudb.json
  • 17:41 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 17:41 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 17:40 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) lvs2011.codfw.wmnet on all recursors
  • 17:40 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache lvs2011.codfw.wmnet on all recursors
  • 17:40 inflatador: bking@prometheus1006 reload prometheus service as part of troubleshooting T358029
  • 17:40 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 17:39 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 17:32 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:31 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for lvs2011 - cmooney@cumin1002"
  • 17:31 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for lvs2011 - cmooney@cumin1002"
  • 17:29 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 17:28 arnaudb@cumin1002: dbctl commit (dc=all): 'db2096 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58524 and previous config saved to /var/cache/conftool/dbconfig/20240305-172834-arnaudb.json
  • 17:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1013.eqiad.wmnet with OS bullseye
  • 17:21 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1011.eqiad.wmnet with OS bullseye
  • 17:19 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1012.eqiad.wmnet with OS bullseye
  • 17:17 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1010.eqiad.wmnet with OS bullseye
  • 17:16 brennen@deploy2002: Finished scap: Backport for Partial Revert "Set background/color to inherit for common templates" (T358164) (duration: 23m 42s)
  • 17:11 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1006.eqiad.wmnet with OS bullseye
  • 17:11 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2096.codfw.wmnet onto db2196.codfw.wmnet
  • 17:10 topranks: disabling pybal on lvs2011 (traffic will move to lvs2014) in advance of reimage T352920
  • 17:08 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr[1-2]-codfw,lsw1-a2-codfw.mgmt with reason: moving lvs2011 which will disrupt bgp
  • 17:08 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr[1-2]-codfw,lsw1-a2-codfw.mgmt with reason: moving lvs2011 which will disrupt bgp
  • 17:07 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1013.eqiad.wmnet with reason: host reimage
  • 17:07 brennen@deploy2002: jdlrobson and brennen: Continuing with sync
  • 17:07 arnaudb@cumin1002: dbctl commit (dc=all): 'es2030 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58523 and previous config saved to /var/cache/conftool/dbconfig/20240305-170611-arnaudb.json
  • 17:07 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Move lvs2011 from private1-a-codfw to private1-a2-codfw vlan
  • 17:06 arnaudb@cumin1002: dbctl commit (dc=all): 'es2029 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58522 and previous config saved to /var/cache/conftool/dbconfig/20240305-170558-arnaudb.json
  • 17:06 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2011.codfw.wmnet with reason: Move lvs2011 from private1-a-codfw to private1-a2-codfw vlan
  • 17:06 arnaudb@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58521 and previous config saved to /var/cache/conftool/dbconfig/20240305-170540-arnaudb.json
  • 17:05 cmooney@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 2:00:00 on lvs2012.codfw.wmnet with reason: Move lvs2011 from private1-a-codfw to private1-a2-codfw vlan
  • 17:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58520 and previous config saved to /var/cache/conftool/dbconfig/20240305-170527-arnaudb.json
  • 17:05 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: Move lvs2011 from private1-a-codfw to private1-a2-codfw vlan
  • 17:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58519 and previous config saved to /var/cache/conftool/dbconfig/20240305-170511-arnaudb.json
  • 17:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58518 and previous config saved to /var/cache/conftool/dbconfig/20240305-170448-arnaudb.json
  • 17:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2148 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58517 and previous config saved to /var/cache/conftool/dbconfig/20240305-170437-arnaudb.json
  • 17:04 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1011.eqiad.wmnet with reason: host reimage
  • 17:01 cgoubert@cumin1002: conftool action : set/pooled=yes; selector: name=mw243(2|3).*
  • 17:01 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1012.eqiad.wmnet with reason: host reimage
  • 16:59 cgoubert@cumin1002: conftool action : set/pooled=yes; selector: cluster=parsoid
  • 16:59 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1010.eqiad.wmnet with reason: host reimage
  • 16:58 brennen@deploy2002: jdlrobson and brennen: Backport for Partial Revert "Set background/color to inherit for common templates" (T358164) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:58 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1013.eqiad.wmnet with reason: host reimage
  • 16:57 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1012.eqiad.wmnet with reason: host reimage
  • 16:56 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1011.eqiad.wmnet with reason: host reimage
  • 16:56 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1010.eqiad.wmnet with reason: host reimage
  • 16:55 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1006.eqiad.wmnet with reason: host reimage
  • 16:53 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 16:53 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:53 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:52 brennen@deploy2002: Started scap: Backport for Partial Revert "Set background/color to inherit for common templates" (T358164)
  • 16:51 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1006.eqiad.wmnet with reason: host reimage
  • 16:51 arnaudb@cumin1002: dbctl commit (dc=all): 'es2030 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58514 and previous config saved to /var/cache/conftool/dbconfig/20240305-165106-arnaudb.json
  • 16:50 arnaudb@cumin1002: dbctl commit (dc=all): 'es2029 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58513 and previous config saved to /var/cache/conftool/dbconfig/20240305-165053-arnaudb.json
  • 16:50 arnaudb@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58512 and previous config saved to /var/cache/conftool/dbconfig/20240305-165035-arnaudb.json
  • 16:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58511 and previous config saved to /var/cache/conftool/dbconfig/20240305-165022-arnaudb.json
  • 16:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58510 and previous config saved to /var/cache/conftool/dbconfig/20240305-165006-arnaudb.json
  • 16:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58509 and previous config saved to /var/cache/conftool/dbconfig/20240305-164942-arnaudb.json
  • 16:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2148 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58508 and previous config saved to /var/cache/conftool/dbconfig/20240305-164931-arnaudb.json
  • 16:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2035.codfw.wmnet with OS bookworm
  • 16:47 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 16:46 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1014.eqiad.wmnet with OS bullseye
  • 16:45 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1013.eqiad.wmnet with OS bullseye
  • 16:44 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1012.eqiad.wmnet with OS bullseye
  • 16:44 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1011.eqiad.wmnet with OS bullseye
  • 16:43 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1010.eqiad.wmnet with OS bullseye
  • 16:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 16:41 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1005.eqiad.wmnet with OS bullseye
  • 16:39 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1006.eqiad.wmnet with OS bullseye
  • 16:39 denisse: enabling meta-monitoring for the alert* hosts - T333615
  • 16:38 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host parse1006.eqiad.wmnet with OS bullseye
  • 16:36 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1009.eqiad.wmnet with OS bullseye
  • 16:36 arnaudb@cumin1002: dbctl commit (dc=all): 'es2030 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58507 and previous config saved to /var/cache/conftool/dbconfig/20240305-163601-arnaudb.json
  • 16:35 arnaudb@cumin1002: dbctl commit (dc=all): 'es2029 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58506 and previous config saved to /var/cache/conftool/dbconfig/20240305-163548-arnaudb.json
  • 16:35 arnaudb@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58505 and previous config saved to /var/cache/conftool/dbconfig/20240305-163530-arnaudb.json
  • 16:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58504 and previous config saved to /var/cache/conftool/dbconfig/20240305-163516-arnaudb.json
  • 16:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58503 and previous config saved to /var/cache/conftool/dbconfig/20240305-163501-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58502 and previous config saved to /var/cache/conftool/dbconfig/20240305-163437-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2148 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58501 and previous config saved to /var/cache/conftool/dbconfig/20240305-163426-arnaudb.json
  • 16:33 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1008.eqiad.wmnet with OS bullseye
  • 16:31 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1007.eqiad.wmnet with OS bullseye
  • 16:28 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1002.eqiad.wmnet with OS bullseye
  • 16:27 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
  • 16:25 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2104.codfw.wmnet onto db2204.codfw.wmnet
  • 16:25 jgleeson: civicrm updated to cae487db
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2104 in db2204 for T355422', diff saved to https://phabricator.wikimedia.org/P58500 and previous config saved to /var/cache/conftool/dbconfig/20240305-162442-arnaudb.json
  • 16:24 jynus: patching oldimage table for commons T359176
  • 16:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2204.codfw.wmnet with reason: provisioning db2204.codfw.wmnet - T355422
  • 16:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2204.codfw.wmnet with reason: provisioning db2204.codfw.wmnet - T355422
  • 16:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: provisioning db2204.codfw.wmnet - T355422
  • 16:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: provisioning db2204.codfw.wmnet - T355422
  • 16:22 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1005.eqiad.wmnet with reason: host reimage
  • 16:20 arnaudb@cumin1002: dbctl commit (dc=all): 'es2030 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58499 and previous config saved to /var/cache/conftool/dbconfig/20240305-162056-arnaudb.json
  • 16:20 arnaudb@cumin1002: dbctl commit (dc=all): 'es2029 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58498 and previous config saved to /var/cache/conftool/dbconfig/20240305-162043-arnaudb.json
  • 16:20 arnaudb@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58497 and previous config saved to /var/cache/conftool/dbconfig/20240305-162025-arnaudb.json
  • 16:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58496 and previous config saved to /var/cache/conftool/dbconfig/20240305-162011-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58495 and previous config saved to /var/cache/conftool/dbconfig/20240305-161955-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58494 and previous config saved to /var/cache/conftool/dbconfig/20240305-161932-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2148 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58493 and previous config saved to /var/cache/conftool/dbconfig/20240305-161921-arnaudb.json
  • 16:18 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1009.eqiad.wmnet with reason: host reimage
  • 16:17 claime: uncordon kubernetes2035.codfw.wmnet kubernetes2034.codfw.wmnet mw2434.codfw.wmnet mw2435.codfw.wmnet
  • 16:16 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dns2004.wikimedia.org
  • 16:16 sukhe@cumin2002: START - Cookbook sre.hosts.remove-downtime for dns2004.wikimedia.org
  • 16:16 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2103.codfw.wmnet onto db2203.codfw.wmnet
  • 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
  • 16:15 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns2004.wikimedia.org
  • 16:15 claime: Repooling mw2433.codfw.wmnet mw2432.codfw.wmnet parse2008.codfw.wmnet parse2009.codfw.wmnet parse2010.codfw.wmnet
  • 16:15 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1008.eqiad.wmnet with reason: host reimage
  • 16:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2103 in db2203 for T355422', diff saved to https://phabricator.wikimedia.org/P58492 and previous config saved to /var/cache/conftool/dbconfig/20240305-161517-arnaudb.json
  • 16:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2203.codfw.wmnet with reason: provisioning db2203.codfw.wmnet - T355422
  • 16:14 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2203.codfw.wmnet with reason: provisioning db2203.codfw.wmnet - T355422
  • 16:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2103.codfw.wmnet with reason: provisioning db2203.codfw.wmnet - T355422
  • 16:13 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2103.codfw.wmnet with reason: provisioning db2203.codfw.wmnet - T355422
  • 16:13 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1009.eqiad.wmnet with reason: host reimage
  • 16:13 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1007.eqiad.wmnet with reason: host reimage
  • 16:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:11 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:11 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1008.eqiad.wmnet with reason: host reimage
  • 16:10 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1002.eqiad.wmnet with reason: host reimage
  • 16:10 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1007.eqiad.wmnet with reason: host reimage
  • 16:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:08 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1005.eqiad.wmnet with reason: host reimage
  • 16:08 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1002.eqiad.wmnet with reason: host reimage
  • 16:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:04 topranks: commencing migration of servers in codfw rack b8 to lsw1-b8-codfw T355873
  • 16:02 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:02 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:00 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1009.eqiad.wmnet with OS bullseye
  • 15:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:58 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:58 _joe_: depooled mw2434-5, T355873
  • 15:58 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1008.eqiad.wmnet with OS bullseye
  • 15:57 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1007.eqiad.wmnet with OS bullseye
  • 15:56 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1006.eqiad.wmnet with OS bullseye
  • 15:56 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 25 hosts with reason: Migrating servers in codfw rack B8 to lsw1-b8-codfw
  • 15:56 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1005.eqiad.wmnet with OS bullseye
  • 15:56 _joe_: depooled parse2008-10 T355873
  • 15:55 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 25 hosts with reason: Migrating servers in codfw rack B8 to lsw1-b8-codfw
  • 15:55 godog: bounce ircecho on alert2001 one last time
  • 15:55 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1002.eqiad.wmnet with OS bullseye
  • 15:54 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:54 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b8-codfw.mgmt with reason: prepping for server uplink migration codfw rack b8
  • 15:54 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:54 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b8-codfw.mgmt with reason: prepping for server uplink migration codfw rack b8
  • 15:54 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es2035.codfw.wmnet with OS bookworm
  • 15:54 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on cr[1-2]-codfw,lsw1-b8-codfw.mgmt asw-b-codfw with reason: prepping for server uplink migration codfw rack b8
  • 15:54 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on cr[1-2]-codfw,lsw1-b8-codfw.mgmt asw-b-codfw with reason: prepping for server uplink migration codfw rack b8
  • 15:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db[2219-2220].codfw.wmnet with reason: Silence for cloning
  • 15:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db[2219-2220].codfw.wmnet with reason: Silence for cloning
  • 15:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 20 hosts with reason: Silence for cloning
  • 15:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 20 hosts with reason: Silence for cloning
  • 15:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:50 _joe_: draining mw2435 T355873
  • 15:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 23:00:00 on db2196.codfw.wmnet with reason: Silence for cloning
  • 15:49 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 23:00:00 on db2196.codfw.wmnet with reason: Silence for cloning
  • 15:48 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 23:00:00 on db2096.codfw.wmnet with reason: Silence for cloning
  • 15:48 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 23:00:00 on db2096.codfw.wmnet with reason: Silence for cloning
  • 15:48 _joe_: draining mw2434 T355873
  • 15:48 godog: bounce ircecho on alert2001
  • 15:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 100%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58490 and previous config saved to /var/cache/conftool/dbconfig/20240305-154718-arnaudb.json
  • 15:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:46 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 15:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:46 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:45 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 15:45 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1004.eqiad.wmnet with OS bullseye
  • 15:45 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 15:44 arnaudb@cumin1002: dbctl commit (dc=all): 'T355873 - depooling db2148 db2163 db2185 db2164 db2189 es2025 es2029 es2030', diff saved to https://phabricator.wikimedia.org/P58489 and previous config saved to /var/cache/conftool/dbconfig/20240305-154400-arnaudb.json
  • 15:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on 8 hosts with reason: Silence for maintenance T355873
  • 15:43 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1003.eqiad.wmnet with OS bullseye
  • 15:43 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 15:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:40:00 on 8 hosts with reason: Silence for maintenance T355873
  • 15:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2040.codfw.wmnet with OS bookworm
  • 15:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 15:43 _joe_: draining kubernetes2054 T355873
  • 15:41 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1356.eqiad.wmnet with OS bullseye
  • 15:39 _joe_: draining kubernetes2035 T355873
  • 15:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:11 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:09 denisse: disable meta-monitoring for alert1001 - T333615
  • 15:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:07 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1004.eqiad.wmnet with OS bullseye
  • 15:06 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1003.eqiad.wmnet with OS bullseye
  • 15:03 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host mw1356.eqiad.wmnet with OS bullseye
  • 15:02 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host mw1357.eqiad.wmnet with OS bullseye
  • 15:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 25%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58481 and previous config saved to /var/cache/conftool/dbconfig/20240305-150203-arnaudb.json
  • 14:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:56 cgoubert@deploy2002: Finished scap: (no justification provided) (duration: 05m 18s)
  • 14:54 jnuche@deploy2002: Started deploy [zuul/deploy@cadc625]: test deployment for new host
  • 14:52 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1038.eqiad.wmnet with reason: Bootstrapping — T354560
  • 14:52 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1038.eqiad.wmnet with reason: Bootstrapping — T354560
  • 14:51 cgoubert@deploy2002: Started scap: (no justification provided)
  • 14:49 cgoubert@deploy2002: Finished scap: (no justification provided) (duration: 00m 23s)
  • 14:49 cgoubert@deploy2002: Started scap: (no justification provided)
  • 14:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 20%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58480 and previous config saved to /var/cache/conftool/dbconfig/20240305-144658-arnaudb.json
  • 14:44 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.21 refs T354439
  • 14:40 akosiaris: remove all but 1 host from parsoid@eqiad T358752
  • 14:40 akosiaris: remove all but 1 host from parsoid@eqiad
  • 14:38 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:38 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for wqds1025 - cmooney@cumin1002"
  • 14:37 fabfur@cumin2002: conftool action : set/pooled=no; selector: name=dns2004.wikimedia.org
  • 14:37 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for wqds1025 - cmooney@cumin1002"
  • 14:37 fabfur: depooling dns2004 for T355873
  • 14:36 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 14:36 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 14:35 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:35 fabfur@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dns2004.wikimedia.org with reason: T355873
  • 14:35 fabfur@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on dns2004.wikimedia.org with reason: T355873
  • 14:34 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:34 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:34 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:34 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for wqds1025 - cmooney@cumin1002"
  • 14:33 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for wqds1025 - cmooney@cumin1002"
  • 14:32 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 14:32 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 14:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 10%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58477 and previous config saved to /var/cache/conftool/dbconfig/20240305-143154-arnaudb.json
  • 14:31 jnuche@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.21 refs T354439 (duration: 42m 08s)
  • 14:29 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:28 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:28 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:26 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:26 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 5%: Cloning done', diff saved to https://phabricator.wikimedia.org/P58476 and previous config saved to /var/cache/conftool/dbconfig/20240305-141649-arnaudb.json
  • 14:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2151.codfw.wmnet onto db2217.codfw.wmnet
  • 14:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:49 jnuche@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.21 refs T354439
  • 13:35 jiji@deploy2002: Finished scap: (no justification provided) (duration: 22m 47s)
  • 13:33 jynus: running refreshImageMetadata.php on commons for Алфавітно-предметний_покажчик_за_1938_рік_до_Збірника_постанов_і_розпоряджень_Уряду_Української_Радянської_Соціалістичної_Республіки.pdf
  • 13:28 jnuche@deploy2002: Finished deploy [zuul/deploy@bb76c45]: test deployment for new host (duration: 00m 04s)
  • 13:28 jnuche@deploy2002: Started deploy [zuul/deploy@bb76c45]: test deployment for new host
  • 13:24 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2151.codfw.wmnet onto db2217.codfw.wmnet
  • 13:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2151 in db2217 for T355422', diff saved to https://phabricator.wikimedia.org/P58475 and previous config saved to /var/cache/conftool/dbconfig/20240305-132106-arnaudb.json
  • 13:18 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: provisioning db2217.codfw.wmnet - T355422
  • 13:18 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: provisioning db2217.codfw.wmnet - T355422
  • 13:18 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: provisioning db2217.codfw.wmnet - T355422
  • 13:18 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: provisioning db2217.codfw.wmnet - T355422
  • 13:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2151.codfw.wmnet with reason: Silence for cloning
  • 13:17 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2151.codfw.wmnet with reason: Silence for cloning
  • 13:12 jiji@deploy2002: Started scap: (no justification provided)
  • 13:11 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 13:11 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 12:57 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 12:57 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 12:54 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 12:54 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 12:54 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 12:54 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 12:52 eoghan@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts vrts1002.eqiad.wmnet
  • 12:52 eoghan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:52 eoghan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: vrts1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eoghan@cumin1002"
  • 12:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Optimize revision table T354015
  • 12:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Optimize revision table T354015
  • 12:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1186', diff saved to https://phabricator.wikimedia.org/P58474 and previous config saved to /var/cache/conftool/dbconfig/20240305-125152-root.json
  • 12:51 eoghan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: vrts1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eoghan@cumin1002"
  • 12:48 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 12:48 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 12:46 eoghan@cumin1002: START - Cookbook sre.dns.netbox
  • 12:35 eoghan@cumin1002: START - Cookbook sre.hosts.decommission for hosts vrts1002.eqiad.wmnet
  • 12:32 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 12:22 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 12:17 cgoubert@deploy2002: scap failed: KeyError 'canaries' (duration: 10m 29s)
  • 12:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T343718)', diff saved to https://phabricator.wikimedia.org/P58473 and previous config saved to /var/cache/conftool/dbconfig/20240305-121546-ladsgroup.json
  • 12:07 cgoubert@deploy2002: Started scap: (no justification provided)
  • 12:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P58472 and previous config saved to /var/cache/conftool/dbconfig/20240305-120040-ladsgroup.json
  • 11:58 klausman@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 11:57 klausman@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 11:54 klausman@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 11:54 klausman@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 11:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P58471 and previous config saved to /var/cache/conftool/dbconfig/20240305-114533-ladsgroup.json
  • 11:42 klausman@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 11:42 klausman@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 11:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:38 klausman@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 11:38 klausman@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 11:32 klausman@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 11:32 klausman@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 11:30 jnuche@deploy2002: sync-world aborted: testwikis wikis to 1.42.0-wmf.21 refs T354439 (duration: 34m 15s)
  • 11:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T343718)', diff saved to https://phabricator.wikimedia.org/P58469 and previous config saved to /var/cache/conftool/dbconfig/20240305-113027-ladsgroup.json
  • 11:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P58467 and previous config saved to /var/cache/conftool/dbconfig/20240305-111031-root.json
  • 10:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2218 (T343718)', diff saved to https://phabricator.wikimedia.org/P58466 and previous config saved to /var/cache/conftool/dbconfig/20240305-105950-ladsgroup.json
  • 10:59 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 10:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 10:56 jnuche@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.21 refs T354439
  • 10:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P58465 and previous config saved to /var/cache/conftool/dbconfig/20240305-105526-root.json
  • 10:53 jnuche@deploy2002: Pruned MediaWiki: 1.42.0-wmf.18 (duration: 03m 25s)
  • 10:51 akosiaris@cumin1002: conftool action : set/pooled=yes; selector: service=kubesvc,name=parse1.*
  • 10:50 akosiaris@cumin1002: conftool action : set/weight=10; selector: service=kubesvc,name=parse1.*
  • 10:50 akosiaris@cumin1002: conftool action : set/pooled=yes; selector: service=kubesvc,name=parse2.*
  • 10:50 akosiaris@cumin1002: conftool action : set/weight=10; selector: service=kubesvc,name=parse2.*
  • 10:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P58464 and previous config saved to /var/cache/conftool/dbconfig/20240305-104021-root.json
  • 10:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P58463 and previous config saved to /var/cache/conftool/dbconfig/20240305-102516-root.json
  • 10:22 akosiaris: uncordon parse10{20..24}.eqiad.wmnet parse10{10..12}.eqiad.wmnet T358752
  • 10:21 akosiaris: uncordon parse20{16..20}.codfw.wmnet T358752
  • 10:21 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2020.codfw.wmnet with OS bullseye
  • 10:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T357189)', diff saved to https://phabricator.wikimedia.org/P58462 and previous config saved to /var/cache/conftool/dbconfig/20240305-101241-arnaudb.json
  • 10:11 akosiaris: homer commit T358752
  • 10:07 jnuche@deploy2002: Finished deploy [zuul/deploy@bb76c45]: test deployment for new host (duration: 00m 24s)
  • 10:06 jnuche@deploy2002: Started deploy [zuul/deploy@bb76c45]: test deployment for new host
  • 10:04 jnuche@deploy2002: Finished deploy [zuul/deploy@bb76c45]: test deployment for new host (duration: 00m 01s)
  • 10:04 jnuche@deploy2002: Started deploy [zuul/deploy@bb76c45]: test deployment for new host
  • 09:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P58460 and previous config saved to /var/cache/conftool/dbconfig/20240305-094228-arnaudb.json
  • 09:42 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2020.codfw.wmnet with OS bullseye
  • 09:33 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2019.codfw.wmnet with OS bullseye
  • 09:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T357189)', diff saved to https://phabricator.wikimedia.org/P58459 and previous config saved to /var/cache/conftool/dbconfig/20240305-092721-arnaudb.json
  • 09:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T357189)', diff saved to https://phabricator.wikimedia.org/P58458 and previous config saved to /var/cache/conftool/dbconfig/20240305-092202-arnaudb.json
  • 09:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 09:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 09:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T357189)', diff saved to https://phabricator.wikimedia.org/P58457 and previous config saved to /var/cache/conftool/dbconfig/20240305-092140-arnaudb.json
  • 09:16 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts db2117.codfw.wmnet
  • 09:16 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:16 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2117.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 09:15 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2117.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 09:14 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2019.codfw.wmnet with reason: host reimage
  • 09:13 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 09:12 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2017.codfw.wmnet with OS bullseye
  • 09:12 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2117 T359141', diff saved to https://phabricator.wikimedia.org/P58456 and previous config saved to /var/cache/conftool/dbconfig/20240305-091244-marostegui.json
  • 09:11 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2019.codfw.wmnet with reason: host reimage
  • 09:08 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2117.codfw.wmnet
  • 09:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P58455 and previous config saved to /var/cache/conftool/dbconfig/20240305-090634-arnaudb.json
  • 08:56 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2019.codfw.wmnet with OS bullseye
  • 08:54 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2017.codfw.wmnet with reason: host reimage
  • 08:52 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse2018.codfw.wmnet with OS bullseye
  • 08:51 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2017.codfw.wmnet with reason: host reimage
  • 08:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P58454 and previous config saved to /var/cache/conftool/dbconfig/20240305-085128-arnaudb.json
  • 08:47 godog: add new disk to titan2001 /srv - T359068
  • 08:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T357189)', diff saved to https://phabricator.wikimedia.org/P58453 and previous config saved to /var/cache/conftool/dbconfig/20240305-083621-arnaudb.json
  • 08:35 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2017.codfw.wmnet with OS bullseye
  • 08:33 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2018.codfw.wmnet with reason: host reimage
  • 08:30 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2018.codfw.wmnet with reason: host reimage
  • 08:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1235 (T357189)', diff saved to https://phabricator.wikimedia.org/P58452 and previous config saved to /var/cache/conftool/dbconfig/20240305-083028-arnaudb.json
  • 08:30 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 08:30 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 08:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: mariadb::misc::multiinstance
  • 07:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: mariadb::misc::multiinstance
  • 07:52 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2016.codfw.wmnet with reason: host reimage
  • 07:49 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2016.codfw.wmnet with reason: host reimage
  • 07:49 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1024.eqiad.wmnet with OS bullseye
  • 07:48 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:48 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2020.codfw.wmnet
  • 07:33 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse2016.codfw.wmnet with OS bullseye
  • 07:33 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2020.codfw.wmnet
  • 07:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2019.codfw.wmnet
  • 07:31 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1024.eqiad.wmnet with reason: host reimage
  • 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2019.codfw.wmnet
  • 07:27 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1024.eqiad.wmnet with reason: host reimage
  • 07:17 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:17 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:15 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1024.eqiad.wmnet with OS bullseye
  • 06:57 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:57 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:19 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T357189)', diff saved to https://phabricator.wikimedia.org/P58451 and previous config saved to /var/cache/conftool/dbconfig/20240305-061300-arnaudb.json
  • 06:04 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:04 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P58450 and previous config saved to /var/cache/conftool/dbconfig/20240305-055754-arnaudb.json
  • 05:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:45 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:45 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P58449 and previous config saved to /var/cache/conftool/dbconfig/20240305-054247-arnaudb.json
  • 05:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T357189)', diff saved to https://phabricator.wikimedia.org/P58448 and previous config saved to /var/cache/conftool/dbconfig/20240305-052741-arnaudb.json
  • 05:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T357189)', diff saved to https://phabricator.wikimedia.org/P58447 and previous config saved to /var/cache/conftool/dbconfig/20240305-052259-arnaudb.json
  • 05:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 05:22 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 05:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T357189)', diff saved to https://phabricator.wikimedia.org/P58446 and previous config saved to /var/cache/conftool/dbconfig/20240305-052237-arnaudb.json
  • 05:15 kart_: Updated cxserver to 2024-03-04-113412-production (T350773)
  • 05:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P58445 and previous config saved to /var/cache/conftool/dbconfig/20240305-050731-arnaudb.json
  • 05:03 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:02 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:01 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:01 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 04:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P58444 and previous config saved to /var/cache/conftool/dbconfig/20240305-045225-arnaudb.json
  • 04:52 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 04:51 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 04:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T357189)', diff saved to https://phabricator.wikimedia.org/P58443 and previous config saved to /var/cache/conftool/dbconfig/20240305-043718-arnaudb.json
  • 04:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 04:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 04:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T357189)', diff saved to https://phabricator.wikimedia.org/P58442 and previous config saved to /var/cache/conftool/dbconfig/20240305-043155-arnaudb.json
  • 04:31 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 04:31 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 04:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T357189)', diff saved to https://phabricator.wikimedia.org/P58441 and previous config saved to /var/cache/conftool/dbconfig/20240305-043133-arnaudb.json
  • 04:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P58440 and previous config saved to /var/cache/conftool/dbconfig/20240305-041626-arnaudb.json
  • 04:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P58439 and previous config saved to /var/cache/conftool/dbconfig/20240305-040120-arnaudb.json
  • 03:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T357189)', diff saved to https://phabricator.wikimedia.org/P58438 and previous config saved to /var/cache/conftool/dbconfig/20240305-034614-arnaudb.json
  • 03:42 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:42 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T357189)', diff saved to https://phabricator.wikimedia.org/P58437 and previous config saved to /var/cache/conftool/dbconfig/20240305-033755-arnaudb.json
  • 03:37 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 03:37 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 03:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T357189)', diff saved to https://phabricator.wikimedia.org/P58436 and previous config saved to /var/cache/conftool/dbconfig/20240305-033732-arnaudb.json
  • 03:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P58435 and previous config saved to /var/cache/conftool/dbconfig/20240305-032225-arnaudb.json
  • 03:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P58434 and previous config saved to /var/cache/conftool/dbconfig/20240305-030719-arnaudb.json
  • 02:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T357189)', diff saved to https://phabricator.wikimedia.org/P58433 and previous config saved to /var/cache/conftool/dbconfig/20240305-025212-arnaudb.json
  • 02:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2173 (T357189)', diff saved to https://phabricator.wikimedia.org/P58432 and previous config saved to /var/cache/conftool/dbconfig/20240305-024657-arnaudb.json
  • 02:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 02:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 02:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 02:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 02:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T357189)', diff saved to https://phabricator.wikimedia.org/P58431 and previous config saved to /var/cache/conftool/dbconfig/20240305-024608-arnaudb.json
  • 02:33 eileen: civicrm upgraded from 614ac9e8 to 431b53cc
  • 02:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P58430 and previous config saved to /var/cache/conftool/dbconfig/20240305-023102-arnaudb.json
  • 02:26 ejegg: payments-wiki upgraded from 45ebffce to 99d8e9f6
  • 02:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P58429 and previous config saved to /var/cache/conftool/dbconfig/20240305-021556-arnaudb.json
  • 02:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T357189)', diff saved to https://phabricator.wikimedia.org/P58428 and previous config saved to /var/cache/conftool/dbconfig/20240305-020049-arnaudb.json
  • 01:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2170 (T357189)', diff saved to https://phabricator.wikimedia.org/P58427 and previous config saved to /var/cache/conftool/dbconfig/20240305-015550-arnaudb.json
  • 01:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 01:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 01:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T357189)', diff saved to https://phabricator.wikimedia.org/P58426 and previous config saved to /var/cache/conftool/dbconfig/20240305-015527-arnaudb.json
  • 01:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P58425 and previous config saved to /var/cache/conftool/dbconfig/20240305-014020-arnaudb.json
  • 01:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P58424 and previous config saved to /var/cache/conftool/dbconfig/20240305-012514-arnaudb.json
  • 01:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2035.codfw.wmnet with OS bookworm
  • 01:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T357189)', diff saved to https://phabricator.wikimedia.org/P58423 and previous config saved to /var/cache/conftool/dbconfig/20240305-011008-arnaudb.json
  • 01:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T357189)', diff saved to https://phabricator.wikimedia.org/P58422 and previous config saved to /var/cache/conftool/dbconfig/20240305-010459-arnaudb.json
  • 01:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 01:04 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 01:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T357189)', diff saved to https://phabricator.wikimedia.org/P58421 and previous config saved to /var/cache/conftool/dbconfig/20240305-010438-arnaudb.json
  • 00:55 mutante: contint1003 - rebooting
  • 00:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P58420 and previous config saved to /var/cache/conftool/dbconfig/20240305-004931-arnaudb.json
  • 00:48 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 00:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2037.codfw.wmnet with OS bookworm
  • 00:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 00:41 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 00:41 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2038.codfw.wmnet with OS bookworm
  • 00:40 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 00:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2039.codfw.wmnet with OS bookworm
  • 00:38 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 00:38 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 00:34 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 00:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P58419 and previous config saved to /var/cache/conftool/dbconfig/20240305-003425-arnaudb.json
  • 00:34 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2036.codfw.wmnet with OS bookworm
  • 00:34 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 00:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
  • 00:30 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 00:30 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
  • 00:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
  • 00:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2038.codfw.wmnet with reason: host reimage
  • 00:21 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:21 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:20 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2039.codfw.wmnet with reason: host reimage
  • 00:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T357189)', diff saved to https://phabricator.wikimedia.org/P58418 and previous config saved to /var/cache/conftool/dbconfig/20240305-001918-arnaudb.json
  • 00:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:18 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
  • 00:17 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2038.codfw.wmnet with reason: host reimage
  • 00:17 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2039.codfw.wmnet with reason: host reimage
  • 00:15 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
  • 00:14 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:14 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2146 (T357189)', diff saved to https://phabricator.wikimedia.org/P58417 and previous config saved to /var/cache/conftool/dbconfig/20240305-001408-arnaudb.json
  • 00:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 00:13 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 00:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T357189)', diff saved to https://phabricator.wikimedia.org/P58416 and previous config saved to /var/cache/conftool/dbconfig/20240305-001345-arnaudb.json
  • 00:12 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
  • 00:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-03-04

  • 23:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P58415 and previous config saved to /var/cache/conftool/dbconfig/20240304-235839-arnaudb.json
  • 23:51 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es2040.codfw.wmnet with OS bookworm
  • 23:50 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es2039.codfw.wmnet with OS bookworm
  • 23:50 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es2038.codfw.wmnet with OS bookworm
  • 23:50 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS bookworm
  • 23:50 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es2036.codfw.wmnet with OS bookworm
  • 23:50 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host es2035.codfw.wmnet with OS bookworm
  • 23:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2038.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2037.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:44 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2039.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:44 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2036.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P58414 and previous config saved to /var/cache/conftool/dbconfig/20240304-234332-arnaudb.json
  • 23:41 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2040.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 23:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 23:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:37 dancy@deploy2002: Locking from deployment [mediawiki]: Mediawiki deployments locked pending resolution of T359114
  • 23:35 eileen: civicrm upgraded from b1252d09 to 614ac9e8
  • 23:35 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:35 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 dancy@deploy2002: Installing scap version "4.68.0" for 413 hosts
  • 23:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2040.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2039.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2038.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2037.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2036.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:30 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:30 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T357189)', diff saved to https://phabricator.wikimedia.org/P58413 and previous config saved to /var/cache/conftool/dbconfig/20240304-232826-arnaudb.json
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:11 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:03 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:03 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es2035']
  • 23:01 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2035']
  • 23:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:59 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:59 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:48 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:47 maryum: deployed patch for T357760
  • 22:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T357189)', diff saved to https://phabricator.wikimedia.org/P58412 and previous config saved to /var/cache/conftool/dbconfig/20240304-224550-arnaudb.json
  • 22:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 22:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 22:44 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es2035']
  • 22:44 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2035']
  • 22:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es2035']
  • 22:43 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2035']
  • 22:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 22:41 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 22:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T357189)', diff saved to https://phabricator.wikimedia.org/P58411 and previous config saved to /var/cache/conftool/dbconfig/20240304-224145-arnaudb.json
  • 22:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P58410 and previous config saved to /var/cache/conftool/dbconfig/20240304-222639-arnaudb.json
  • 22:19 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2108.codfw.wmnet with OS bullseye
  • 22:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp5025.eqsin.wmnet with reason: T355905
  • 22:19 brett@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on cp5025.eqsin.wmnet with reason: T355905
  • 22:19 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2107.codfw.wmnet with OS bullseye
  • 22:11 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:11 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P58409 and previous config saved to /var/cache/conftool/dbconfig/20240304-221132-arnaudb.json
  • 22:09 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:09 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:00 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:00 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T357189)', diff saved to https://phabricator.wikimedia.org/P58408 and previous config saved to /var/cache/conftool/dbconfig/20240304-215626-arnaudb.json
  • 21:56 cjming: end of UTC late backport window due to deployment errors
  • 21:55 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:55 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:51 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:50 cjming@deploy2002: Finished scap: Backport for InitialiseSettings: Set wgSignatureValidation to disallow [enwiki] (T355462) (duration: 38m 34s)
  • 21:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2130 (T357189)', diff saved to https://phabricator.wikimedia.org/P58407 and previous config saved to /var/cache/conftool/dbconfig/20240304-214757-arnaudb.json
  • 21:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 21:47 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 21:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T357189)', diff saved to https://phabricator.wikimedia.org/P58406 and previous config saved to /var/cache/conftool/dbconfig/20240304-214734-arnaudb.json
  • 21:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P58405 and previous config saved to /var/cache/conftool/dbconfig/20240304-213228-arnaudb.json
  • 21:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:25 cjming@deploy2002: cjming and houseblaster: Continuing with sync
  • 21:24 cjming@deploy2002: cjming and houseblaster: Backport for InitialiseSettings: Set wgSignatureValidation to disallow [enwiki] (T355462) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P58404 and previous config saved to /var/cache/conftool/dbconfig/20240304-211721-arnaudb.json
  • 21:14 inflatador: bking@cumin2002 depool wdqs2007 for T355873
  • 21:12 wfan: civicrm upgraded from 3145a587 to b1252d09
  • 21:12 cjming@deploy2002: Started scap: Backport for InitialiseSettings: Set wgSignatureValidation to disallow [enwiki] (T355462)
  • 21:05 wfan: payments-wiki upgraded from 78bf2b71 to 45ebffce
  • 21:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T357189)', diff saved to https://phabricator.wikimedia.org/P58403 and previous config saved to /var/cache/conftool/dbconfig/20240304-210214-arnaudb.json
  • 20:58 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2108.codfw.wmnet with OS bullseye
  • 20:58 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2107.codfw.wmnet with OS bullseye
  • 20:52 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:52 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:50 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:46 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:46 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:41 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:41 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2105.codfw.wmnet with OS bullseye
  • 20:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2104.codfw.wmnet with OS bullseye
  • 20:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:34 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp5025.eqsin.wmnet with reason: T355905
  • 20:34 brett@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on cp5025.eqsin.wmnet with reason: T355905
  • 20:33 brett@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5025.eqsin.wmnet
  • 20:33 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:31 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:31 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:28 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:18 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2105.codfw.wmnet with reason: host reimage
  • 20:15 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2104.codfw.wmnet with reason: host reimage
  • 20:14 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2105.codfw.wmnet with reason: host reimage
  • 20:12 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2104.codfw.wmnet with reason: host reimage
  • 20:08 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:08 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T357189)', diff saved to https://phabricator.wikimedia.org/P58401 and previous config saved to /var/cache/conftool/dbconfig/20240304-200143-arnaudb.json
  • 20:01 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 20:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 20:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T357189)', diff saved to https://phabricator.wikimedia.org/P58400 and previous config saved to /var/cache/conftool/dbconfig/20240304-200121-arnaudb.json
  • 19:58 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2105.codfw.wmnet with OS bullseye
  • 19:57 htriedman@deploy2002: Finished deploy [airflow-dags/platform_eng@a076d5c]: (no justification provided) (duration: 00m 26s)
  • 19:56 htriedman@deploy2002: Started deploy [airflow-dags/platform_eng@a076d5c]: (no justification provided)
  • 19:56 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2104.codfw.wmnet with OS bullseye
  • 19:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P58399 and previous config saved to /var/cache/conftool/dbconfig/20240304-194614-arnaudb.json
  • 19:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P58398 and previous config saved to /var/cache/conftool/dbconfig/20240304-193108-arnaudb.json
  • 19:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T357189)', diff saved to https://phabricator.wikimedia.org/P58396 and previous config saved to /var/cache/conftool/dbconfig/20240304-191601-arnaudb.json
  • 19:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2103 (T357189)', diff saved to https://phabricator.wikimedia.org/P58395 and previous config saved to /var/cache/conftool/dbconfig/20240304-191028-arnaudb.json
  • 19:10 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 19:10 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 19:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 19:06 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 19:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 19:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 19:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 19:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 19:00 htriedman@deploy2002: Finished deploy [airflow-dags/analytics_product@a076d5c]: (no justification provided) (duration: 00m 09s)
  • 19:00 htriedman@deploy2002: Started deploy [airflow-dags/analytics_product@a076d5c]: (no justification provided)
  • 18:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 18:57 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 18:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T357189)', diff saved to https://phabricator.wikimedia.org/P58394 and previous config saved to /var/cache/conftool/dbconfig/20240304-185740-arnaudb.json
  • 18:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P58393 and previous config saved to /var/cache/conftool/dbconfig/20240304-184234-arnaudb.json
  • 18:40 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host parse1024.eqiad.wmnet with OS bullseye
  • 18:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 18:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 18:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T354015)', diff saved to https://phabricator.wikimedia.org/P58392 and previous config saved to /var/cache/conftool/dbconfig/20240304-183212-marostegui.json
  • 18:29 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es2036']
  • 18:29 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2036']
  • 18:27 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:27 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P58391 and previous config saved to /var/cache/conftool/dbconfig/20240304-182726-arnaudb.json
  • 18:27 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es2035']
  • 18:26 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2035']
  • 18:26 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1024.eqiad.wmnet with OS bullseye
  • 18:26 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1023.eqiad.wmnet with OS bullseye
  • 18:26 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es2035']
  • 18:25 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2035']
  • 18:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P58390 and previous config saved to /var/cache/conftool/dbconfig/20240304-181705-marostegui.json
  • 18:16 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2037.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T357189)', diff saved to https://phabricator.wikimedia.org/P58389 and previous config saved to /var/cache/conftool/dbconfig/20240304-181219-arnaudb.json
  • 18:09 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1012.eqiad.wmnet with OS bullseye
  • 18:08 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1023.eqiad.wmnet with reason: host reimage
  • 18:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T357189)', diff saved to https://phabricator.wikimedia.org/P58388 and previous config saved to /var/cache/conftool/dbconfig/20240304-180717-arnaudb.json
  • 18:07 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 18:07 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 18:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T357189)', diff saved to https://phabricator.wikimedia.org/P58387 and previous config saved to /var/cache/conftool/dbconfig/20240304-180655-arnaudb.json
  • 18:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P58386 and previous config saved to /var/cache/conftool/dbconfig/20240304-180159-marostegui.json
  • 17:59 jforrester@deploy2002: Finished scap: Backport for ZObjectStore::updateZObjectAsSystemUser: Also give wf-staff rights (duration: 38m 44s)
  • 17:52 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1023.eqiad.wmnet with OS bullseye
  • 17:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P58385 and previous config saved to /var/cache/conftool/dbconfig/20240304-175148-arnaudb.json
  • 17:51 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1012.eqiad.wmnet with reason: host reimage
  • 17:49 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1022.eqiad.wmnet with OS bullseye
  • 17:49 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1012.eqiad.wmnet with reason: host reimage
  • 17:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T354015)', diff saved to https://phabricator.wikimedia.org/P58384 and previous config saved to /var/cache/conftool/dbconfig/20240304-174653-marostegui.json
  • 17:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P58383 and previous config saved to /var/cache/conftool/dbconfig/20240304-173642-arnaudb.json
  • 17:36 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1012.eqiad.wmnet with OS bullseye
  • 17:34 jforrester@deploy2002: jforrester: Continuing with sync
  • 17:34 jforrester@deploy2002: jforrester: Backport for ZObjectStore::updateZObjectAsSystemUser: Also give wf-staff rights synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:31 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1022.eqiad.wmnet with reason: host reimage
  • 17:29 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1022.eqiad.wmnet with reason: host reimage
  • 17:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T357189)', diff saved to https://phabricator.wikimedia.org/P58382 and previous config saved to /var/cache/conftool/dbconfig/20240304-172136-arnaudb.json
  • 17:21 jforrester@deploy2002: Started scap: Backport for ZObjectStore::updateZObjectAsSystemUser: Also give wf-staff rights
  • 17:20 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 45m 54s)
  • 17:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58381 and previous config saved to /var/cache/conftool/dbconfig/20240304-171913-arnaudb.json
  • 17:16 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1022.eqiad.wmnet with OS bullseye
  • 17:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T357189)', diff saved to https://phabricator.wikimedia.org/P58380 and previous config saved to /var/cache/conftool/dbconfig/20240304-171543-arnaudb.json
  • 17:15 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 17:15 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 17:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T357189)', diff saved to https://phabricator.wikimedia.org/P58379 and previous config saved to /var/cache/conftool/dbconfig/20240304-171521-arnaudb.json
  • 17:14 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1021.eqiad.wmnet with OS bullseye
  • 17:11 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:11 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:09 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2037.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58378 and previous config saved to /var/cache/conftool/dbconfig/20240304-170408-arnaudb.json
  • 17:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58377 and previous config saved to /var/cache/conftool/dbconfig/20240304-170320-arnaudb.json
  • 17:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P58376 and previous config saved to /var/cache/conftool/dbconfig/20240304-170015-arnaudb.json
  • 16:59 sukhe: sudo cumin -b1 -s120 "A:dns-rec" "run-puppet-agent --enable 'merging CR 1007918'": finish rolling out confd state management: T347054
  • 16:57 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns2004.wikimedia.org,service=authdns-ns1
  • 16:56 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns2004.wikimedia.org,service=authdns-ns1
  • 16:56 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1021.eqiad.wmnet with reason: host reimage
  • 16:53 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org,service=authdns-ns2
  • 16:53 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns4003.wikimedia.org,service=authdns-ns2
  • 16:53 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1021.eqiad.wmnet with reason: host reimage
  • 16:52 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns6001.wikimedia.org,service=authdns-ns2
  • 16:52 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns4003.wikimedia.org,service=authdns-ns2
  • 16:51 akosiaris@cumin1002: conftool action : set/pooled=inactive; selector: service=parsoid-php,dc=codfw,name=parse2020.codfw.wmnet
  • 16:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58375 and previous config saved to /var/cache/conftool/dbconfig/20240304-164903-arnaudb.json
  • 16:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58374 and previous config saved to /var/cache/conftool/dbconfig/20240304-164816-arnaudb.json
  • 16:47 brett@puppetmaster1001: conftool action : set/pooled=no; selector: name=ncredir4001.ulsfo.wmnet
  • 16:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P58373 and previous config saved to /var/cache/conftool/dbconfig/20240304-164508-arnaudb.json
  • 16:40 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1021.eqiad.wmnet with OS bullseye
  • 16:39 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1020.eqiad.wmnet with OS bullseye
  • 16:34 sukhe: running dummy authdns-update
  • 16:34 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns4003.wikimedia.org,service=authdns-update
  • 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58372 and previous config saved to /var/cache/conftool/dbconfig/20240304-163358-arnaudb.json
  • 16:33 sukhe: running dummy authdns-update
  • 16:33 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns4003.wikimedia.org,service=authdns-update
  • 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58371 and previous config saved to /var/cache/conftool/dbconfig/20240304-163311-arnaudb.json
  • 16:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2171.codfw.wmnet
  • 16:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T357189)', diff saved to https://phabricator.wikimedia.org/P58370 and previous config saved to /var/cache/conftool/dbconfig/20240304-163002-arnaudb.json
  • 16:29 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns4003.wikimedia.org,service=ntp
  • 16:29 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns4003.wikimedia.org,service=ntp
  • 16:28 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2171.codfw.wmnet
  • 16:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db2171.codfw.wmnet with reason: Silence for maintenance T356240
  • 16:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db2171.codfw.wmnet with reason: Silence for maintenance T356240
  • 16:27 arnaudb@cumin1002: dbctl commit (dc=all): 'T356240 ', diff saved to https://phabricator.wikimedia.org/P58369 and previous config saved to /var/cache/conftool/dbconfig/20240304-162755-arnaudb.json
  • 16:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1228 (T357189)', diff saved to https://phabricator.wikimedia.org/P58368 and previous config saved to /var/cache/conftool/dbconfig/20240304-162514-arnaudb.json
  • 16:25 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 16:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T357189)', diff saved to https://phabricator.wikimedia.org/P58367 and previous config saved to /var/cache/conftool/dbconfig/20240304-162452-arnaudb.json
  • 16:24 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org,service=authdns-update
  • 16:23 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns6001.wikimedia.org,service=authdns-update
  • 16:21 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org,service=ntp
  • 16:21 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns6001.wikimedia.org,service=ntp
  • 16:20 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1020.eqiad.wmnet with reason: host reimage
  • 16:18 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1020.eqiad.wmnet with reason: host reimage
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58366 and previous config saved to /var/cache/conftool/dbconfig/20240304-161806-arnaudb.json
  • 16:15 akosiaris@cumin1002: conftool action : set/pooled=no; selector: service=parsoid-php,dc=codfw,name=parse2020.codfw.wmnet
  • 16:15 akosiaris@cumin1002: conftool action : set/pooled=no; selector: service=parsoid-php,dc=codfw,name=parse201[6-9].codfw.wmnet
  • 16:12 sukhe: sudo cumin "A:dns-rec" "disable-puppet 'merging CR 1007918'": T347054
  • 16:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P58365 and previous config saved to /var/cache/conftool/dbconfig/20240304-160945-arnaudb.json
  • 16:05 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1020.eqiad.wmnet with OS bullseye
  • 16:02 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1011.eqiad.wmnet with OS bullseye
  • 15:59 akosiaris: depool parse2016-parse2020 from parsoid for re-imaging. T358752
  • 15:58 akosiaris: repool parse200[1-5] in parsoid. There are 2 canaries in that set, I'll leave them for last. T358752.
  • 15:58 akosiaris@cumin1002: conftool action : set/pooled=yes; selector: service=parsoid-php,dc=codfw,name=parse200[1-5].codfw.wmnet
  • 15:57 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2124.codfw.wmnet
  • 15:57 arnaudb@cumin1002: dbctl commit (dc=all): 'T356240 ', diff saved to https://phabricator.wikimedia.org/P58363 and previous config saved to /var/cache/conftool/dbconfig/20240304-155742-arnaudb.json
  • 15:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db2124.codfw.wmnet with reason: Silence for maintenance T356240
  • 15:57 akosiaris: depool parse200[1-5] from parsoid for re-imaging. T358752
  • 15:56 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db2124.codfw.wmnet with reason: Silence for maintenance T356240
  • 15:56 akosiaris@cumin1002: conftool action : set/pooled=no; selector: service=parsoid-php,dc=codfw,name=parse200[1-5].codfw.wmnet
  • 15:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P58362 and previous config saved to /var/cache/conftool/dbconfig/20240304-155439-arnaudb.json
  • 15:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2156.codfw.wmnet onto db2194.codfw.wmnet
  • 15:46 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 15:44 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1011.eqiad.wmnet with reason: host reimage
  • 15:44 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 15:44 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 15:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on db2132.codfw.wmnet with reason: Silence for maintenance
  • 15:43 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 15:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on db2132.codfw.wmnet with reason: Silence for maintenance
  • 15:43 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 15:43 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 15:43 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 15:43 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 15:42 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1011.eqiad.wmnet with reason: host reimage
  • 15:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T357189)', diff saved to https://phabricator.wikimedia.org/P58361 and previous config saved to /var/cache/conftool/dbconfig/20240304-153933-arnaudb.json
  • 15:38 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 15:38 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 15:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T357189)', diff saved to https://phabricator.wikimedia.org/P58360 and previous config saved to /var/cache/conftool/dbconfig/20240304-153425-arnaudb.json
  • 15:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 15:34 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 15:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T357189)', diff saved to https://phabricator.wikimedia.org/P58359 and previous config saved to /var/cache/conftool/dbconfig/20240304-153403-arnaudb.json
  • 15:30 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1037.eqiad.wmnet with reason: Bootstrapping — T354560
  • 15:30 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1037.eqiad.wmnet with reason: Bootstrapping — T354560
  • 15:29 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1011.eqiad.wmnet with OS bullseye
  • 15:23 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 15:23 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 15:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P58358 and previous config saved to /var/cache/conftool/dbconfig/20240304-151856-arnaudb.json
  • 15:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58357 and previous config saved to /var/cache/conftool/dbconfig/20240304-151436-arnaudb.json
  • 15:13 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 15:13 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 15:04 _joe_: installing php-luasandbox update on mediawiki canaries T353414
  • 15:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P58356 and previous config saved to /var/cache/conftool/dbconfig/20240304-150350-arnaudb.json
  • 15:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2117.codfw.wmnet with reason: Silence for maintenance
  • 15:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db2117.codfw.wmnet with reason: Silence for maintenance
  • 14:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58354 and previous config saved to /var/cache/conftool/dbconfig/20240304-145931-arnaudb.json
  • 14:50 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on stat1010.eqiad.wmnet with reason: Moving GPU from stat1005 to stat1010
  • 14:50 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on stat1010.eqiad.wmnet with reason: Moving GPU from stat1005 to stat1010
  • 14:50 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on stat1005.eqiad.wmnet with reason: Moving GPU from stat1005 to stat1010
  • 14:50 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on stat1005.eqiad.wmnet with reason: Moving GPU from stat1005 to stat1010
  • 14:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T357189)', diff saved to https://phabricator.wikimedia.org/P58353 and previous config saved to /var/cache/conftool/dbconfig/20240304-144844-arnaudb.json
  • 14:45 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1010.eqiad.wmnet with OS bullseye
  • 14:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58352 and previous config saved to /var/cache/conftool/dbconfig/20240304-144426-arnaudb.json
  • 14:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T357189)', diff saved to https://phabricator.wikimedia.org/P58351 and previous config saved to /var/cache/conftool/dbconfig/20240304-144344-arnaudb.json
  • 14:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 14:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 14:43 sukhe: sudo cumin -b1 -s 30 "A:lvs and not P{lvs2014*}" "run-puppet-agent --enable 'merging CR 1007879'"
  • 14:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 14:40 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 14:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T357189)', diff saved to https://phabricator.wikimedia.org/P58350 and previous config saved to /var/cache/conftool/dbconfig/20240304-144005-arnaudb.json
  • 14:30 taavi: manually update PCC facts from puppetserver1001 to pick up cloudnet2007/8-dev os upgrade
  • 14:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58349 and previous config saved to /var/cache/conftool/dbconfig/20240304-142921-arnaudb.json
  • 14:28 sukhe: reprepro -C component/pybal include bullseye-wikimedia pybal_1.15.14_amd64.changes
  • 14:27 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1010.eqiad.wmnet with reason: host reimage
  • 14:27 sukhe: disable puppet on A:lvs to merge CR 1007879
  • 14:25 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1010.eqiad.wmnet with reason: host reimage
  • 14:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P58348 and previous config saved to /var/cache/conftool/dbconfig/20240304-142459-arnaudb.json
  • 14:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2158.codfw.wmnet
  • 14:19 ladsgroup@cumin1002: START - Cookbook sre.mysql.clone of db2156.codfw.wmnet onto db2194.codfw.wmnet
  • 14:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'For maint', diff saved to https://phabricator.wikimedia.org/P58347 and previous config saved to /var/cache/conftool/dbconfig/20240304-141913-ladsgroup.json
  • 14:17 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2158.codfw.wmnet
  • 14:17 arnaudb@cumin1002: dbctl commit (dc=all): 'T356240 ', diff saved to https://phabricator.wikimedia.org/P58346 and previous config saved to /var/cache/conftool/dbconfig/20240304-141730-arnaudb.json
  • 14:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db2158.codfw.wmnet with reason: Silence for maintenance T356240
  • 14:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db2158.codfw.wmnet with reason: Silence for maintenance T356240
  • 14:12 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host parse1010.eqiad.wmnet with OS bullseye
  • 14:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P58345 and previous config saved to /var/cache/conftool/dbconfig/20240304-140952-arnaudb.json
  • 13:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T357189)', diff saved to https://phabricator.wikimedia.org/P58344 and previous config saved to /var/cache/conftool/dbconfig/20240304-135446-arnaudb.json
  • 13:51 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1010.eqiad.wmnet with reason: re-image
  • 13:51 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1010.eqiad.wmnet with reason: re-image
  • 13:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T357189)', diff saved to https://phabricator.wikimedia.org/P58343 and previous config saved to /var/cache/conftool/dbconfig/20240304-134922-arnaudb.json
  • 13:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 13:49 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 13:47 claime: Uncordoning mw1351.eqiad.wmnet mw1352.eqiad.wmnet mw1353.eqiad.wmnet mw1354.eqiad.wmnet - T351074
  • 13:47 cgoubert@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1350.eqiad.wmnet|mw1351.eqiad.wmnet|mw1352.eqiad.wmnet|mw1353.eqiad.wmnet|mw1354.eqiad.wmnet),cluster=kubernetes,service=kubesvc
  • 13:41 claime: Running homer 'cr*eqiad*' commit 'T351074'
  • 13:39 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1354.eqiad.wmnet with OS bullseye
  • 13:36 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1353.eqiad.wmnet with OS bullseye
  • 13:33 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1351.eqiad.wmnet with OS bullseye
  • 13:33 jnuche@deploy2002: Finished deploy [zuul/deploy@bb76c45]: (no justification provided) (duration: 04m 33s)
  • 13:31 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1352.eqiad.wmnet with OS bullseye
  • 13:28 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1350.eqiad.wmnet with OS bullseye
  • 13:28 jnuche@deploy2002: Started deploy [zuul/deploy@bb76c45]: (no justification provided)
  • 13:28 akosiaris@cumin1002: conftool action : set/pooled=no; selector: service=parsoid-php,name=parse101[012].eqiad.wmnet,dc=eqiad
  • 13:27 akosiaris@cumin1002: conftool action : set/pooled=no; selector: service=parsoid-php,name=parse101[012],dc=eqiad
  • 13:25 akosiaris: depool parse102.* from parsoid-php in eqiad T358752
  • 13:24 akosiaris@cumin1002: conftool action : set/pooled=no; selector: service=parsoid-php,name=parse102.*,dc=eqiad
  • 13:23 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 13:22 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 13:21 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 13:20 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1354.eqiad.wmnet with reason: host reimage
  • 13:19 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 13:19 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 13:18 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 13:17 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 13:17 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1353.eqiad.wmnet with reason: host reimage
  • 13:17 dcaro@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:17 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 13:17 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 13:16 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 13:15 dcaro@cumin1002: START - Cookbook sre.dns.netbox
  • 13:15 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 13:14 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1351.eqiad.wmnet with reason: host reimage
  • 13:14 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 13:14 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 13:13 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 13:12 moritzm: installing jqueryui security updates
  • 13:12 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 13:12 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1352.eqiad.wmnet with reason: host reimage
  • 13:10 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 13:10 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 13:09 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1350.eqiad.wmnet with reason: host reimage
  • 13:07 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1354.eqiad.wmnet with reason: host reimage
  • 13:07 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1351.eqiad.wmnet with reason: host reimage
  • 13:07 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1353.eqiad.wmnet with reason: host reimage
  • 13:07 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1352.eqiad.wmnet with reason: host reimage
  • 13:06 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1350.eqiad.wmnet with reason: host reimage
  • 12:53 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1354.eqiad.wmnet with OS bullseye
  • 12:53 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1353.eqiad.wmnet with OS bullseye
  • 12:53 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1352.eqiad.wmnet with OS bullseye
  • 12:52 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1351.eqiad.wmnet with OS bullseye
  • 12:52 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1350.eqiad.wmnet with OS bullseye
  • 12:45 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts etherpad1003.eqiad.wmnet
  • 12:45 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:45 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: etherpad1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jelto@cumin1002"
  • 12:45 claime: Depooling mw1350.eqiad.wmnet,mw1351.eqiad.wmnet,mw1352.eqiad.wmnet,mw1353.eqiad.wmnet,mw1354.eqiad.wmnet for move to kubernetes - T351074
  • 12:43 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: etherpad1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jelto@cumin1002"
  • 12:41 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:38 claime: Re-enabling puppet on C:profile::firewall::log::ferm to deploy new ferm_status.py - T354855
  • 12:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset: apply
  • 12:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply
  • 12:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 12:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 12:35 jelto@cumin1002: START - Cookbook sre.hosts.decommission for hosts etherpad1003.eqiad.wmnet
  • 12:33 claime: Enabling puppet on puppetboard2003 to test new ferm_status.py - T354855
  • 12:30 claime: Enabling puppet on mw2322 to test new ferm_status.py - T354855
  • 12:28 claime: Enabling puppet on kubernetes2019 to test new ferm_status.py - T354855
  • 12:22 claime: Disabling puppet on C:profile::firewall::log::ferm to deploy new ferm_status.py - T354855
  • 12:22 claime: Uncordoning mw2314.codfw.wmnet mw2315.codfw.wmnet mw2316.codfw.wmnet mw2320.codfw.wmnet mw2321.codfw.wmnet mw2322.codfw.wmnet - T351074
  • 12:21 cgoubert@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2314.codfw.wmnet|mw2315.codfw.wmnet|mw2316.codfw.wmnet|mw2320.codfw.wmnet|mw2321.codfw.wmnet|mw2322.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 12:14 claime: Running homer 'cr*codfw*' commit 'T351074'
  • 12:13 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 6 hosts
  • 12:13 cgoubert@cumin2002: START - Cookbook sre.hosts.remove-downtime for 6 hosts
  • 12:13 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2321.codfw.wmnet with OS bullseye
  • 12:11 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2315.codfw.wmnet with OS bullseye
  • 12:08 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2314.codfw.wmnet with OS bullseye
  • 12:05 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2322.codfw.wmnet with OS bullseye
  • 12:03 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2316.codfw.wmnet with OS bullseye
  • 12:01 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2320.codfw.wmnet with OS bullseye
  • 11:50 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2315.codfw.wmnet with reason: host reimage
  • 11:49 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons.
  • 11:48 claime: Disregard previous puppet disable message, waiting a bit T354855
  • 11:47 claime: Disabling puppet on C:profile::firewall::log::ferm to deploy 1005978 - T354855
  • 11:47 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2321.codfw.wmnet with reason: host reimage
  • 11:44 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2314.codfw.wmnet with reason: host reimage
  • 11:43 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2007-dev.codfw.wmnet with OS bookworm
  • 11:42 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2008-dev.codfw.wmnet with OS bookworm
  • 11:42 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2322.codfw.wmnet with reason: host reimage
  • 11:41 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 11:40 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 11:40 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 11:40 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 11:40 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 11:39 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 11:39 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 11:39 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2316.codfw.wmnet with reason: host reimage
  • 11:39 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 11:39 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 11:39 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons.
  • 11:38 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 11:37 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2320.codfw.wmnet with reason: host reimage
  • 11:35 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 11:35 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2314.codfw.wmnet with reason: host reimage
  • 11:35 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2316.codfw.wmnet with reason: host reimage
  • 11:35 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2315.codfw.wmnet with reason: host reimage
  • 11:34 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2321.codfw.wmnet with reason: host reimage
  • 11:34 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2322.codfw.wmnet with reason: host reimage
  • 11:34 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 11:34 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2320.codfw.wmnet with reason: host reimage
  • 11:34 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 11:33 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 11:33 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 11:33 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 11:33 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 11:32 Dreamy_Jazz: Re-starting MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 11:31 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 11:30 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 11:30 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 11:25 taavi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 'private.codfw.wikimedia.cloud$' on all recursors
  • 11:25 taavi@cumin1002: START - Cookbook sre.dns.wipe-cache 'private.codfw.wikimedia.cloud$' on all recursors
  • 11:25 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 11:25 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 11:25 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 11:25 taavi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 'private.codfw.wikimedia.cloud$' on codfw recursors
  • 11:24 taavi@cumin1002: START - Cookbook sre.dns.wipe-cache 'private.codfw.wikimedia.cloud$' on codfw recursors
  • 11:24 taavi@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 11:24 taavi@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add cloud-private IPs for nwe cloudnet-devs - taavi@cumin1002"
  • 11:24 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
  • 11:23 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 11:23 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 11:22 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 11:22 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 11:22 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add cloud-private IPs for nwe cloudnet-devs - taavi@cumin1002"
  • 11:21 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 11:21 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 11:21 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 11:20 taavi@cumin1002: START - Cookbook sre.dns.netbox
  • 11:18 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2322.codfw.wmnet with OS bullseye
  • 11:18 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2321.codfw.wmnet with OS bullseye
  • 11:18 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2320.codfw.wmnet with OS bullseye
  • 11:18 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2316.codfw.wmnet with OS bullseye
  • 11:18 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2315.codfw.wmnet with OS bullseye
  • 11:18 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2314.codfw.wmnet with OS bullseye
  • 11:17 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2008-dev.codfw.wmnet with reason: host reimage
  • 11:14 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2007-dev.codfw.wmnet with reason: host reimage
  • 11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1235 (T354015)', diff saved to https://phabricator.wikimedia.org/P58342 and previous config saved to /var/cache/conftool/dbconfig/20240304-111424-marostegui.json
  • 11:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 11:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T354015)', diff saved to https://phabricator.wikimedia.org/P58341 and previous config saved to /var/cache/conftool/dbconfig/20240304-111401-marostegui.json
  • 11:12 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2008-dev.codfw.wmnet with reason: host reimage
  • 11:12 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2007-dev.codfw.wmnet with reason: host reimage
  • 11:08 claime: Depooling mw2314.codfw.wmnet,mw2315.codfw.wmnet,mw2316.codfw.wmnet,mw2320.codfw.wmnet,mw2321.codfw.wmnet,mw2322.codfw.wmnet for move to k8s - T351074
  • 10:59 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
  • 10:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P58340 and previous config saved to /var/cache/conftool/dbconfig/20240304-105855-marostegui.json
  • 10:53 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudnet2008-dev.codfw.wmnet with OS bookworm
  • 10:53 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudnet2007-dev.codfw.wmnet with OS bookworm
  • 10:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on etherpad1003.eqiad.wmnet with reason: Shutdown and decommission old host
  • 10:48 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 10:00:00 on etherpad1003.eqiad.wmnet with reason: Shutdown and decommission old host
  • 10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P58339 and previous config saved to /var/cache/conftool/dbconfig/20240304-104348-marostegui.json
  • 10:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T354015)', diff saved to https://phabricator.wikimedia.org/P58338 and previous config saved to /var/cache/conftool/dbconfig/20240304-102842-marostegui.json
  • 09:30 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 09:30 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 09:28 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 09:27 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 100%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58337 and previous config saved to /var/cache/conftool/dbconfig/20240304-080546-root.json
  • 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 75%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58336 and previous config saved to /var/cache/conftool/dbconfig/20240304-075041-root.json
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 50%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58335 and previous config saved to /var/cache/conftool/dbconfig/20240304-073536-root.json
  • 07:32 moritzm: installing tar security updates
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 25%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58334 and previous config saved to /var/cache/conftool/dbconfig/20240304-072031-root.json
  • 07:08 kart_: Updated cxserver to 2024-03-04-023843-production (T350773)
  • 07:08 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 07:07 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 07:06 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 07:05 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 07:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 10%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58333 and previous config saved to /var/cache/conftool/dbconfig/20240304-070526-root.json
  • 07:01 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 07:01 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 06:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 5%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58332 and previous config saved to /var/cache/conftool/dbconfig/20240304-065021-root.json
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 1%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58331 and previous config saved to /var/cache/conftool/dbconfig/20240304-063516-root.json
  • 06:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1186', diff saved to https://phabricator.wikimedia.org/P58330 and previous config saved to /var/cache/conftool/dbconfig/20240304-062703-root.json
  • 06:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2118.codfw.wmnet
  • 06:21 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 06:21 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2118.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 06:19 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2118.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 06:17 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 06:12 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2118.codfw.wmnet
  • 03:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T354015)', diff saved to https://phabricator.wikimedia.org/P58329 and previous config saved to /var/cache/conftool/dbconfig/20240304-034333-marostegui.json
  • 03:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 03:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 03:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T354015)', diff saved to https://phabricator.wikimedia.org/P58328 and previous config saved to /var/cache/conftool/dbconfig/20240304-034309-marostegui.json
  • 03:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P58327 and previous config saved to /var/cache/conftool/dbconfig/20240304-032803-marostegui.json
  • 03:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P58326 and previous config saved to /var/cache/conftool/dbconfig/20240304-031256-marostegui.json
  • 02:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T354015)', diff saved to https://phabricator.wikimedia.org/P58325 and previous config saved to /var/cache/conftool/dbconfig/20240304-025750-marostegui.json

2024-03-03

  • 20:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T354015)', diff saved to https://phabricator.wikimedia.org/P58324 and previous config saved to /var/cache/conftool/dbconfig/20240303-203302-marostegui.json
  • 20:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 20:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 20:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T354015)', diff saved to https://phabricator.wikimedia.org/P58323 and previous config saved to /var/cache/conftool/dbconfig/20240303-203240-marostegui.json
  • 20:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P58322 and previous config saved to /var/cache/conftool/dbconfig/20240303-201734-marostegui.json
  • 20:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P58321 and previous config saved to /var/cache/conftool/dbconfig/20240303-200228-marostegui.json
  • 19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T354015)', diff saved to https://phabricator.wikimedia.org/P58320 and previous config saved to /var/cache/conftool/dbconfig/20240303-194721-marostegui.json
  • 13:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1228 (T354015)', diff saved to https://phabricator.wikimedia.org/P58319 and previous config saved to /var/cache/conftool/dbconfig/20240303-132536-marostegui.json
  • 13:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 13:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 13:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T354015)', diff saved to https://phabricator.wikimedia.org/P58318 and previous config saved to /var/cache/conftool/dbconfig/20240303-132514-marostegui.json
  • 13:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P58317 and previous config saved to /var/cache/conftool/dbconfig/20240303-131008-marostegui.json
  • 12:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P58316 and previous config saved to /var/cache/conftool/dbconfig/20240303-125502-marostegui.json
  • 12:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T354015)', diff saved to https://phabricator.wikimedia.org/P58315 and previous config saved to /var/cache/conftool/dbconfig/20240303-123955-marostegui.json
  • 05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T354015)', diff saved to https://phabricator.wikimedia.org/P58314 and previous config saved to /var/cache/conftool/dbconfig/20240303-055447-marostegui.json
  • 05:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 05:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T354015)', diff saved to https://phabricator.wikimedia.org/P58313 and previous config saved to /var/cache/conftool/dbconfig/20240303-055424-marostegui.json
  • 05:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P58312 and previous config saved to /var/cache/conftool/dbconfig/20240303-053918-marostegui.json
  • 05:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P58311 and previous config saved to /var/cache/conftool/dbconfig/20240303-052411-marostegui.json
  • 05:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T354015)', diff saved to https://phabricator.wikimedia.org/P58310 and previous config saved to /var/cache/conftool/dbconfig/20240303-050905-marostegui.json

2024-03-02

  • 22:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T354015)', diff saved to https://phabricator.wikimedia.org/P58309 and previous config saved to /var/cache/conftool/dbconfig/20240302-223741-marostegui.json
  • 22:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 22:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 17:24 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 17:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 17:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T354015)', diff saved to https://phabricator.wikimedia.org/P58308 and previous config saved to /var/cache/conftool/dbconfig/20240302-172351-marostegui.json
  • 17:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P58307 and previous config saved to /var/cache/conftool/dbconfig/20240302-170845-marostegui.json
  • 16:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P58306 and previous config saved to /var/cache/conftool/dbconfig/20240302-165338-marostegui.json
  • 16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T354015)', diff saved to https://phabricator.wikimedia.org/P58305 and previous config saved to /var/cache/conftool/dbconfig/20240302-163832-marostegui.json
  • 09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T354015)', diff saved to https://phabricator.wikimedia.org/P58304 and previous config saved to /var/cache/conftool/dbconfig/20240302-095854-marostegui.json
  • 09:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 09:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T354015)', diff saved to https://phabricator.wikimedia.org/P58303 and previous config saved to /var/cache/conftool/dbconfig/20240302-095831-marostegui.json
  • 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P58302 and previous config saved to /var/cache/conftool/dbconfig/20240302-094325-marostegui.json
  • 09:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P58301 and previous config saved to /var/cache/conftool/dbconfig/20240302-092819-marostegui.json
  • 09:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T354015)', diff saved to https://phabricator.wikimedia.org/P58300 and previous config saved to /var/cache/conftool/dbconfig/20240302-091312-marostegui.json
  • 02:22 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T354015)', diff saved to https://phabricator.wikimedia.org/P58299 and previous config saved to /var/cache/conftool/dbconfig/20240302-022247-marostegui.json
  • 02:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 02:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 02:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 02:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 02:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T354015)', diff saved to https://phabricator.wikimedia.org/P58298 and previous config saved to /var/cache/conftool/dbconfig/20240302-022156-marostegui.json
  • 02:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P58297 and previous config saved to /var/cache/conftool/dbconfig/20240302-020650-marostegui.json
  • 01:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P58296 and previous config saved to /var/cache/conftool/dbconfig/20240302-015143-marostegui.json
  • 01:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T354015)', diff saved to https://phabricator.wikimedia.org/P58295 and previous config saved to /var/cache/conftool/dbconfig/20240302-013637-marostegui.json

2024-03-01

  • 23:20 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2040.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2039.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2038.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2036.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:09 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es2037.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2037.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es2037.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2040.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:57 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dbprov2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es2040.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:57 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2040.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2039.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2038.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es2038.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2038.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:52 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2037.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:51 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2036.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:48 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2109.codfw.wmnet with OS bullseye
  • 22:48 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:47 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:47 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbprov2006 to codfw - jhancock@cumin2002"
  • 22:46 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbprov2006 to codfw - jhancock@cumin2002"
  • 22:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:44 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 22:35 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:35 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbprov2005 to codfw - jhancock@cumin2002"
  • 22:34 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbprov2005 to codfw - jhancock@cumin2002"
  • 22:32 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 22:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2109.codfw.wmnet with reason: host reimage
  • 22:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding es2040 to codfw - jhancock@cumin2002"
  • 22:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:30 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding es2040 to codfw - jhancock@cumin2002"
  • 22:28 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 22:27 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2109.codfw.wmnet with reason: host reimage
  • 22:26 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:26 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding es2039 to codfw - jhancock@cumin2002"
  • 22:26 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding es2039 to codfw - jhancock@cumin2002"
  • 22:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P58294 and previous config saved to /var/cache/conftool/dbconfig/20240301-222527-root.json
  • 22:23 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 22:21 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:18 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding es2038 to codfw - jhancock@cumin2002"
  • 22:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding es2038 to codfw - jhancock@cumin2002"
  • 22:15 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 22:12 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:12 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding es2037 to codfw - jhancock@cumin2002"
  • 22:11 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:11 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2109.codfw.wmnet with OS bullseye
  • 22:11 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding es2037 to codfw - jhancock@cumin2002"
  • 22:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:10 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P58293 and previous config saved to /var/cache/conftool/dbconfig/20240301-221022-root.json
  • 22:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding es2035 to codfw - jhancock@cumin2002"
  • 22:03 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding es2035 to codfw - jhancock@cumin2002"
  • 22:02 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host es2035.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:01 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 21:55 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:55 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding es2035 to codfw - jhancock@cumin2002"
  • 21:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P58292 and previous config saved to /var/cache/conftool/dbconfig/20240301-215517-root.json
  • 21:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding es2035 to codfw - jhancock@cumin2002"
  • 21:50 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 21:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P58291 and previous config saved to /var/cache/conftool/dbconfig/20240301-214013-root.json
  • 21:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P58290 and previous config saved to /var/cache/conftool/dbconfig/20240301-212508-root.json
  • 21:10 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 5%: After schema change', diff saved to https://phabricator.wikimedia.org/P58289 and previous config saved to /var/cache/conftool/dbconfig/20240301-211003-root.json
  • 20:45 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2109.codfw.wmnet with OS bullseye
  • 20:40 mutante: phabricator - added to WMF-NDA (group 61): Loren Johnson, Jonathan Fraine, Kris Litson, Lena Meintrup (all WMDE staff appearing in NDA spreadsheet) T358578
  • 20:35 mutante: phabricator - added to WMF-NDA (group 61): Aline Bruenger, Corinna Hillebrand, Kai Nissen, Christoph Jauera (all WMDE staff appearing in NDA spreadsheet) T358578
  • 19:12 mutante: contint1003 - sudo a2dismod mpm_event ; a2enmod php7.4 ; systemctl restart apache2 - common issue with puppet setup of apache on first run
  • 18:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T354015)', diff saved to https://phabricator.wikimedia.org/P58288 and previous config saved to /var/cache/conftool/dbconfig/20240301-185046-marostegui.json
  • 18:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 18:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 18:12 taavi@cumin1002: dbctl commit (dc=all): 'depool db1169 T358892', diff saved to https://phabricator.wikimedia.org/P58287 and previous config saved to /var/cache/conftool/dbconfig/20240301-181221-taavi.json
  • 17:58 dancy@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@162f72f]: (no justification provided) (duration: 00m 08s)
  • 17:58 dancy@deploy2002: Started deploy [cassandra/logstash-logback-encoder@162f72f]: (no justification provided)
  • 16:54 claime: Pooled and uncordoned mw1384.eqiad.wmnet mw1432.eqiad.wmnet mw1433.eqiad.wmnet - T351074
  • 16:52 cgoubert@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1384.eqiad.wmnet|mw1432.eqiad.wmnet|mw1433.eqiad.wmnet),cluster=kubernetes,service=kubesvc
  • 16:46 claime: Running homer 'cr*eqiad*' commit 'T351074'
  • 16:46 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1384.eqiad.wmnet with OS bullseye
  • 16:43 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1432.eqiad.wmnet with OS bullseye
  • 16:40 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1433.eqiad.wmnet with OS bullseye
  • 16:27 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1384.eqiad.wmnet with reason: host reimage
  • 16:24 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1432.eqiad.wmnet with reason: host reimage
  • 16:22 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1433.eqiad.wmnet with reason: host reimage
  • 16:20 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1384.eqiad.wmnet with reason: host reimage
  • 16:20 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1432.eqiad.wmnet with reason: host reimage
  • 16:19 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1433.eqiad.wmnet with reason: host reimage
  • 16:17 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 16:16 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 16:16 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 16:16 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 16:15 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 16:15 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 16:07 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1432.eqiad.wmnet with OS bullseye
  • 16:06 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1384.eqiad.wmnet with OS bullseye
  • 16:06 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1433.eqiad.wmnet with OS bullseye
  • 16:05 dancy@deploy2002: Finished deploy [analytics/refinery@6e8f25b]: (no justification provided) (duration: 00m 03s)
  • 16:05 dancy@deploy2002: Started deploy [analytics/refinery@6e8f25b]: (no justification provided)
  • 16:04 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'ma