Server Admin Log

From Wikitech

2025-01-25

  • 08:34 fabfur@cumin1002: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool site magru [reason: depool magru to check for cr issues, no task ID specified]
  • 08:34 fabfur@cumin1002: START - Cookbook sre.dns.admin DNS admin: depool site magru [reason: depool magru to check for cr issues, no task ID specified]
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2238 (T384592)', diff saved to https://phabricator.wikimedia.org/P72431 and previous config saved to /var/cache/conftool/dbconfig/20250125-083104-marostegui.json
  • 08:30 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db2238.codfw.wmnet with reason: Maintenance
  • 08:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T384592)', diff saved to https://phabricator.wikimedia.org/P72430 and previous config saved to /var/cache/conftool/dbconfig/20250125-083042-marostegui.json
  • 08:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P72429 and previous config saved to /var/cache/conftool/dbconfig/20250125-081535-marostegui.json
  • 08:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P72428 and previous config saved to /var/cache/conftool/dbconfig/20250125-080028-marostegui.json
  • 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T384592)', diff saved to https://phabricator.wikimedia.org/P72427 and previous config saved to /var/cache/conftool/dbconfig/20250125-074521-marostegui.json
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2226 (T384592)', diff saved to https://phabricator.wikimedia.org/P72426 and previous config saved to /var/cache/conftool/dbconfig/20250125-073059-marostegui.json
  • 07:30 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:30 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db2226.codfw.wmnet with reason: Maintenance
  • 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T384592)', diff saved to https://phabricator.wikimedia.org/P72425 and previous config saved to /var/cache/conftool/dbconfig/20250125-073021-marostegui.json
  • 07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P72424 and previous config saved to /var/cache/conftool/dbconfig/20250125-071514-marostegui.json
  • 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P72423 and previous config saved to /var/cache/conftool/dbconfig/20250125-070007-marostegui.json
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T384592)', diff saved to https://phabricator.wikimedia.org/P72422 and previous config saved to /var/cache/conftool/dbconfig/20250125-064500-marostegui.json
  • 05:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2225 (T384592)', diff saved to https://phabricator.wikimedia.org/P72421 and previous config saved to /var/cache/conftool/dbconfig/20250125-055917-marostegui.json
  • 05:59 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db2225.codfw.wmnet with reason: Maintenance
  • 05:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T384592)', diff saved to https://phabricator.wikimedia.org/P72420 and previous config saved to /var/cache/conftool/dbconfig/20250125-055855-marostegui.json
  • 05:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P72419 and previous config saved to /var/cache/conftool/dbconfig/20250125-054347-marostegui.json
  • 05:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P72418 and previous config saved to /var/cache/conftool/dbconfig/20250125-052839-marostegui.json
  • 05:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T384592)', diff saved to https://phabricator.wikimedia.org/P72417 and previous config saved to /var/cache/conftool/dbconfig/20250125-051332-marostegui.json
  • 04:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2207 (T384592)', diff saved to https://phabricator.wikimedia.org/P72416 and previous config saved to /var/cache/conftool/dbconfig/20250125-042719-marostegui.json
  • 04:27 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db2207.codfw.wmnet with reason: Maintenance
  • 03:34 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 03:30 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 03:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T384592)', diff saved to https://phabricator.wikimedia.org/P72415 and previous config saved to /var/cache/conftool/dbconfig/20250125-033035-marostegui.json
  • 03:27 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 03:27 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 03:21 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 03:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P72414 and previous config saved to /var/cache/conftool/dbconfig/20250125-031528-marostegui.json
  • 03:12 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 03:04 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 03:04 andrew@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 03:04 andrew@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 03:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P72413 and previous config saved to /var/cache/conftool/dbconfig/20250125-030021-marostegui.json
  • 02:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T384592)', diff saved to https://phabricator.wikimedia.org/P72412 and previous config saved to /var/cache/conftool/dbconfig/20250125-024514-marostegui.json
  • 01:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T384592)', diff saved to https://phabricator.wikimedia.org/P72411 and previous config saved to /var/cache/conftool/dbconfig/20250125-015731-marostegui.json
  • 01:57 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 01:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T384592)', diff saved to https://phabricator.wikimedia.org/P72410 and previous config saved to /var/cache/conftool/dbconfig/20250125-015709-marostegui.json
  • 01:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P72409 and previous config saved to /var/cache/conftool/dbconfig/20250125-014201-marostegui.json
  • 01:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P72408 and previous config saved to /var/cache/conftool/dbconfig/20250125-012654-marostegui.json
  • 01:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T384592)', diff saved to https://phabricator.wikimedia.org/P72407 and previous config saved to /var/cache/conftool/dbconfig/20250125-011147-marostegui.json
  • 00:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T384592)', diff saved to https://phabricator.wikimedia.org/P72406 and previous config saved to /var/cache/conftool/dbconfig/20250125-001950-marostegui.json
  • 00:19 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 00:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T384592)', diff saved to https://phabricator.wikimedia.org/P72405 and previous config saved to /var/cache/conftool/dbconfig/20250125-001929-marostegui.json
  • 00:08 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudgw1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 00:04 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudgw1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 00:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P72404 and previous config saved to /var/cache/conftool/dbconfig/20250125-000422-marostegui.json
  • 00:04 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudgw1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED

2025-01-24

  • 23:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P72403 and previous config saved to /var/cache/conftool/dbconfig/20250124-234914-marostegui.json
  • 23:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp7004.magru.wmnet
  • 23:39 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp7004.magru.wmnet
  • 23:39 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudgw1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 23:36 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7004.magru.wmnet
  • 23:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T384592)', diff saved to https://phabricator.wikimedia.org/P72402 and previous config saved to /var/cache/conftool/dbconfig/20250124-233407-marostegui.json
  • 23:30 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudgw1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 23:26 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudgw1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 22:56 vriley@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudgw1004
  • 22:55 vriley@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudgw1004
  • 22:54 vriley@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudgw1003
  • 22:52 vriley@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudgw1003
  • 22:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T384592)', diff saved to https://phabricator.wikimedia.org/P72401 and previous config saved to /var/cache/conftool/dbconfig/20250124-224303-marostegui.json
  • 22:42 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 22:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp7004.magru.wmnet
  • 22:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp7004.magru.wmnet,service=cdn
  • 22:11 sukhe: pool bunch of cp7x in magru for ats-be that were depooled
  • 22:11 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7015.magru.wmnet,service=(cdn|ats-be)
  • 22:11 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7010.magru.wmnet,service=(cdn|ats-be)
  • 22:11 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp700[2-4].magru.wmnet,service=(cdn|ats-be)
  • 22:10 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7006.magru.wmnet,service=(cdn|ats-be)
  • 22:10 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7008.magru.wmnet,service=(cdn|ats-be)
  • 22:10 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet,service=(cdn|ats-be)
  • 22:10 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7001.magru.wmnet,service=(cdn|ats-be)
  • 22:08 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudgw1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 22:08 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudgw1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 22:07 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:07 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt cloudgw1003 - vriley@cumin1002"
  • 22:05 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt cloudgw1003 - vriley@cumin1002"
  • 22:02 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 21:51 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7004.magru.wmnet with reason: Thermal settings testing (T373993)
  • 21:50 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 21:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T384592)', diff saved to https://phabricator.wikimedia.org/P72399 and previous config saved to /var/cache/conftool/dbconfig/20250124-215037-marostegui.json
  • 21:49 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp7004.magru.wmnet,service=cdn
  • 21:47 brett: Testing thermal settings on cp7004 (T373993)
  • 21:43 amastilovic@deploy2002: Finished deploy [airflow-dags/platform_eng@ebb3680]: (no justification provided) (duration: 00m 31s)
  • 21:42 amastilovic@deploy2002: Started deploy [airflow-dags/platform_eng@ebb3680]: (no justification provided)
  • 21:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P72398 and previous config saved to /var/cache/conftool/dbconfig/20250124-213530-marostegui.json
  • 21:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P72397 and previous config saved to /var/cache/conftool/dbconfig/20250124-212023-marostegui.json
  • 21:15 amastilovic@deploy2002: Finished deploy [airflow-dags/platform_eng@3907ed7]: (no justification provided) (duration: 00m 10s)
  • 21:15 amastilovic@deploy2002: Started deploy [airflow-dags/platform_eng@3907ed7]: (no justification provided)
  • 21:13 amastilovic@deploy2002: Finished deploy [airflow-dags/platform_eng@3907ed7]: (no justification provided) (duration: 00m 35s)
  • 21:12 amastilovic@deploy2002: Started deploy [airflow-dags/platform_eng@3907ed7]: (no justification provided)
  • 21:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T384592)', diff saved to https://phabricator.wikimedia.org/P72396 and previous config saved to /var/cache/conftool/dbconfig/20250124-210515-marostegui.json
  • 20:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P72393 and previous config saved to /var/cache/conftool/dbconfig/20250124-200419-marostegui.json
  • 19:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P72392 and previous config saved to /var/cache/conftool/dbconfig/20250124-194911-marostegui.json
  • 19:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T384592)', diff saved to https://phabricator.wikimedia.org/P72391 and previous config saved to /var/cache/conftool/dbconfig/20250124-193404-marostegui.json
  • 19:27 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 90 days, 0:00:00 on moscovium.eqiad.wmnet with reason: to be decomed
  • 19:19 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 90 days, 0:00:00 on moscovium.eqiad.wmnet with reason: to be decomed
  • 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T384592)', diff saved to https://phabricator.wikimedia.org/P72390 and previous config saved to /var/cache/conftool/dbconfig/20250124-184807-marostegui.json
  • 18:48 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 18:05 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts db2136.codfw.wmnet
  • 18:05 marostegui@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 18:05 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1154-1158].eqiad.wmnet
  • 18:05 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1154-1158].eqiad.wmnet
  • 18:04 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 18:03 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 17:59 marostegui: Removing db2136 from zarcillo T384479
  • 17:57 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2136.codfw.wmnet
  • 17:48 mforns@deploy2002: Finished deploy [airflow-dags/analytics@ebb3680]: bump up mediawiki reduced as part of temp accounts deployment (duration: 01m 00s)
  • 17:47 mforns@deploy2002: Started deploy [airflow-dags/analytics@ebb3680]: bump up mediawiki reduced as part of temp accounts deployment
  • 17:32 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1154.eqiad.wmnet with OS bookworm
  • 17:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1154.eqiad.wmnet with reason: host reimage
  • 17:10 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1154.eqiad.wmnet with reason: host reimage
  • 16:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1155.eqiad.wmnet with OS bookworm
  • 16:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1154
  • 16:54 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1154
  • 16:54 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1154.eqiad.wmnet with OS bookworm
  • 16:47 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1157.eqiad.wmnet with OS bookworm
  • 16:47 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1154.eqiad.wmnet with OS bookworm
  • 16:43 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1158.eqiad.wmnet with OS bookworm
  • 16:43 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2227,2229-2230].codfw.wmnet
  • 16:43 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker[2227,2229-2230].codfw.wmnet
  • 16:43 jayme@cumin1002: START - Cookbook sre.hosts.remove-downtime for wikikube-worker[2227,2229-2230].codfw.wmnet
  • 16:43 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2227,2229-2230].codfw.wmnet
  • 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2227
  • 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2227
  • 16:35 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 16:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1156.eqiad.wmnet with OS bookworm
  • 16:33 pt1979@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd1013']
  • 16:32 pt1979@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013']
  • 16:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1157.eqiad.wmnet with reason: host reimage
  • 16:26 jayme: imported istioctl 1.24.2-1 to bullseye/bookworm-wikimedia T341984
  • 16:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1158.eqiad.wmnet with reason: host reimage
  • 16:24 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2230
  • 16:23 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2230
  • 16:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1157.eqiad.wmnet with reason: host reimage
  • 16:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P72388 and previous config saved to /var/cache/conftool/dbconfig/20250124-162144-marostegui.json
  • 16:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1158.eqiad.wmnet with reason: host reimage
  • 16:21 andrew@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 16:21 andrew@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 16:20 andrew@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 16:20 andrew@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 16:16 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2229
  • 16:16 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2229
  • 16:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1156.eqiad.wmnet with reason: host reimage
  • 16:09 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1155.eqiad.wmnet with reason: host reimage
  • 16:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P72387 and previous config saved to /var/cache/conftool/dbconfig/20250124-160637-marostegui.json
  • 16:06 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1156.eqiad.wmnet with reason: host reimage
  • 16:05 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1155.eqiad.wmnet with reason: host reimage
  • 16:04 pt1979@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1013']
  • 16:04 pt1979@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013']
  • 16:00 pt1979@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:59 pt1979@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:57 andrew@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:57 andrew@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:57 andrew@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:57 andrew@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:56 andrew@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:56 andrew@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:56 andrew@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:55 andrew@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1158
  • 15:52 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1158
  • 15:52 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1158.eqiad.wmnet with OS bookworm
  • 15:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1157
  • 15:52 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1157
  • 15:52 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1157.eqiad.wmnet with OS bookworm
  • 15:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T384592)', diff saved to https://phabricator.wikimedia.org/P72386 and previous config saved to /var/cache/conftool/dbconfig/20250124-155130-marostegui.json
  • 15:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1156
  • 15:50 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1156
  • 15:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1156.eqiad.wmnet with OS bookworm
  • 15:49 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1155
  • 15:49 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1155
  • 15:49 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1155.eqiad.wmnet with OS bookworm
  • 15:49 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1154
  • 15:49 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1154
  • 15:49 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1154.eqiad.wmnet with OS bookworm
  • 15:47 andrew@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:46 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1154.eqiad.wmnet wikikube-worker1155.eqiad.wmnet wikikube-worker1156.eqiad.wmnet wikikube-worker1157.eqiad.wmnet wikikube-worker1158.eqiad.wmnet on all recursors
  • 15:46 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1154.eqiad.wmnet wikikube-worker1155.eqiad.wmnet wikikube-worker1156.eqiad.wmnet wikikube-worker1157.eqiad.wmnet wikikube-worker1158.eqiad.wmnet on all recursors
  • 15:46 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1017 to wikikube-worker1158
  • 15:46 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1158
  • 15:44 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1158
  • 15:44 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:44 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1017 to wikikube-worker1158 - kamila@cumin1002"
  • 15:44 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1017 to wikikube-worker1158 - kamila@cumin1002"
  • 15:43 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Slavina Stefanova out of all services on: 1221 hosts
  • 15:42 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1016 to wikikube-worker1157
  • 15:42 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1157
  • 15:42 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Slavina Stefanova out of all services on: 1010 hosts
  • 15:41 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:41 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1157
  • 15:41 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:41 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1016 to wikikube-worker1157 - kamila@cumin1002"
  • 15:40 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1016 to wikikube-worker1157 - kamila@cumin1002"
  • 15:40 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1017 to wikikube-worker1158
  • 15:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1015 to wikikube-worker1156
  • 15:39 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1156
  • 15:39 andrew@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:38 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 15:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T384592)', diff saved to https://phabricator.wikimedia.org/P72385 and previous config saved to /var/cache/conftool/dbconfig/20250124-153805-marostegui.json
  • 15:37 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 15:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T384592)', diff saved to https://phabricator.wikimedia.org/P72384 and previous config saved to /var/cache/conftool/dbconfig/20250124-153743-marostegui.json
  • 15:36 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:36 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1156
  • 15:36 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:36 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1015 to wikikube-worker1156 - kamila@cumin1002"
  • 15:36 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1016 to wikikube-worker1157
  • 15:33 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1015 to wikikube-worker1156 - kamila@cumin1002"
  • 15:32 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1014 to wikikube-worker1155
  • 15:32 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1155
  • 15:29 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:29 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2227,2229-2230].codfw.wmnet
  • 15:29 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1015 to wikikube-worker1156
  • 15:29 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1155
  • 15:29 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:29 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1014 to wikikube-worker1155 - kamila@cumin1002"
  • 15:29 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1013 to wikikube-worker1154
  • 15:29 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1014 to wikikube-worker1155 - kamila@cumin1002"
  • 15:28 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1154
  • 15:27 jayme@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker[2227,2229-2230].codfw.wmnet with reason: Depooled via sre.k8s.pool-depool-node
  • 15:27 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2227,2229-2230].codfw.wmnet
  • 15:25 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:25 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1154
  • 15:25 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:25 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1013 to wikikube-worker1154 - kamila@cumin1002"
  • 15:25 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1014 to wikikube-worker1155
  • 15:24 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1013 to wikikube-worker1154 - kamila@cumin1002"
  • 15:23 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 15:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P72383 and previous config saved to /var/cache/conftool/dbconfig/20250124-152236-marostegui.json
  • 15:20 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:19 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1013 to wikikube-worker1154
  • 15:19 andrew@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:18 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse[1013-1017].eqiad.wmnet
  • 15:12 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host parse[1013-1017].eqiad.wmnet
  • 15:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P72382 and previous config saved to /var/cache/conftool/dbconfig/20250124-150729-marostegui.json
  • 15:02 andrew@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 15:02 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 14:55 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 14:54 andrew@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 14:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T384592)', diff saved to https://phabricator.wikimedia.org/P72381 and previous config saved to /var/cache/conftool/dbconfig/20250124-145222-marostegui.json
  • 14:46 andrew@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1013.eqiad.wmnet']
  • 14:45 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1151.eqiad.wmnet with OS bookworm
  • 14:43 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1152.eqiad.wmnet with OS bookworm
  • 14:40 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 14:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1150.eqiad.wmnet with OS bookworm
  • 14:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1153.eqiad.wmnet with OS bookworm
  • 14:29 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1149.eqiad.wmnet with OS bookworm
  • 14:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1151.eqiad.wmnet with reason: host reimage
  • 14:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1148.eqiad.wmnet with OS bookworm
  • 14:22 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1152.eqiad.wmnet with reason: host reimage
  • 14:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1150.eqiad.wmnet with reason: host reimage
  • 14:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1153.eqiad.wmnet with reason: host reimage
  • 14:11 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
  • 14:10 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1149.eqiad.wmnet with reason: host reimage
  • 14:07 cmooney@cumin1002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
  • 14:07 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1148.eqiad.wmnet with reason: host reimage
  • 14:05 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 14:05 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 14:04 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1151.eqiad.wmnet with reason: host reimage
  • 14:04 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1152.eqiad.wmnet with reason: host reimage
  • 14:04 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1150.eqiad.wmnet with reason: host reimage
  • 14:04 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1153.eqiad.wmnet with reason: host reimage
  • 14:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1222 (T384592)', diff saved to https://phabricator.wikimedia.org/P72380 and previous config saved to /var/cache/conftool/dbconfig/20250124-140410-marostegui.json
  • 14:04 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1149.eqiad.wmnet with reason: host reimage
  • 14:04 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 14:03 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 14:03 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1148.eqiad.wmnet with reason: host reimage
  • 14:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T384592)', diff saved to https://phabricator.wikimedia.org/P72379 and previous config saved to /var/cache/conftool/dbconfig/20250124-140333-marostegui.json
  • 13:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1153
  • 13:48 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1153
  • 13:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1152
  • 13:48 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1152
  • 13:48 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1153.eqiad.wmnet with OS bookworm
  • 13:48 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1152.eqiad.wmnet with OS bookworm
  • 13:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1151
  • 13:48 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1151
  • 13:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P72378 and previous config saved to /var/cache/conftool/dbconfig/20250124-134826-marostegui.json
  • 13:48 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1151.eqiad.wmnet with OS bookworm
  • 13:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1150
  • 13:48 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1150
  • 13:48 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1150.eqiad.wmnet with OS bookworm
  • 13:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1149
  • 13:48 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1149
  • 13:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1148
  • 13:48 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1148
  • 13:48 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1149.eqiad.wmnet with OS bookworm
  • 13:48 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1148.eqiad.wmnet with OS bookworm
  • 13:45 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1148.eqiad.wmnet wikikube-worker1149.eqiad.wmnet wikikube-worker1150.eqiad.wmnet wikikube-worker1151.eqiad.wmnet wikikube-worker1152.eqiad.wmnet wikikube-worker1153.eqiad.wmnet on all recursors
  • 13:45 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1148.eqiad.wmnet wikikube-worker1149.eqiad.wmnet wikikube-worker1150.eqiad.wmnet wikikube-worker1151.eqiad.wmnet wikikube-worker1152.eqiad.wmnet wikikube-worker1153.eqiad.wmnet on all recursors
  • 13:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1012 to wikikube-worker1153
  • 13:44 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 13:44 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1153
  • 13:43 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 13:43 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1153
  • 13:43 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:43 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1012 to wikikube-worker1153 - kamila@cumin1002"
  • 13:42 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1012 to wikikube-worker1153 - kamila@cumin1002"
  • 13:41 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1013.eqiad.wmnet with OS bullseye
  • 13:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1011 to wikikube-worker1152
  • 13:40 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1152
  • 13:39 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:39 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1152
  • 13:38 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:38 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1011 to wikikube-worker1152 - kamila@cumin1002"
  • 13:38 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1012 to wikikube-worker1153
  • 13:38 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1011 to wikikube-worker1152 - kamila@cumin1002"
  • 13:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1010 to wikikube-worker1151
  • 13:35 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1151
  • 13:34 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:34 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1151
  • 13:34 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:34 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1010 to wikikube-worker1151 - kamila@cumin1002"
  • 13:34 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1011 to wikikube-worker1152
  • 13:34 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1010 to wikikube-worker1151 - kamila@cumin1002"
  • 13:34 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 13:34 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 13:33 cmooney@cumin1002: END (ERROR) - Cookbook sre.netbox.update-extras (exit_code=97) rolling restart_daemons on A:netbox-canary
  • 13:33 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 13:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P72376 and previous config saved to /var/cache/conftool/dbconfig/20250124-133319-marostegui.json
  • 13:32 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1009 to wikikube-worker1150
  • 13:31 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1150
  • 13:30 jynus@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1216.eqiad.wmnet with reason: rebuilding tables
  • 13:30 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:30 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1010 to wikikube-worker1151
  • 13:30 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1150
  • 13:30 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:30 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1009 to wikikube-worker1150 - kamila@cumin1002"
  • 13:29 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1009 to wikikube-worker1150 - kamila@cumin1002"
  • 13:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1008 to wikikube-worker1149
  • 13:27 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1149
  • 13:26 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:26 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1149
  • 13:26 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:26 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1008 to wikikube-worker1149 - kamila@cumin1002"
  • 13:25 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1009 to wikikube-worker1150
  • 13:25 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1008 to wikikube-worker1149 - kamila@cumin1002"
  • 13:23 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1007 to wikikube-worker1148
  • 13:23 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1148
  • 13:22 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:21 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1148
  • 13:21 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:21 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1007 to wikikube-worker1148 - kamila@cumin1002"
  • 13:21 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1008 to wikikube-worker1149
  • 13:21 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1007 to wikikube-worker1148 - kamila@cumin1002"
  • 13:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T384592)', diff saved to https://phabricator.wikimedia.org/P72375 and previous config saved to /var/cache/conftool/dbconfig/20250124-131812-marostegui.json
  • 13:17 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:17 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1007 to wikikube-worker1148
  • 13:16 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse[1007-1012].eqiad.wmnet
  • 13:13 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host parse[1007-1012].eqiad.wmnet
  • 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T384592)', diff saved to https://phabricator.wikimedia.org/P72374 and previous config saved to /var/cache/conftool/dbconfig/20250124-130431-marostegui.json
  • 13:04 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T384592)', diff saved to https://phabricator.wikimedia.org/P72373 and previous config saved to /var/cache/conftool/dbconfig/20250124-130409-marostegui.json
  • 12:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P72372 and previous config saved to /var/cache/conftool/dbconfig/20250124-124902-marostegui.json
  • 12:46 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Revert "Warn if 'preprint', 'dataset', or 'standard' key is missing" (T384661) (duration: 13m 04s)
  • 12:40 lucaswerkmeister-wmde@deploy2002: mvolz, lucaswerkmeister-wmde: Continuing with sync
  • 12:38 lucaswerkmeister-wmde@deploy2002: mvolz, lucaswerkmeister-wmde: Backport for Revert "Warn if 'preprint', 'dataset', or 'standard' key is missing" (T384661) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P72371 and previous config saved to /var/cache/conftool/dbconfig/20250124-123355-marostegui.json
  • 12:33 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Revert "Warn if 'preprint', 'dataset', or 'standard' key is missing" (T384661)
  • 12:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T384592)', diff saved to https://phabricator.wikimedia.org/P72370 and previous config saved to /var/cache/conftool/dbconfig/20250124-121848-marostegui.json
  • 12:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T384592)', diff saved to https://phabricator.wikimedia.org/P72369 and previous config saved to /var/cache/conftool/dbconfig/20250124-120417-marostegui.json
  • 12:04 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 12:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T384592)', diff saved to https://phabricator.wikimedia.org/P72368 and previous config saved to /var/cache/conftool/dbconfig/20250124-120355-marostegui.json
  • 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ml-etcd2001.codfw.wmnet to plain
  • 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P72367 and previous config saved to /var/cache/conftool/dbconfig/20250124-114848-marostegui.json
  • 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ml-etcd2001.codfw.wmnet to plain
  • 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2020.codfw.wmnet
  • 11:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2020.codfw.wmnet
  • 11:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ml-etcd2001.codfw.wmnet to drbd
  • 11:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P72366 and previous config saved to /var/cache/conftool/dbconfig/20250124-113341-marostegui.json
  • 11:33 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ml-etcd2001.codfw.wmnet to drbd
  • 11:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2020.codfw.wmnet
  • 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2020.codfw.wmnet
  • 11:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T384592)', diff saved to https://phabricator.wikimedia.org/P72365 and previous config saved to /var/cache/conftool/dbconfig/20250124-111834-marostegui.json
  • 10:50 fceratto@cumin1002: dbctl commit (dc=all): 'Remove db2140 from dbctl T384480', diff saved to https://phabricator.wikimedia.org/P72363 and previous config saved to /var/cache/conftool/dbconfig/20250124-105029-fceratto.json
  • 10:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T384592)', diff saved to https://phabricator.wikimedia.org/P72362 and previous config saved to /var/cache/conftool/dbconfig/20250124-102157-marostegui.json
  • 10:21 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 10:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T384592)', diff saved to https://phabricator.wikimedia.org/P72361 and previous config saved to /var/cache/conftool/dbconfig/20250124-102135-marostegui.json
  • 10:13 mnz@deploy2002: Finished deploy [airflow-dags/research@95b14c7]: (no justification provided) (duration: 00m 43s)
  • 10:12 mnz@deploy2002: Started deploy [airflow-dags/research@95b14c7]: (no justification provided)
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P72360 and previous config saved to /var/cache/conftool/dbconfig/20250124-100628-marostegui.json
  • 10:01 mnz@deploy2002: Finished deploy [airflow-dags/research@ba61f77]: (no justification provided) (duration: 00m 12s)
  • 10:01 mnz@deploy2002: Started deploy [airflow-dags/research@ba61f77]: (no justification provided)
  • 09:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P72359 and previous config saved to /var/cache/conftool/dbconfig/20250124-095121-marostegui.json
  • 09:43 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on netflow1002.eqiad.wmnet with reason: disabling alerts as I'm running gnmic manually rather than with systemd
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T384592)', diff saved to https://phabricator.wikimedia.org/P72358 and previous config saved to /var/cache/conftool/dbconfig/20250124-093614-marostegui.json
  • 09:21 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2022.codfw.wmnet to cluster codfw and group B
  • 09:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2022.codfw.wmnet to cluster codfw and group B
  • 09:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2022.codfw.wmnet
  • 09:14 root@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1216.eqiad.wmnet with OS bookworm
  • 09:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2022.codfw.wmnet
  • 09:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2022.codfw.wmnet with OS bookworm
  • 08:51 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1216.eqiad.wmnet with reason: host reimage
  • 08:49 root@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1225.eqiad.wmnet with OS bookworm
  • 08:47 root@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1216.eqiad.wmnet with reason: host reimage
  • 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2022.codfw.wmnet with reason: host reimage
  • 08:42 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2022.codfw.wmnet with reason: host reimage
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T384592)', diff saved to https://phabricator.wikimedia.org/P72357 and previous config saved to /var/cache/conftool/dbconfig/20250124-083638-marostegui.json
  • 08:36 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:36 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 08:30 root@cumin1002: START - Cookbook sre.hosts.reimage for host db1216.eqiad.wmnet with OS bookworm
  • 08:29 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 08:25 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1225.eqiad.wmnet with reason: host reimage
  • 08:21 root@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1225.eqiad.wmnet with reason: host reimage
  • 08:18 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 08:11 jynus@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1216.eqiad.wmnet with reason: os upgrade
  • 08:08 marostegui: Remove es1023 from es5 eqiad dbmaint T384679
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1044 to es5 master', diff saved to https://phabricator.wikimedia.org/P72356 and previous config saved to /var/cache/conftool/dbconfig/20250124-080804-root.json
  • 08:04 root@cumin1002: START - Cookbook sre.hosts.reimage for host db1225.eqiad.wmnet with OS bookworm
  • 07:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es1022.eqiad.wmnet
  • 07:58 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:58 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es1022.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 07:57 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es1022.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 07:54 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 07:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2022.codfw.wmnet with OS bookworm
  • 07:51 jynus@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: os upgrade
  • 07:48 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts es1022.eqiad.wmnet
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72355 and previous config saved to /var/cache/conftool/dbconfig/20250124-064249-root.json
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1222 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72354 and previous config saved to /var/cache/conftool/dbconfig/20250124-063557-root.json
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T384592)', diff saved to https://phabricator.wikimedia.org/P72353 and previous config saved to /var/cache/conftool/dbconfig/20250124-062855-marostegui.json
  • 06:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72352 and previous config saved to /var/cache/conftool/dbconfig/20250124-062744-root.json
  • 06:20 marostegui@cumin1002: dbctl commit (dc=all): 'db1222 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72351 and previous config saved to /var/cache/conftool/dbconfig/20250124-062052-root.json
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P72350 and previous config saved to /var/cache/conftool/dbconfig/20250124-061348-marostegui.json
  • 06:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72349 and previous config saved to /var/cache/conftool/dbconfig/20250124-061238-root.json
  • 06:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1222 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72348 and previous config saved to /var/cache/conftool/dbconfig/20250124-060547-root.json
  • 05:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P72347 and previous config saved to /var/cache/conftool/dbconfig/20250124-055842-marostegui.json
  • 05:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72346 and previous config saved to /var/cache/conftool/dbconfig/20250124-055733-root.json
  • 05:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1222 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72345 and previous config saved to /var/cache/conftool/dbconfig/20250124-055042-root.json
  • 05:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T384592)', diff saved to https://phabricator.wikimedia.org/P72344 and previous config saved to /var/cache/conftool/dbconfig/20250124-054335-marostegui.json
  • 05:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72343 and previous config saved to /var/cache/conftool/dbconfig/20250124-054227-root.json
  • 05:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1222 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72342 and previous config saved to /var/cache/conftool/dbconfig/20250124-053535-root.json
  • 05:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2229 (T384592)', diff saved to https://phabricator.wikimedia.org/P72341 and previous config saved to /var/cache/conftool/dbconfig/20250124-051525-marostegui.json
  • 05:15 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2229.codfw.wmnet with reason: Maintenance
  • 05:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T384592)', diff saved to https://phabricator.wikimedia.org/P72340 and previous config saved to /var/cache/conftool/dbconfig/20250124-051503-marostegui.json
  • 04:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P72339 and previous config saved to /var/cache/conftool/dbconfig/20250124-045955-marostegui.json
  • 04:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P72338 and previous config saved to /var/cache/conftool/dbconfig/20250124-044449-marostegui.json
  • 04:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T384592)', diff saved to https://phabricator.wikimedia.org/P72337 and previous config saved to /var/cache/conftool/dbconfig/20250124-042942-marostegui.json
  • 04:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2224 (T384592)', diff saved to https://phabricator.wikimedia.org/P72336 and previous config saved to /var/cache/conftool/dbconfig/20250124-040239-marostegui.json
  • 04:02 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2224.codfw.wmnet with reason: Maintenance
  • 04:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T384592)', diff saved to https://phabricator.wikimedia.org/P72335 and previous config saved to /var/cache/conftool/dbconfig/20250124-040216-marostegui.json
  • 03:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P72334 and previous config saved to /var/cache/conftool/dbconfig/20250124-034709-marostegui.json
  • 03:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P72333 and previous config saved to /var/cache/conftool/dbconfig/20250124-033202-marostegui.json
  • 03:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T384592)', diff saved to https://phabricator.wikimedia.org/P72332 and previous config saved to /var/cache/conftool/dbconfig/20250124-031655-marostegui.json
  • 02:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2217 (T384592)', diff saved to https://phabricator.wikimedia.org/P72331 and previous config saved to /var/cache/conftool/dbconfig/20250124-024851-marostegui.json
  • 02:48 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 02:20 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 02:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T384592)', diff saved to https://phabricator.wikimedia.org/P72330 and previous config saved to /var/cache/conftool/dbconfig/20250124-022005-marostegui.json
  • 02:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P72329 and previous config saved to /var/cache/conftool/dbconfig/20250124-020458-marostegui.json
  • 01:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P72328 and previous config saved to /var/cache/conftool/dbconfig/20250124-014951-marostegui.json
  • 01:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T384592)', diff saved to https://phabricator.wikimedia.org/P72327 and previous config saved to /var/cache/conftool/dbconfig/20250124-013444-marostegui.json
  • 01:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T384592)', diff saved to https://phabricator.wikimedia.org/P72326 and previous config saved to /var/cache/conftool/dbconfig/20250124-012645-marostegui.json
  • 01:26 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 01:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T384592)', diff saved to https://phabricator.wikimedia.org/P72325 and previous config saved to /var/cache/conftool/dbconfig/20250124-012623-marostegui.json
  • 01:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P72324 and previous config saved to /var/cache/conftool/dbconfig/20250124-011116-marostegui.json
  • 00:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P72323 and previous config saved to /var/cache/conftool/dbconfig/20250124-005609-marostegui.json
  • 00:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T384592)', diff saved to https://phabricator.wikimedia.org/P72322 and previous config saved to /var/cache/conftool/dbconfig/20250124-004102-marostegui.json
  • 00:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T384592)', diff saved to https://phabricator.wikimedia.org/P72321 and previous config saved to /var/cache/conftool/dbconfig/20250124-003237-marostegui.json
  • 00:32 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 00:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T384592)', diff saved to https://phabricator.wikimedia.org/P72320 and previous config saved to /var/cache/conftool/dbconfig/20250124-003215-marostegui.json
  • 00:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P72319 and previous config saved to /var/cache/conftool/dbconfig/20250124-001708-marostegui.json
  • 00:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P72318 and previous config saved to /var/cache/conftool/dbconfig/20250124-000200-marostegui.json

2025-01-23

  • 23:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T384592)', diff saved to https://phabricator.wikimedia.org/P72317 and previous config saved to /var/cache/conftool/dbconfig/20250123-234653-marostegui.json
  • 23:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2169 (T384592)', diff saved to https://phabricator.wikimedia.org/P72316 and previous config saved to /var/cache/conftool/dbconfig/20250123-231641-marostegui.json
  • 23:16 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 23:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T384592)', diff saved to https://phabricator.wikimedia.org/P72315 and previous config saved to /var/cache/conftool/dbconfig/20250123-231619-marostegui.json
  • 23:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P72312 and previous config saved to /var/cache/conftool/dbconfig/20250123-230112-marostegui.json
  • 22:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P72311 and previous config saved to /var/cache/conftool/dbconfig/20250123-224604-marostegui.json
  • 22:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T384592)', diff saved to https://phabricator.wikimedia.org/P72310 and previous config saved to /var/cache/conftool/dbconfig/20250123-223057-marostegui.json
  • 22:13 cmooney@cumin1002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cloudsw2-d5-eqiad
  • 22:12 cmooney@cumin1002: START - Cookbook sre.network.tls for network device cloudsw2-d5-eqiad
  • 22:10 cjming: end of UTC late backport window
  • 22:09 cjming@deploy2002: Finished scap sync-world: Backport for For Parsoid calls, treat preprocessing as starting in SOL state (T382464) (duration: 11m 45s)
  • 22:02 cjming@deploy2002: ssastry, cjming: Continuing with sync
  • 22:02 cjming@deploy2002: ssastry, cjming: Backport for For Parsoid calls, treat preprocessing as starting in SOL state (T382464) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:59 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on netflow1002.eqiad.wmnet with reason: disabling alerts as I'm running gnmic manually rather than with systemd
  • 21:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T384592)', diff saved to https://phabricator.wikimedia.org/P72309 and previous config saved to /var/cache/conftool/dbconfig/20250123-215855-marostegui.json
  • 21:58 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 21:58 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 21:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T384592)', diff saved to https://phabricator.wikimedia.org/P72308 and previous config saved to /var/cache/conftool/dbconfig/20250123-215828-marostegui.json
  • 21:57 cjming@deploy2002: Started scap sync-world: Backport for For Parsoid calls, treat preprocessing as starting in SOL state (T382464)
  • 21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P72307 and previous config saved to /var/cache/conftool/dbconfig/20250123-214320-marostegui.json
  • 21:36 cjming@deploy2002: Finished scap sync-world: Backport for Profiler: centralize metrics send to a function (duration: 15m 28s)
  • 21:29 cjming@deploy2002: cwhite, cjming: Continuing with sync
  • 21:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P72306 and previous config saved to /var/cache/conftool/dbconfig/20250123-212813-marostegui.json
  • 21:25 cjming@deploy2002: cwhite, cjming: Backport for Profiler: centralize metrics send to a function synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:20 cjming@deploy2002: Started scap sync-world: Backport for Profiler: centralize metrics send to a function
  • 21:14 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow7001.magru.wmnet with reason: disabling alerts as I'm running gnmic manually rather than with systemd
  • 21:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T384592)', diff saved to https://phabricator.wikimedia.org/P72305 and previous config saved to /var/cache/conftool/dbconfig/20250123-211306-marostegui.json
  • 20:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T384592)', diff saved to https://phabricator.wikimedia.org/P72304 and previous config saved to /var/cache/conftool/dbconfig/20250123-204245-marostegui.json
  • 20:42 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 20:12 cdanis@deploy2002: cdanis: Continuing with sync
  • 20:11 cdanis@deploy2002: cdanis: Backport for tracing: lowercase headers before processing them (T384629) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:10 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 20:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T384592)', diff saved to https://phabricator.wikimedia.org/P72303 and previous config saved to /var/cache/conftool/dbconfig/20250123-201038-marostegui.json
  • 20:07 cdanis@deploy2002: Started scap sync-world: Backport for tracing: lowercase headers before processing them (T384629)
  • 19:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P72302 and previous config saved to /var/cache/conftool/dbconfig/20250123-195531-marostegui.json
  • 19:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P72301 and previous config saved to /var/cache/conftool/dbconfig/20250123-194024-marostegui.json
  • 19:33 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.44.0-wmf.13 refs T382364
  • 19:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T384592)', diff saved to https://phabricator.wikimedia.org/P72300 and previous config saved to /var/cache/conftool/dbconfig/20250123-192517-marostegui.json
  • 19:22 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudgw1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 19:22 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudgw1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 19:21 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudgw1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 19:21 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudgw1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 19:19 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudgw1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 19:19 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudgw1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 19:18 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:18 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt cloudgw1004 - vriley@cumin1002"
  • 19:18 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt cloudgw1004 - vriley@cumin1002"
  • 19:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T384592)', diff saved to https://phabricator.wikimedia.org/P72299 and previous config saved to /var/cache/conftool/dbconfig/20250123-191808-marostegui.json
  • 19:18 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 19:15 brennen: 1.44.0-wmf.13 train (T382364): no current blockers, logs relatively clean, rolling to all wikis.
  • 19:14 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 19:14 vriley@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 19:11 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 19:04 taavi: fix my netbox account T373702
  • 18:44 cmooney@cumin1002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-eqdfw
  • 18:44 cmooney@cumin1002: START - Cookbook sre.network.tls for network device cr2-eqdfw
  • 18:43 cmooney@cumin1002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-eqord
  • 18:43 cmooney@cumin1002: START - Cookbook sre.network.tls for network device cr2-eqord
  • 18:32 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 18:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T384592)', diff saved to https://phabricator.wikimedia.org/P72298 and previous config saved to /var/cache/conftool/dbconfig/20250123-183202-marostegui.json
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P72297 and previous config saved to /var/cache/conftool/dbconfig/20250123-181655-marostegui.json
  • 18:09 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1142-1147].eqiad.wmnet
  • 18:09 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1142-1147].eqiad.wmnet
  • 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P72296 and previous config saved to /var/cache/conftool/dbconfig/20250123-180148-marostegui.json
  • 17:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T384592)', diff saved to https://phabricator.wikimedia.org/P72295 and previous config saved to /var/cache/conftool/dbconfig/20250123-174641-marostegui.json
  • 17:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T384592)', diff saved to https://phabricator.wikimedia.org/P72294 and previous config saved to /var/cache/conftool/dbconfig/20250123-173852-marostegui.json
  • 17:38 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 17:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T384592)', diff saved to https://phabricator.wikimedia.org/P72293 and previous config saved to /var/cache/conftool/dbconfig/20250123-173830-marostegui.json
  • 17:33 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cassandra-dev2001
  • 17:33 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cassandra-dev2001
  • 17:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P72292 and previous config saved to /var/cache/conftool/dbconfig/20250123-172323-marostegui.json
  • 17:20 papaul: power down cassandra-dev2001 for maintenance
  • 17:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P72290 and previous config saved to /var/cache/conftool/dbconfig/20250123-170816-marostegui.json
  • 17:00 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:59 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T384592)', diff saved to https://phabricator.wikimedia.org/P72289 and previous config saved to /var/cache/conftool/dbconfig/20250123-165309-marostegui.json
  • 16:47 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1222.eqiad.wmnet with reason: Index rebuild
  • 16:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T384592)', diff saved to https://phabricator.wikimedia.org/P72288 and previous config saved to /var/cache/conftool/dbconfig/20250123-164453-marostegui.json
  • 16:44 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 16:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T384592)', diff saved to https://phabricator.wikimedia.org/P72287 and previous config saved to /var/cache/conftool/dbconfig/20250123-164431-marostegui.json
  • 16:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1222 T384597', diff saved to https://phabricator.wikimedia.org/P72286 and previous config saved to /var/cache/conftool/dbconfig/20250123-164415-marostegui.json
  • 16:43 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1162 to s2 primary T384597', diff saved to https://phabricator.wikimedia.org/P72285 and previous config saved to /var/cache/conftool/dbconfig/20250123-164322-root.json
  • 16:42 marostegui: Starting s2 eqiad failover from db1222 to db1162 - T384597
  • 16:40 dcausse@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 16:39 dcausse@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 16:37 dcausse@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 16:37 dcausse@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 16:34 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 16:33 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 16:33 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (1 nodes at a time) for ElasticSearch cluster cloudelastic: demonstration - bking@cumin2002 - T380752
  • 16:33 dcausse@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 16:31 dcausse@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 16:29 dcausse@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
  • 16:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P72284 and previous config saved to /var/cache/conftool/dbconfig/20250123-162924-marostegui.json
  • 16:28 dcausse@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
  • 16:26 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s2 T384597
  • 16:25 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1162 with weight 0 T384597', diff saved to https://phabricator.wikimedia.org/P72283 and previous config saved to /var/cache/conftool/dbconfig/20250123-162552-root.json
  • 16:23 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
  • 16:22 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ganeti2022.codfw.wmnet with reason: remove from cluster for reimage
  • 16:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1162 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72282 and previous config saved to /var/cache/conftool/dbconfig/20250123-162235-root.json
  • 16:22 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
  • 16:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P72281 and previous config saved to /var/cache/conftool/dbconfig/20250123-161417-marostegui.json
  • 16:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1162 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72280 and previous config saved to /var/cache/conftool/dbconfig/20250123-160730-root.json
  • 15:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T384592)', diff saved to https://phabricator.wikimedia.org/P72279 and previous config saved to /var/cache/conftool/dbconfig/20250123-155910-marostegui.json
  • 15:56 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
  • 15:55 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
  • 15:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T384592)', diff saved to https://phabricator.wikimedia.org/P72278 and previous config saved to /var/cache/conftool/dbconfig/20250123-155045-marostegui.json
  • 15:50 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 15:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T384592)', diff saved to https://phabricator.wikimedia.org/P72277 and previous config saved to /var/cache/conftool/dbconfig/20250123-155023-marostegui.json
  • 15:50 fceratto@cumin1002: dbctl commit (dc=all): 'Removing db2136 T384479', diff saved to https://phabricator.wikimedia.org/P72276 and previous config saved to /var/cache/conftool/dbconfig/20250123-155016-fceratto.json
  • 15:48 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2189
  • 15:48 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2189
  • 15:36 kamila@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1142-1147].eqiad.wmnet
  • 15:36 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1142-1147].eqiad.wmnet
  • 15:35 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.13 refs T382364
  • 15:35 kamila@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1142-1147].eqiad.wmnet
  • 15:35 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1142-1147].eqiad.wmnet
  • 15:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P72274 and previous config saved to /var/cache/conftool/dbconfig/20250123-153405-marostegui.json
  • 15:31 kamila@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1142-1147].eqiad.wmnet
  • 15:31 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1142-1147].eqiad.wmnet
  • 15:23 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1142.eqiad.wmnet with OS bookworm
  • 15:21 brennen: 1.44.0-wmf.13 train (T382364): unblocked, rolling to group1
  • 15:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P72273 and previous config saved to /var/cache/conftool/dbconfig/20250123-151858-marostegui.json
  • 15:18 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1147.eqiad.wmnet with OS bookworm
  • 15:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1143.eqiad.wmnet with OS bookworm
  • 15:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1146.eqiad.wmnet with OS bookworm
  • 15:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1145.eqiad.wmnet with OS bookworm
  • 15:06 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1144.eqiad.wmnet with OS bookworm
  • 15:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1142.eqiad.wmnet with reason: host reimage
  • 15:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T384592)', diff saved to https://phabricator.wikimedia.org/P72272 and previous config saved to /var/cache/conftool/dbconfig/20250123-150351-marostegui.json
  • 14:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1147.eqiad.wmnet with reason: host reimage
  • 14:56 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1143.eqiad.wmnet with reason: host reimage
  • 14:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T384592)', diff saved to https://phabricator.wikimedia.org/P72271 and previous config saved to /var/cache/conftool/dbconfig/20250123-145540-marostegui.json
  • 14:55 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 14:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T384592)', diff saved to https://phabricator.wikimedia.org/P72270 and previous config saved to /var/cache/conftool/dbconfig/20250123-145518-marostegui.json
  • 14:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1146.eqiad.wmnet with reason: host reimage
  • 14:53 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (1 nodes at a time) for ElasticSearch cluster cloudelastic: demonstration - bking@cumin2002 - T380752
  • 14:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1145.eqiad.wmnet with reason: host reimage
  • 14:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2022.codfw.wmnet
  • 14:46 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1144.eqiad.wmnet with reason: host reimage
  • 14:43 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1147.eqiad.wmnet with reason: host reimage
  • 14:43 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1146.eqiad.wmnet with reason: host reimage
  • 14:43 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1145.eqiad.wmnet with reason: host reimage
  • 14:43 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1143.eqiad.wmnet with reason: host reimage
  • 14:42 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1142.eqiad.wmnet with reason: host reimage
  • 14:42 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1144.eqiad.wmnet with reason: host reimage
  • 14:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P72269 and previous config saved to /var/cache/conftool/dbconfig/20250123-144011-marostegui.json
  • 14:39 vgutierrez: updating acme-chief on acmechief1002
  • 14:33 root@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1239.eqiad.wmnet with OS bookworm
  • 14:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1147
  • 14:28 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1147
  • 14:28 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1147.eqiad.wmnet with OS bookworm
  • 14:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1146
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1146
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1146.eqiad.wmnet with OS bookworm
  • 14:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1145
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1145
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1145.eqiad.wmnet with OS bookworm
  • 14:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1144
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1144
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1144.eqiad.wmnet with OS bookworm
  • 14:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1143
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1143
  • 14:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72267 and previous config saved to /var/cache/conftool/dbconfig/20250123-142715-root.json
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1143.eqiad.wmnet with OS bookworm
  • 14:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1142
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1142
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1142.eqiad.wmnet with OS bookworm
  • 14:26 TheresNoTime: UTC afternoon backport window done
  • 14:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P72266 and previous config saved to /var/cache/conftool/dbconfig/20250123-142504-marostegui.json
  • 14:21 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1142.eqiad.wmnet wikikube-worker1143.eqiad.wmnet wikikube-worker1144.eqiad.wmnet wikikube-worker1145.eqiad.wmnet wikikube-worker1146.eqiad.wmnet wikikube-worker1147.eqiad.wmnet on all recursors
  • 14:21 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1142.eqiad.wmnet wikikube-worker1143.eqiad.wmnet wikikube-worker1144.eqiad.wmnet wikikube-worker1145.eqiad.wmnet wikikube-worker1146.eqiad.wmnet wikikube-worker1147.eqiad.wmnet on all recursors
  • 14:21 samtar@deploy2002: Finished scap sync-world: Backport for cirrus: stop writing to wikitech index from the MW JobQueue, cirrus: cleanup unused settings (T374702) (duration: 12m 00s)
  • 14:14 samtar@deploy2002: dcausse, samtar: Continuing with sync
  • 14:13 samtar@deploy2002: dcausse, samtar: Backport for cirrus: stop writing to wikitech index from the MW JobQueue, cirrus: cleanup unused settings (T374702) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72264 and previous config saved to /var/cache/conftool/dbconfig/20250123-141209-root.json
  • 14:10 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1239.eqiad.wmnet with reason: host reimage
  • 14:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T384592)', diff saved to https://phabricator.wikimedia.org/P72263 and previous config saved to /var/cache/conftool/dbconfig/20250123-140957-marostegui.json
  • 14:09 samtar@deploy2002: Started scap sync-world: Backport for cirrus: stop writing to wikitech index from the MW JobQueue, cirrus: cleanup unused settings (T374702)
  • 14:07 root@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1239.eqiad.wmnet with reason: host reimage
  • 14:07 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-restart-tcp-mss-clamper (exit_code=0) rolling restart_daemons on A:cp-text_eqiad
  • 14:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T384592)', diff saved to https://phabricator.wikimedia.org/P72262 and previous config saved to /var/cache/conftool/dbconfig/20250123-140649-marostegui.json
  • 14:06 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 14:06 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 14:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repool db1165', diff saved to https://phabricator.wikimedia.org/P72261 and previous config saved to /var/cache/conftool/dbconfig/20250123-140524-marostegui.json
  • 13:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72259 and previous config saved to /var/cache/conftool/dbconfig/20250123-135704-root.json
  • 13:57 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72258 and previous config saved to /var/cache/conftool/dbconfig/20250123-135704-root.json
  • 13:56 fceratto@cumin1002: dbctl commit (dc=all): 'Depool db2140 T384480', diff saved to https://phabricator.wikimedia.org/P72257 and previous config saved to /var/cache/conftool/dbconfig/20250123-135655-fceratto.json
  • 13:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1006 to wikikube-worker1147
  • 13:54 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1147
  • 13:52 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1147
  • 13:52 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:52 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1006 to wikikube-worker1147 - kamila@cumin1002"
  • 13:50 root@cumin1002: START - Cookbook sre.hosts.reimage for host db1239.eqiad.wmnet with OS bookworm
  • 13:50 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1006 to wikikube-worker1147 - kamila@cumin1002"
  • 13:49 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-restart-tcp-mss-clamper rolling restart_daemons on A:cp-text_eqiad
  • 13:49 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-restart-tcp-mss-clamper (exit_code=0) rolling restart_daemons on A:cp-text_esams
  • 13:46 jynus@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1239.eqiad.wmnet with reason: reimage
  • 13:42 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1005 to wikikube-worker1146
  • 13:42 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1146
  • 13:41 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:41 godog: bounce mtail on centrallog2002 - high system cpu usage and perf top reports native_queued_spin_lock_slowpath
  • 13:41 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1006 to wikikube-worker1147
  • 13:38 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1146
  • 13:38 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:38 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1005 to wikikube-worker1146 - kamila@cumin1002"
  • 13:37 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1005 to wikikube-worker1146 - kamila@cumin1002"
  • 13:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72256 and previous config saved to /var/cache/conftool/dbconfig/20250123-133311-root.json
  • 13:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72255 and previous config saved to /var/cache/conftool/dbconfig/20250123-133304-root.json
  • 13:31 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-restart-tcp-mss-clamper rolling restart_daemons on A:cp-text_esams
  • 13:31 ladsgroup@deploy2002: Finished scap sync-world: Backport for file: Add caller to write queries (T384481) (duration: 09m 43s)
  • 13:30 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1004 to wikikube-worker1145
  • 13:30 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1145
  • 13:29 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow1002.eqiad.wmnet with reason: disabling alerts as I'm running gnmic manually rather than with systemd
  • 13:29 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:28 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1145
  • 13:28 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:28 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1004 to wikikube-worker1145 - kamila@cumin1002"
  • 13:28 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1005 to wikikube-worker1146
  • 13:28 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1004 to wikikube-worker1145 - kamila@cumin1002"
  • 13:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1003 to wikikube-worker1144
  • 13:24 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1144
  • 13:24 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 13:24 ladsgroup@deploy2002: ladsgroup: Backport for file: Add caller to write queries (T384481) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:23 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:23 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1144
  • 13:23 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:23 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1003 to wikikube-worker1144 - kamila@cumin1002"
  • 13:23 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1004 to wikikube-worker1145
  • 13:23 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1003 to wikikube-worker1144 - kamila@cumin1002"
  • 13:21 ladsgroup@deploy2002: Started scap sync-world: Backport for file: Add caller to write queries (T384481)
  • 13:20 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1002 to wikikube-worker1143
  • 13:20 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1143
  • 13:19 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:19 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1143
  • 13:19 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:18 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1002 to wikikube-worker1143 - kamila@cumin1002"
  • 13:18 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1003 to wikikube-worker1144
  • 13:18 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1002 to wikikube-worker1143 - kamila@cumin1002"
  • 13:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72253 and previous config saved to /var/cache/conftool/dbconfig/20250123-131805-root.json
  • 13:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72252 and previous config saved to /var/cache/conftool/dbconfig/20250123-131758-root.json
  • 13:17 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1162.eqiad.wmnet with reason: Index rebuild
  • 13:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from parse1001 to wikikube-worker1142
  • 13:16 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1142
  • 13:15 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:15 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1142
  • 13:14 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:14 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1001 to wikikube-worker1142 - kamila@cumin1002"
  • 13:14 dcausse@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 13:14 cmooney@cumin1002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c1a-eqiad
  • 13:14 cmooney@cumin1002: START - Cookbook sre.network.tls for network device fasw2-c1a-eqiad
  • 13:14 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1002 to wikikube-worker1143
  • 13:14 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming parse1001 to wikikube-worker1142 - kamila@cumin1002"
  • 13:14 dcausse@deploy2002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 13:11 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:10 kamila@cumin1002: START - Cookbook sre.hosts.rename from parse1001 to wikikube-worker1142
  • 13:09 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse[1001-1006].eqiad.wmnet
  • 13:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1162 T384597', diff saved to https://phabricator.wikimedia.org/P72251 and previous config saved to /var/cache/conftool/dbconfig/20250123-130937-marostegui.json
  • 13:07 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Index
  • 13:06 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host parse[1001-1006].eqiad.wmnet
  • 13:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72250 and previous config saved to /var/cache/conftool/dbconfig/20250123-130253-root.json
  • 12:51 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on netflow1002.eqiad.wmnet with reason: disabling alerts as I'm running gnmic manually rather than with systemd
  • 12:41 topranks: restarting gnmic.service on netflow1002
  • 12:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T384592)', diff saved to https://phabricator.wikimedia.org/P72249 and previous config saved to /var/cache/conftool/dbconfig/20250123-123708-marostegui.json
  • 12:37 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:36 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 12:31 marostegui: Deploy schema change on s8 codfw with replication dbmaint T384592
  • 12:23 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 12:23 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 12:21 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 12:21 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 12:17 federico3: Removing db2134 from zarcillo T384476
  • 12:13 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-restart-tcp-mss-clamper (exit_code=0) rolling restart_daemons on A:cp-text_drmrs
  • 12:13 fceratto@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2134.codfw.wmnet
  • 12:13 fceratto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:13 fceratto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2134.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1002"
  • 12:12 fceratto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2134.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1002"
  • 12:05 fceratto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:00 fceratto@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2134.codfw.wmnet
  • 11:53 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-restart-tcp-mss-clamper rolling restart_daemons on A:cp-text_drmrs
  • 11:52 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-restart-tcp-mss-clamper (exit_code=0) rolling restart_daemons on A:cp-text_codfw
  • 11:51 urbanecm@deploy2002: Finished scap sync-world: Backport for Remove GEInfoboxTemplatesTest (T384254), Remove GEInfoboxTemplatesTest (T384254) (duration: 15m 23s)
  • 11:49 root@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1204.eqiad.wmnet with OS bookworm
  • 11:37 vgutierrez: upload acme-chief 0.38 to apt.wm.org (bookworm-wikimedia)
  • 11:35 urbanecm@deploy2002: Started scap sync-world: Backport for Remove GEInfoboxTemplatesTest (T384254), Remove GEInfoboxTemplatesTest (T384254)
  • 11:35 urbanecm@deploy2002: Sync cancelled.
  • 11:34 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-restart-tcp-mss-clamper rolling restart_daemons on A:cp-text_codfw
  • 11:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2022.codfw.wmnet
  • 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2022.codfw.wmnet
  • 11:31 fceratto@cumin1002: dbctl commit (dc=all): 'Depool db2136 T384479', diff saved to https://phabricator.wikimedia.org/P72247 and previous config saved to /var/cache/conftool/dbconfig/20250123-113157-fceratto.json
  • 11:28 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-restart-tcp-mss-clamper (exit_code=0) rolling restart_daemons on A:cp-text_eqsin
  • 11:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2022.codfw.wmnet
  • 11:26 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
  • 11:23 root@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
  • 11:19 urbanecm@deploy2002: urbanecm: Backport for Remove GEInfoboxTemplatesTest (T384254), Remove GEInfoboxTemplatesTest (T384254) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 11:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 11:14 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2032.codfw.wmnet to cluster codfw and group B
  • 11:14 urbanecm@deploy2002: Started scap sync-world: Backport for Remove GEInfoboxTemplatesTest (T384254), Remove GEInfoboxTemplatesTest (T384254)
  • 11:14 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2032.codfw.wmnet to cluster codfw and group B
  • 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
  • 11:08 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-restart-tcp-mss-clamper rolling restart_daemons on A:cp-text_eqsin
  • 11:06 root@cumin1002: START - Cookbook sre.hosts.reimage for host db1204.eqiad.wmnet with OS bookworm
  • 11:04 jynus@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1205.eqiad.wmnet with reason: os upgrade
  • 11:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
  • 11:04 jynus@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1204.eqiad.wmnet with reason: os upgrade
  • 10:57 jynus: pausing media backups on eqiad for maintenance T383902
  • 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2032.codfw.wmnet with OS bookworm
  • 10:46 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 10:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2032.codfw.wmnet with reason: host reimage
  • 10:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2032.codfw.wmnet with reason: host reimage
  • 10:26 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on ms-be2075.codfw.wmnet with reason: hardware broken awaiting vendor action
  • 10:24 dcausse@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:24 dcausse@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:22 dcausse@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:19 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-restart-tcp-mss-clamper (exit_code=0) rolling restart_daemons on A:cp-text_magru
  • 10:18 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams and A:cp
  • 10:14 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2032.codfw.wmnet with OS bookworm
  • 10:10 dcausse@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:08 moritzm: installing routinator security updates
  • 10:05 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:05 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:04 btullis@dns1004: END - running authdns-update
  • 10:02 btullis@dns1004: START - running authdns-update
  • 10:01 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-restart-tcp-mss-clamper rolling restart_daemons on A:cp-text_magru
  • 09:55 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ganeti2032.codfw.wmnet with reason: remove from cluster for reimage
  • 09:54 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-restart-tcp-mss-clamper (exit_code=0) rolling restart_daemons on A:cp-text_ulsfo
  • 09:53 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams and A:cp
  • 09:53 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams and A:cp
  • 09:51 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:51 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
  • 09:35 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-restart-tcp-mss-clamper rolling restart_daemons on A:cp-text_ulsfo
  • 09:30 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams and A:cp
  • 09:22 dcausse@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:22 dcausse@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:57 dcausse@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:57 dcausse@deploy2002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:53 dcausse@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:52 dcausse@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:49 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:48 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repool pc2 after kernel reboots', diff saved to https://phabricator.wikimedia.org/P72244 and previous config saved to /var/cache/conftool/dbconfig/20250123-083524-marostegui.json
  • 08:26 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2012.codfw.wmnet with reason: Kernel reboot
  • 08:26 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1012.eqiad.wmnet with reason: Kernel reboot
  • 08:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depool pc2 for kernel reboots', diff saved to https://phabricator.wikimedia.org/P72242 and previous config saved to /var/cache/conftool/dbconfig/20250123-082545-marostegui.json
  • 08:03 moritzm: installing glibc updates on bullseye
  • 07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repool pc1 after kernel reboots', diff saved to https://phabricator.wikimedia.org/P72241 and previous config saved to /var/cache/conftool/dbconfig/20250123-074759-marostegui.json
  • 07:36 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1011.eqiad.wmnet with reason: Kernel reboot
  • 07:36 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2011.codfw.wmnet with reason: Kernel reboot
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depool pc1 for kernel reboots', diff saved to https://phabricator.wikimedia.org/P72240 and previous config saved to /var/cache/conftool/dbconfig/20250123-073557-marostegui.json
  • 07:29 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 07:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
  • 07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Remove es1022 from dbctl T384566', diff saved to https://phabricator.wikimedia.org/P72239 and previous config saved to /var/cache/conftool/dbconfig/20250123-071529-root.json
  • 07:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es1021.eqiad.wmnet
  • 07:08 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:08 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es1021.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 07:08 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es1021.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 07:01 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 06:55 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts es1021.eqiad.wmnet
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2189 T383709', diff saved to https://phabricator.wikimedia.org/P72237 and previous config saved to /var/cache/conftool/dbconfig/20250123-064241-marostegui.json
  • 06:42 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 6:00:00 on db2189.codfw.wmnet with reason: Onsite work
  • 06:41 marostegui: Powering off db2189 for onsite maintenance T383709
  • 02:35 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 02:10 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1012.eqiad.wmnet with reason: host reimage
  • 02:06 andrew@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1012.eqiad.wmnet with reason: host reimage
  • 01:50 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 01:49 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 01:27 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1012.eqiad.wmnet with reason: host reimage
  • 01:23 andrew@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1012.eqiad.wmnet with reason: host reimage
  • 01:06 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 01:06 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 01:00 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 00:59 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 00:54 tzatziki: removing 2 files for legal compliance
  • 00:44 tzatziki: removing 1 file for legal compliance
  • 00:42 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 00:41 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye

2025-01-22

  • 23:58 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 23:57 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 22:24 dmartin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 22:23 dmartin@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 22:23 dmartin@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 22:22 dmartin@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 22:20 dmartin@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 22:20 dmartin@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 22:20 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1135-1141].eqiad.wmnet
  • 22:20 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1135-1141].eqiad.wmnet
  • 22:13 dmartin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 22:12 dmartin@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 22:12 dmartin@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 22:11 dmartin@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 22:06 dmartin@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 22:06 dmartin@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 22:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1139.eqiad.wmnet with OS bookworm
  • 21:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1138.eqiad.wmnet with OS bookworm
  • 21:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1137.eqiad.wmnet with OS bookworm
  • 21:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1140.eqiad.wmnet with OS bookworm
  • 21:49 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1135.eqiad.wmnet with OS bookworm
  • 21:45 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1141.eqiad.wmnet with OS bookworm
  • 21:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1139.eqiad.wmnet with reason: host reimage
  • 21:42 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1136.eqiad.wmnet with OS bookworm
  • 21:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1138.eqiad.wmnet with reason: host reimage
  • 21:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1137.eqiad.wmnet with reason: host reimage
  • 21:36 dzahn@dns1004: END - running authdns-update
  • 21:34 cjming: end of UTC late backport window
  • 21:34 dzahn@dns1004: START - running authdns-update
  • 21:34 cjming@deploy2002: Finished scap sync-world: Backport for Add a few more contextual attributes to web base (T373715) (duration: 11m 41s)
  • 21:33 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on netflow7001.magru.wmnet with reason: disabling alerts as I'm running gnmic manually rather than with systemd
  • 21:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1140.eqiad.wmnet with reason: host reimage
  • 21:29 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1135.eqiad.wmnet with reason: host reimage
  • 21:27 cjming@deploy2002: cjming: Continuing with sync
  • 21:27 cjming@deploy2002: cjming: Backport for Add a few more contextual attributes to web base (T373715) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:26 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 21:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1141.eqiad.wmnet with reason: host reimage
  • 21:22 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1136.eqiad.wmnet with reason: host reimage
  • 21:22 cjming@deploy2002: Started scap sync-world: Backport for Add a few more contextual attributes to web base (T373715)
  • 21:20 cjming@deploy2002: Finished scap sync-world: Backport for Enable ExLab test 1 experiment to wikitech (T373715) (duration: 13m 22s)
  • 21:20 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1135.eqiad.wmnet with reason: host reimage
  • 21:20 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1141.eqiad.wmnet with reason: host reimage
  • 21:20 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1139.eqiad.wmnet with reason: host reimage
  • 21:19 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1140.eqiad.wmnet with reason: host reimage
  • 21:19 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1138.eqiad.wmnet with reason: host reimage
  • 21:19 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1137.eqiad.wmnet with reason: host reimage
  • 21:19 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1136.eqiad.wmnet with reason: host reimage
  • 21:14 cjming@deploy2002: cjming: Continuing with sync
  • 21:13 cjming@deploy2002: cjming: Backport for Enable ExLab test 1 experiment to wikitech (T373715) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:12 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 21:07 cjming@deploy2002: Started scap sync-world: Backport for Enable ExLab test 1 experiment to wikitech (T373715)
  • 21:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1135
  • 21:04 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1135
  • 21:04 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1135.eqiad.wmnet with OS bookworm
  • 21:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1141
  • 21:04 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1141
  • 21:04 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1141.eqiad.wmnet with OS bookworm
  • 21:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1139
  • 21:04 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1139
  • 21:04 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1139.eqiad.wmnet with OS bookworm
  • 21:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1140
  • 21:03 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1140
  • 21:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1140.eqiad.wmnet with OS bookworm
  • 21:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1138
  • 21:03 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1138
  • 21:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1138.eqiad.wmnet with OS bookworm
  • 21:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1137
  • 21:03 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1137
  • 21:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1137.eqiad.wmnet with OS bookworm
  • 21:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1136
  • 21:03 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1136
  • 21:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1136.eqiad.wmnet with OS bookworm
  • 21:02 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1135.eqiad.wmnet wikikube-worker1136.eqiad.wmnet wikikube-worker1137.eqiad.wmnet wikikube-worker1138.eqiad.wmnet wikikube-worker1139.eqiad.wmnet wikikube-worker1140.eqiad.wmnet wikikube-worker1141.eqiad.wmnet on all recursors
  • 21:02 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1135.eqiad.wmnet wikikube-worker1136.eqiad.wmnet wikikube-worker1137.eqiad.wmnet wikikube-worker1138.eqiad.wmnet wikikube-worker1139.eqiad.wmnet wikikube-worker1140.eqiad.wmnet wikikube-worker1141.eqiad.wmnet on all recursors
  • 21:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1488 to wikikube-worker1141
  • 21:01 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1141
  • 21:00 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1141
  • 21:00 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:00 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1488 to wikikube-worker1141 - kamila@cumin1002"
  • 21:00 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1488 to wikikube-worker1141 - kamila@cumin1002"
  • 20:59 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1137
  • 20:58 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1487 to wikikube-worker1140
  • 20:58 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1137
  • 20:58 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1140
  • 20:57 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 20:56 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1140
  • 20:56 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1488 to wikikube-worker1141
  • 20:56 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:56 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1487 to wikikube-worker1140 - kamila@cumin1002"
  • 20:56 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1487 to wikikube-worker1140 - kamila@cumin1002"
  • 20:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1486 to wikikube-worker1139
  • 20:53 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1139
  • 20:52 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 20:51 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1139
  • 20:51 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:51 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1486 to wikikube-worker1139 - kamila@cumin1002"
  • 20:51 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1487 to wikikube-worker1140
  • 20:51 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1486 to wikikube-worker1139 - kamila@cumin1002"
  • 20:49 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1485 to wikikube-worker1138
  • 20:48 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1138
  • 20:48 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 20:47 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1486 to wikikube-worker1139
  • 20:47 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1138
  • 20:47 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:47 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1485 to wikikube-worker1138 - kamila@cumin1002"
  • 20:47 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1485 to wikikube-worker1138 - kamila@cumin1002"
  • 20:45 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1484 to wikikube-worker1137
  • 20:44 kamila@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wikikube-worker1137
  • 20:43 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 20:43 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1485 to wikikube-worker1138
  • 20:42 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1137
  • 20:42 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:42 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1484 to wikikube-worker1137 - kamila@cumin1002"
  • 20:42 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1484 to wikikube-worker1137 - kamila@cumin1002"
  • 20:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1483 to wikikube-worker1136
  • 20:40 aokoth@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Update
  • 20:40 dzahn@cumin2002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: security release 20250122
  • 20:40 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1136
  • 20:39 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 20:38 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1136
  • 20:38 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:38 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1483 to wikikube-worker1136 - kamila@cumin1002"
  • 20:38 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1483 to wikikube-worker1136 - kamila@cumin1002"
  • 20:37 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1484 to wikikube-worker1137
  • 20:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1482 to wikikube-worker1135
  • 20:35 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1135
  • 20:34 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 20:34 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1135
  • 20:34 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:34 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1482 to wikikube-worker1135 - kamila@cumin1002"
  • 20:34 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1483 to wikikube-worker1136
  • 20:34 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1482 to wikikube-worker1135 - kamila@cumin1002"
  • 20:32 aokoth@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Update
  • 20:31 aokoth@cumin1002: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Update
  • 20:31 aokoth@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Update
  • 20:31 aokoth@cumin1002: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Update
  • 20:31 aokoth@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Update
  • 20:31 aokoth@cumin1002: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Update
  • 20:31 aokoth@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Update
  • 20:30 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 20:30 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1482 to wikikube-worker1135
  • 20:12 ebysans@deploy2002: Started deploy [analytics/refinery@28dce47]: Temp accounts deployment [analytics/refinery@28dce471]
  • 19:55 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 19:54 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 19:49 ladsgroup@deploy2002: Finished scap sync-world: Backport for file migration: Set group0 to write both (T384481) (duration: 12m 41s)
  • 19:42 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 19:39 ladsgroup@deploy2002: ladsgroup: Backport for file migration: Set group0 to write both (T384481) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:38 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 19:37 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 19:36 ladsgroup@deploy2002: Started scap sync-world: Backport for file migration: Set group0 to write both (T384481)
  • 19:33 ejegg: payments-wiki upgraded from ffda3c78 to 975be8a2
  • 19:17 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudnet1008-dev.eqiad.wmnet
  • 19:17 andrew@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:15 andrew@cumin1002: START - Cookbook sre.dns.netbox
  • 19:09 andrew@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudnet1008-dev.eqiad.wmnet
  • 19:08 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudnet1007-dev.eqiad.wmnet
  • 19:08 andrew@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:05 andrew@cumin1002: START - Cookbook sre.dns.netbox
  • 19:03 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 18:59 andrew@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudnet1007-dev.eqiad.wmnet
  • 18:59 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudnet1008-dev.eqiad.wmnet
  • 18:59 andrew@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:57 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudnet1007-dev.eqiad.wmnet
  • 18:57 andrew@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:57 andrew@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet1007-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 18:57 andrew@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet1007-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 18:56 andrew@cumin1002: START - Cookbook sre.dns.netbox
  • 18:51 andrew@cumin1002: START - Cookbook sre.dns.netbox
  • 18:43 andrew@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudnet1008-dev.eqiad.wmnet
  • 18:43 andrew@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudnet1007-dev.eqiad.wmnet
  • 18:35 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 18:25 swfrench@deploy2002: Finished scap sync-world: Backport for Add variables for incremental enrollment in PHP 8.1 (T377042) (duration: 22m 03s)
  • 18:22 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 18:22 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 18:16 swfrench@deploy2002: swfrench: Continuing with sync
  • 18:16 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 18:16 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 18:15 swfrench-wmf: verified PHP_ENGINE / PHP_ENGINE_STICKY enrollment behavior in mwdebug - T377042
  • 18:12 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1129-1134].eqiad.wmnet
  • 18:12 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1129-1134].eqiad.wmnet
  • 18:08 swfrench@deploy2002: swfrench: Backport for Add variables for incremental enrollment in PHP 8.1 (T377042) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:03 swfrench@deploy2002: Started scap sync-world: Backport for Add variables for incremental enrollment in PHP 8.1 (T377042)
  • 18:02 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 18:01 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 17:58 tchin@deploy2002: Finished deploy [airflow-dags/analytics@07104ff]: Deploying latest dags for analytics airflow instance T357684 (duration: 01m 53s)
  • 17:58 tchin@deploy2002: Started deploy [airflow-dags/analytics@07104ff]: Deploying latest dags for analytics airflow instance T357684
  • 17:56 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1130.eqiad.wmnet with OS bookworm
  • 17:55 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 17:55 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 17:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1130.eqiad.wmnet with reason: host reimage
  • 17:32 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1130.eqiad.wmnet with reason: host reimage
  • 17:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1134.eqiad.wmnet with OS bookworm
  • 17:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1132.eqiad.wmnet with OS bookworm
  • 17:23 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 17:23 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 17:21 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1133.eqiad.wmnet with OS bookworm
  • 17:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1131.eqiad.wmnet with OS bookworm
  • 17:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1130
  • 17:16 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1130
  • 17:16 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1130.eqiad.wmnet with OS bookworm
  • 17:14 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1130.eqiad.wmnet with OS bookworm
  • 17:14 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1129.eqiad.wmnet with OS bookworm
  • 17:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1134.eqiad.wmnet with reason: host reimage
  • 17:08 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1134.eqiad.wmnet with reason: host reimage
  • 17:06 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1132.eqiad.wmnet with reason: host reimage
  • 17:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1133.eqiad.wmnet with reason: host reimage
  • 16:58 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1131.eqiad.wmnet with reason: host reimage
  • 16:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1129.eqiad.wmnet with reason: host reimage
  • 16:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1134
  • 16:53 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1134
  • 16:53 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1134.eqiad.wmnet with OS bookworm
  • 16:52 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1134.eqiad.wmnet with OS bookworm
  • 16:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1133.eqiad.wmnet with reason: host reimage
  • 16:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1132.eqiad.wmnet with reason: host reimage
  • 16:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1131.eqiad.wmnet with reason: host reimage
  • 16:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1129.eqiad.wmnet with reason: host reimage
  • 16:49 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 16:49 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 16:41 vgutierrez@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:41 vgutierrez@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: clean up test-lb IPs - vgutierrez@cumin1002"
  • 16:41 vgutierrez@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: clean up test-lb IPs - vgutierrez@cumin1002"
  • 16:37 vgutierrez@cumin1002: START - Cookbook sre.dns.netbox
  • 16:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1134
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1134
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1134.eqiad.wmnet with OS bookworm
  • 16:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1133
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1133
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1133.eqiad.wmnet with OS bookworm
  • 16:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1132
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1132
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1132.eqiad.wmnet with OS bookworm
  • 16:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1131
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1131
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1131.eqiad.wmnet with OS bookworm
  • 16:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1130
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1130
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1130.eqiad.wmnet with OS bookworm
  • 16:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1129
  • 16:35 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1129
  • 16:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1129.eqiad.wmnet with OS bookworm
  • 16:33 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1129.eqiad.wmnet wikikube-worker1130.eqiad.wmnet wikikube-worker1131.eqiad.wmnet wikikube-worker1132.eqiad.wmnet wikikube-worker1133.eqiad.wmnet wikikube-worker1134.eqiad.wmnet on all recursors
  • 16:33 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1129.eqiad.wmnet wikikube-worker1130.eqiad.wmnet wikikube-worker1131.eqiad.wmnet wikikube-worker1132.eqiad.wmnet wikikube-worker1133.eqiad.wmnet wikikube-worker1134.eqiad.wmnet on all recursors
  • 16:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1481 to wikikube-worker1134
  • 16:32 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1134
  • 16:31 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1134
  • 16:31 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:31 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1481 to wikikube-worker1134 - kamila@cumin1002"
  • 16:30 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1481 to wikikube-worker1134 - kamila@cumin1002"
  • 16:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1480 to wikikube-worker1133
  • 16:28 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1133
  • 16:27 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:26 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1133
  • 16:26 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:26 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1480 to wikikube-worker1133 - kamila@cumin1002"
  • 16:26 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1480 to wikikube-worker1133 - kamila@cumin1002"
  • 16:26 vgutierrez@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic1 (T384486)
  • 16:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1479 to wikikube-worker1132
  • 16:23 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 16:23 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 16:23 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1132
  • 16:23 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1481 to wikikube-worker1134
  • 16:22 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:22 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1132
  • 16:22 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:22 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1479 to wikikube-worker1132 - kamila@cumin1002"
  • 16:22 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1480 to wikikube-worker1133
  • 16:22 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1479 to wikikube-worker1132 - kamila@cumin1002"
  • 16:20 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 16:20 vgutierrez@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic1 (T384486)
  • 16:20 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 16:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1478 to wikikube-worker1131
  • 16:19 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1131
  • 16:18 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:18 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1479 to wikikube-worker1132
  • 16:17 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1131
  • 16:17 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:17 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1478 to wikikube-worker1131 - kamila@cumin1002"
  • 16:17 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1478 to wikikube-worker1131 - kamila@cumin1002"
  • 16:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1477 to wikikube-worker1130
  • 16:14 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1130
  • 16:14 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:13 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1478 to wikikube-worker1131
  • 16:13 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1130
  • 16:13 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:13 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1477 to wikikube-worker1130 - kamila@cumin1002"
  • 16:13 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1477 to wikikube-worker1130 - kamila@cumin1002"
  • 16:10 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1476 to wikikube-worker1129
  • 16:10 moritzm: installing rsync regression updates on bullseye
  • 16:10 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1129
  • 16:09 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:09 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1129
  • 16:09 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1477 to wikikube-worker1130
  • 16:09 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:09 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1476 to wikikube-worker1129 - kamila@cumin1002"
  • 16:08 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1476 to wikikube-worker1129 - kamila@cumin1002"
  • 16:08 vgutierrez@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs[1020].eqiad.wmnet,lvs5006.eqsin.wmnet,lvs3010.esams.wmnet,lvs7003.magru.wmnet,lvs4010.ulsfo.wmnet} and A:lvs (T384486)
  • 16:07 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2166
  • 16:07 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 16:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2166
  • 16:05 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:04 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1476 to wikikube-worker1129
  • 16:04 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1476-1481].eqiad.wmnet
  • 16:03 vgutierrez@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs[1020].eqiad.wmnet,lvs5006.eqsin.wmnet,lvs3010.esams.wmnet,lvs7003.magru.wmnet,lvs4010.ulsfo.wmnet} and A:lvs (T384486)
  • 16:02 vgutierrez@cumin1002: END (FAIL) - Cookbook sre.loadbalancer.restart-pybal (exit_code=99) rolling-restart of pybal on A:lvs-secondary (T384486)
  • 15:55 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1476-1481].eqiad.wmnet
  • 15:54 vgutierrez@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary (T384486)
  • 15:24 fceratto@cumin1002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2189 slowly with 10 steps - Repool host after fixing indexes and performing OS updates
  • 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2004.codfw.wmnet to plain
  • 15:22 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2004.codfw.wmnet to plain
  • 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
  • 15:21 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
  • 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2004.codfw.wmnet to drbd
  • 15:11 swfrench-wmf: switched dbctl pc section objects to flavor "parsercache" - T383324
  • 14:56 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2004.codfw.wmnet to drbd
  • 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
  • 14:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
  • 14:28 urbanecm@deploy2002: Finished scap sync-world: Backport for Disable sidebar cache on the auth domain (T383916) (duration: 14m 06s)
  • 14:23 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2021.codfw.wmnet to cluster codfw and group B
  • 14:22 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2021.codfw.wmnet to cluster codfw and group B
  • 14:21 urbanecm@deploy2002: urbanecm, matmarex: Continuing with sync
  • 14:20 urbanecm@deploy2002: urbanecm, matmarex: Backport for Disable sidebar cache on the auth domain (T383916) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:14 urbanecm@deploy2002: Started scap sync-world: Backport for Disable sidebar cache on the auth domain (T383916)
  • 14:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2021.codfw.wmnet
  • 14:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2021.codfw.wmnet
  • 14:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2021.codfw.wmnet with OS bookworm
  • 13:56 urbanecm: Deployed security patch for T384244
  • 13:46 federico3: Removing db2133 from zarcillo T384343
  • 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2021.codfw.wmnet with reason: host reimage
  • 13:40 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 13:39 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 13:39 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 13:38 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 13:38 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 13:38 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2021.codfw.wmnet with reason: host reimage
  • 13:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2133.codfw.wmnet
  • 13:37 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:37 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2133.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 13:37 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2133.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 13:37 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 13:34 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 13:29 urbanecm: Deploying security patch for T384244
  • 13:29 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2133.codfw.wmnet
  • 13:15 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow7001.magru.wmnet with reason: disabling alerts as I'm running gnmic manually rather than with systemd
  • 13:08 fceratto@cumin1002: START - Cookbook sre.mysql.pool db2189 slowly with 10 steps - Repool host after fixing indexes and performing OS updates
  • 13:08 federico3: repooling db2189 as per T384202
  • 12:48 root@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1205.eqiad.wmnet with OS bookworm
  • 12:47 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2021.codfw.wmnet with OS bookworm
  • 12:34 Amir1: creating new schema of file tables everywhere (T368113)
  • 12:25 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
  • 12:24 hnowlan: disabling puppet on A:cp to test r/1113178
  • 12:22 root@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
  • 12:20 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
  • 12:19 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/zotero: apply
  • 12:18 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 12:17 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 12:17 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 12:16 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 12:11 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ganeti2021.codfw.wmnet with reason: remove from cluster for reimage
  • 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
  • 12:05 root@cumin1002: START - Cookbook sre.hosts.reimage for host db1205.eqiad.wmnet with OS bookworm
  • 12:02 jynus@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1205.eqiad.wmnet with reason: os upgrade
  • 11:43 vgutierrez: testing acme-chief 0.38 in acmechief-test1001
  • 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72220 and previous config saved to /var/cache/conftool/dbconfig/20250122-113404-root.json
  • 11:32 urbanecm@deploy2002: Finished scap sync-world: Backport for ValidatorFactory: Allow extensions to register validators (T384246), ValidatorFactory: Allow extensions to register validators (T384246) (duration: 11m 55s)
  • 11:20 urbanecm@deploy2002: Started scap sync-world: Backport for ValidatorFactory: Allow extensions to register validators (T384246), ValidatorFactory: Allow extensions to register validators (T384246)
  • 11:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72218 and previous config saved to /var/cache/conftool/dbconfig/20250122-111859-root.json
  • 11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Remove es1021 from dbctl T384418', diff saved to https://phabricator.wikimedia.org/P72217 and previous config saved to /var/cache/conftool/dbconfig/20250122-111428-root.json
  • 11:10 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2166.codfw.wmnet with reason: Onsite work
  • 11:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2166 T383709', diff saved to https://phabricator.wikimedia.org/P72216 and previous config saved to /var/cache/conftool/dbconfig/20250122-111019-marostegui.json
  • 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72215 and previous config saved to /var/cache/conftool/dbconfig/20250122-110354-root.json
  • 10:59 marostegui: Deploy schema change in codfw x1 with replication on the master dbmaint T381759
  • 10:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72214 and previous config saved to /var/cache/conftool/dbconfig/20250122-104848-root.json
  • 10:38 topranks: disable-puppet on netflow7001 to run gnmic in foreground for debug/development T369384
  • 10:38 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow7001.magru.wmnet with reason: disabling alerts as I'm running gnmic manually rather than with systemd
  • 10:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72213 and previous config saved to /var/cache/conftool/dbconfig/20250122-103342-root.json
  • 10:03 root@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1240.eqiad.wmnet with OS bookworm
  • 09:40 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1240.eqiad.wmnet with reason: host reimage
  • 09:38 root@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1240.eqiad.wmnet with reason: host reimage
  • 09:21 root@cumin1002: START - Cookbook sre.hosts.reimage for host db1240.eqiad.wmnet with OS bookworm
  • 09:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
  • 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
  • 09:16 jynus@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: os upgrade
  • 09:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
  • 08:24 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2019.codfw.wmnet to cluster codfw and group B
  • 08:22 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2019.codfw.wmnet to cluster codfw and group B
  • 08:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2019.codfw.wmnet
  • 08:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2019.codfw.wmnet
  • 06:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1021 T384418', diff saved to https://phabricator.wikimedia.org/P72212 and previous config saved to /var/cache/conftool/dbconfig/20250122-065157-marostegui.json
  • 06:20 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2175.codfw.wmnet with reason: rebuilding index
  • 02:35 cstone: civicrm upgraded from 502d9043 to f82f7c98

2025-01-21

  • 22:06 cjming: end of UTC late backport window
  • 22:02 cjming@deploy2002: Finished scap sync-world: Backport for Fix schema version for CTR instrument (T384333) (duration: 12m 05s)
  • 21:55 cjming@deploy2002: cjming: Continuing with sync
  • 21:55 cjming@deploy2002: cjming: Backport for Fix schema version for CTR instrument (T384333) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:50 cjming@deploy2002: Started scap sync-world: Backport for Fix schema version for CTR instrument (T384333)
  • 21:39 cjming@deploy2002: Finished scap sync-world: Backport for cawiki: Create templateeditor & protection level (T384145) (duration: 13m 24s)
  • 21:38 eileen: civicrm upgraded from c671fd23 to 502d9043
  • 21:32 cjming@deploy2002: zhaofjx, cjming: Continuing with sync
  • 21:32 cjming@deploy2002: zhaofjx, cjming: Backport for cawiki: Create templateeditor & protection level (T384145) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:25 cjming@deploy2002: Started scap sync-world: Backport for cawiki: Create templateeditor & protection level (T384145)
  • 21:23 cjming@deploy2002: Finished scap sync-world: Backport for Enable Vector 2022 and dark mode on Azerbaijani wikis (T383942) (duration: 15m 06s)
  • 21:16 cjming@deploy2002: cjming, jdlrobson: Continuing with sync
  • 21:14 cjming@deploy2002: cjming, jdlrobson: Backport for Enable Vector 2022 and dark mode on Azerbaijani wikis (T383942) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:08 cjming@deploy2002: Started scap sync-world: Backport for Enable Vector 2022 and dark mode on Azerbaijani wikis (T383942)
  • 20:35 ecarg@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 20:35 ecarg@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 20:14 herron@cumin1002: dbctl commit (dc=all): 'depool db2175', diff saved to https://phabricator.wikimedia.org/P72208 and previous config saved to /var/cache/conftool/dbconfig/20250121-201444-herron.json
  • 19:43 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow7001.magru.wmnet with reason: disabling alerts as I'm running gnmic manually rather than with systemd
  • 19:32 jebe@deploy2002: Finished deploy [airflow-dags/analytics_product@0aa9d7c]: (no justification provided) (duration: 00m 35s)
  • 19:31 jebe@deploy2002: Started deploy [airflow-dags/analytics_product@0aa9d7c]: (no justification provided)
  • 19:26 brennen@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.44.0-wmf.13 refs T382364
  • 19:25 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 19:25 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 18:49 swfrench-wmf: finished running puppet on A:cp-text for ATS mapping change - T377042
  • 18:38 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1126.eqiad.wmnet with OS bookworm
  • 18:35 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow7001.magru.wmnet with reason: disabling alerts as I'm running gnmic manually rather than with systemd
  • 17:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1128.eqiad.wmnet with reason: host reimage
  • 17:09 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1125.eqiad.wmnet with reason: host reimage
  • 17:06 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1124.eqiad.wmnet with reason: host reimage
  • 17:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 17:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 17:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1123.eqiad.wmnet with reason: host reimage
  • 17:00 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1128.eqiad.wmnet with reason: host reimage
  • 17:00 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1127.eqiad.wmnet with reason: host reimage
  • 17:00 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1124.eqiad.wmnet with reason: host reimage
  • 17:00 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1125.eqiad.wmnet with reason: host reimage
  • 16:59 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1123.eqiad.wmnet with reason: host reimage
  • 16:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 16:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1128
  • 16:44 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1128
  • 16:44 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1128.eqiad.wmnet with OS bookworm
  • 16:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1127
  • 16:44 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1127
  • 16:44 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1127.eqiad.wmnet with OS bookworm
  • 16:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1126
  • 16:44 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1126
  • 16:44 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1126.eqiad.wmnet with OS bookworm
  • 16:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1125
  • 16:44 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1125
  • 16:44 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1125.eqiad.wmnet with OS bookworm
  • 16:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1124
  • 16:44 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1124
  • 16:44 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1124.eqiad.wmnet with OS bookworm
  • 16:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1123
  • 16:44 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1123
  • 16:44 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1123.eqiad.wmnet with OS bookworm
  • 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2019.codfw.wmnet with OS bookworm
  • 16:35 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1123.eqiad.wmnet wikikube-worker1124.eqiad.wmnet wikikube-worker1125.eqiad.wmnet wikikube-worker1126.eqiad.wmnet wikikube-worker1127.eqiad.wmnet wikikube-worker1128.eqiad.wmnet on all recursors
  • 16:35 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1123.eqiad.wmnet wikikube-worker1124.eqiad.wmnet wikikube-worker1125.eqiad.wmnet wikikube-worker1126.eqiad.wmnet wikikube-worker1127.eqiad.wmnet wikikube-worker1128.eqiad.wmnet on all recursors
  • 16:34 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for restbase2037.codfw.wmnet
  • 16:34 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for restbase2037.codfw.wmnet
  • 16:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1475 to wikikube-worker1128
  • 16:34 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1128
  • 16:33 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1128
  • 16:33 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:33 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1475 to wikikube-worker1128 - kamila@cumin1002"
  • 16:32 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1475 to wikikube-worker1128 - kamila@cumin1002"
  • 16:30 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1474 to wikikube-worker1127
  • 16:29 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1127
  • 16:28 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:28 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1127
  • 16:28 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:28 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1474 to wikikube-worker1127 - kamila@cumin1002"
  • 16:28 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1475 to wikikube-worker1128
  • 16:27 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1474 to wikikube-worker1127 - kamila@cumin1002"
  • 16:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1473 to wikikube-worker1126
  • 16:24 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1126
  • 16:23 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:23 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1126
  • 16:23 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:23 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1473 to wikikube-worker1126 - kamila@cumin1002"
  • 16:23 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1474 to wikikube-worker1127
  • 16:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2019.codfw.wmnet with reason: host reimage
  • 16:23 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1473 to wikikube-worker1126 - kamila@cumin1002"
  • 16:20 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1472 to wikikube-worker1125
  • 16:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 16:20 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1125
  • 16:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 16:19 papaul: power down ms-be2088 for maintenance
  • 16:19 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2019.codfw.wmnet with reason: host reimage
  • 16:19 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:19 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1473 to wikikube-worker1126
  • 16:19 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1125
  • 16:18 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:18 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1472 to wikikube-worker1125 - kamila@cumin1002"
  • 16:18 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1472 to wikikube-worker1125 - kamila@cumin1002"
  • 16:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1471 to wikikube-worker1124
  • 16:15 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1124
  • 16:15 jynus@dns1004: END - running authdns-update
  • 16:14 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:14 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1124
  • 16:14 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:14 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1471 to wikikube-worker1124 - kamila@cumin1002"
  • 16:14 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1472 to wikikube-worker1125
  • 16:14 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1471 to wikikube-worker1124 - kamila@cumin1002"
  • 16:13 jynus@dns1004: START - running authdns-update
  • 16:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1470 to wikikube-worker1123
  • 16:11 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1123
  • 16:10 moritzm: installing gstreamer1.0 security updates
  • 16:10 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:10 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1123
  • 16:10 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:10 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1470 to wikikube-worker1123 - kamila@cumin1002"
  • 16:10 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1471 to wikikube-worker1124
  • 16:09 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1470 to wikikube-worker1123 - kamila@cumin1002"
  • 16:06 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:05 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1470 to wikikube-worker1123
  • 16:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 16:03 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1470-1475].eqiad.wmnet
  • 16:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 16:03 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main2010.codfw.wmnet
  • 16:03 jayme@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main2010.codfw.wmnet
  • 15:55 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1470-1475].eqiad.wmnet
  • 15:52 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2010
  • 15:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2010
  • 15:52 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudcephosd2004-dev
  • 15:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudcephosd2004-dev
  • 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2019.codfw.wmnet with OS bookworm
  • 15:43 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:41 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 15:28 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 15:27 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Add known-good regexes for WikibaseQualityConstraints (T380751) (duration: 29m 44s)
  • 15:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 15:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 15:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 15:20 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, audreypenven: Continuing with sync
  • 15:18 jayme@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2010.codfw.wmnet with reason: Server moving within rack
  • 15:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 15:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 15:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:04 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, audreypenven: Backport for Add known-good regexes for WikibaseQualityConstraints (T380751) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:58 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Add known-good regexes for WikibaseQualityConstraints (T380751)
  • 14:54 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for enable 2 factor authentication for enwiki page movers (T382879) (duration: 16m 10s)
  • 14:49 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:49 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:47 lucaswerkmeister-wmde@deploy2002: novemlinguae, lucaswerkmeister-wmde: Continuing with sync
  • 14:45 lucaswerkmeister-wmde@deploy2002: novemlinguae, lucaswerkmeister-wmde: Backport for enable 2 factor authentication for enwiki page movers (T382879) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2019.codfw.wmnet
  • 14:41 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ganeti2019.codfw.wmnet with reason: remove from cluster for reimage
  • 14:38 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for enable 2 factor authentication for enwiki page movers (T382879)
  • 14:36 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Remove KartographerParsoidSupport flag from configuration (T340134) (duration: 19m 46s)
  • 14:35 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:35 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:27 lucaswerkmeister-wmde@deploy2002: ihurbain, lucaswerkmeister-wmde: Continuing with sync
  • 14:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 14:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 14:24 lucaswerkmeister-wmde@deploy2002: ihurbain, lucaswerkmeister-wmde: Backport for Remove KartographerParsoidSupport flag from configuration (T340134) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 14:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 14:16 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Remove KartographerParsoidSupport flag from configuration (T340134)
  • 14:10 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 14:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 14:09 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons.
  • 13:55 Emperor: hard-reboot ms-fe1014
  • 13:54 mvernon@cumin2002: conftool action : set/pooled=no; selector: name=ms-fe1014.eqiad.wmnet
  • 13:43 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 13:43 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 13:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 13:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 13:22 btullis@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on an-launcher1002.eqiad.wmnet with reason: Migrating to kubernetes
  • 13:22 btullis@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-launcher1002.eqiad.wmnet with reason: Migrating to kubernetes
  • 13:20 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons.
  • 13:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 13:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 13:01 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 13:01 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 12:59 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 12:59 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 12:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72198 and previous config saved to /var/cache/conftool/dbconfig/20250121-124857-root.json
  • 12:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72197 and previous config saved to /var/cache/conftool/dbconfig/20250121-124750-root.json
  • 12:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2019.codfw.wmnet
  • 12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72194 and previous config saved to /var/cache/conftool/dbconfig/20250121-123352-root.json
  • 12:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2004.codfw.wmnet to plain
  • 12:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72193 and previous config saved to /var/cache/conftool/dbconfig/20250121-123245-root.json
  • 12:32 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2004.codfw.wmnet to plain
  • 12:32 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2019.codfw.wmnet
  • 12:32 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 12:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2019.codfw.wmnet
  • 12:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2004.codfw.wmnet to drbd
  • 12:27 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 12:27 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72190 and previous config saved to /var/cache/conftool/dbconfig/20250121-121847-root.json
  • 12:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72189 and previous config saved to /var/cache/conftool/dbconfig/20250121-121739-root.json
  • 12:15 kart_: Updated cxserver to 2025-01-20-172318-production (T377966, T377813)
  • 12:15 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 12:14 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 12:10 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 12:09 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 12:09 fceratto@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2189.codfw.wmnet
  • 12:08 hnowlan@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on restbase2037.codfw.wmnet with reason: Memory issues, rebooting frequently. Depooled. T383820
  • 12:05 federico3: updating db2189.codfw.wmnet for https://phabricator.wikimedia.org/T384202
  • 12:05 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 12:04 fceratto@cumin1002: START - Cookbook sre.mysql.upgrade for db2189.codfw.wmnet
  • 12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72187 and previous config saved to /var/cache/conftool/dbconfig/20250121-120341-root.json
  • 12:03 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 12:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72186 and previous config saved to /var/cache/conftool/dbconfig/20250121-120234-root.json
  • 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2004.codfw.wmnet to drbd
  • 11:59 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2019.codfw.wmnet
  • 11:59 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2019.codfw.wmnet
  • 11:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2019.codfw.wmnet
  • 11:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2019.codfw.wmnet
  • 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2203 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72185 and previous config saved to /var/cache/conftool/dbconfig/20250121-114836-root.json
  • 11:48 hnowlan@cumin2002: conftool action : set/pooled=no; selector: name=restbase2037.codfw.wmnet
  • 11:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72184 and previous config saved to /var/cache/conftool/dbconfig/20250121-114728-root.json
  • 11:44 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2189.codfw.wmnet with reason: rebuilding index
  • 11:34 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 11:32 jiji@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 11:32 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 11:31 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 11:30 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 11:30 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 11:29 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 11:29 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 11:29 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 11:28 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 11:28 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 11:27 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 11:26 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 11:26 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 11:26 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 11:25 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 11:19 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 11:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 11:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 11:14 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 11:14 jiji@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 11:13 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 11:13 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 11:13 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 11:12 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 11:11 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 11:11 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 11:10 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 11:10 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 11:09 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 11:09 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 11:08 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 11:01 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2024.codfw.wmnet to cluster codfw and group A
  • 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2024.codfw.wmnet to cluster codfw and group A
  • 10:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2148 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72182 and previous config saved to /var/cache/conftool/dbconfig/20250121-104658-root.json
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'db2207 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72181 and previous config saved to /var/cache/conftool/dbconfig/20250121-104559-root.json
  • 10:40 topranks: de-pref Chicago routes learnt on core routers in Dallas
  • 10:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2148 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72180 and previous config saved to /var/cache/conftool/dbconfig/20250121-103153-root.json
  • 10:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2207 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72179 and previous config saved to /var/cache/conftool/dbconfig/20250121-103053-root.json
  • 10:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2024.codfw.wmnet
  • 10:26 topranks: adjust VRRP priorities for public and analytics vlans on eqiad CRs to balance traffic
  • 10:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2024.codfw.wmnet
  • 10:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2148 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72177 and previous config saved to /var/cache/conftool/dbconfig/20250121-101648-root.json
  • 10:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2207 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72176 and previous config saved to /var/cache/conftool/dbconfig/20250121-101548-root.json
  • 10:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 10:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 10:03 moritzm: installing python-tornado security updates
  • 10:03 moritzm: installing intel-microcode security updates
  • 10:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2148 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72175 and previous config saved to /var/cache/conftool/dbconfig/20250121-100142-root.json
  • 10:00 marostegui@cumin1002: dbctl commit (dc=all): 'db2207 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72174 and previous config saved to /var/cache/conftool/dbconfig/20250121-100042-root.json
  • 10:00 godog: set udp_localhost-info retention.bytes=300000000000 on kafka-logging (back to original value) - T384233
  • 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2024.codfw.wmnet with OS bookworm
  • 09:52 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:52 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:47 godog: set udp_localhost-info retention.bytes=100000000000 on kafka-logging - T384233
  • 09:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2148 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72173 and previous config saved to /var/cache/conftool/dbconfig/20250121-094637-root.json
  • 09:45 marostegui@cumin1002: dbctl commit (dc=all): 'db2207 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72172 and previous config saved to /var/cache/conftool/dbconfig/20250121-094537-root.json
  • 09:39 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:39 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2024.codfw.wmnet with reason: host reimage
  • 09:33 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2024.codfw.wmnet with reason: host reimage
  • 09:32 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:32 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:30 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.resource-report (exit_code=0)
  • 09:30 jmm@cumin2002: START - Cookbook sre.ganeti.resource-report
  • 09:30 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:30 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:29 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:29 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2024.codfw.wmnet with OS bookworm
  • 09:04 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:03 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:59 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:58 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:51 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2216.codfw.wmnet onto db2203.codfw.wmnet
  • 08:36 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ganeti2024.codfw.wmnet with reason: remove from cluster for reimage
  • 08:02 kevinbazira@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 07:59 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 07:56 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 07:42 aqu@deploy2002: Finished deploy [airflow-dags/analytics@868de0c]: 202412 Backfill: Fixes on ExternalTaskMarker experiment (duration: 00m 32s)
  • 07:41 aqu@deploy2002: Started deploy [airflow-dags/analytics@868de0c]: 202412 Backfill: Fixes on ExternalTaskMarker experiment
  • 07:29 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Bugfixes - oblivian@cumin1002"
  • 07:29 oblivian@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Bugfixes - oblivian@cumin1002
  • 07:28 oblivian@cumin1002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Bugfixes - oblivian@cumin1002
  • 07:28 oblivian@cumin1002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Bugfixes - oblivian@cumin1002"
  • 07:02 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db2216.codfw.wmnet onto db2203.codfw.wmnet
  • 06:58 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: rebuilding index
  • 06:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2216 T384273', diff saved to https://phabricator.wikimedia.org/P72169 and previous config saved to /var/cache/conftool/dbconfig/20250121-065640-marostegui.json
  • 06:52 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2203.codfw.wmnet with reason: rebuilding index
  • 06:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2203 T383690', diff saved to https://phabricator.wikimedia.org/P72168 and previous config saved to /var/cache/conftool/dbconfig/20250121-065114-marostegui.json
  • 06:51 marostegui@dns1006: END - running authdns-update
  • 06:49 marostegui@dns1006: START - running authdns-update
  • 06:49 marostegui@dns1006: START - running authdns-update
  • 05:04 mwpresync@deploy2002: Pruned MediaWiki: 1.44.0-wmf.8 (duration: 04m 55s)
  • 05:01 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.44.0-wmf.13 refs T382364 (duration: 58m 53s)
  • 04:02 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.44.0-wmf.13 refs T382364

2025-01-20

  • 23:37 jynus: run OPTIMIZE TABLE recentchanges on db2207
  • 22:14 urbanecm: [urbanecm@deploy2002 ~]$ mwscript-k8s -f namespaceDupes.php -- --wiki=uzwiktionary --fix --add-prefix=BROKEN # T362620, logs posted to the task
  • 22:08 urbanecm: Run mwscript-k8s -f namespaceDupes.php -- --wiki=uzwiktionary --fix # T362620 # logs are at P72163
  • 22:03 urbanecm@deploy2002: Finished scap sync-world: Backport for Update uzwiktionary project namespace and site name (T362620) (duration: 12m 45s)
  • 21:56 urbanecm@deploy2002: nmw03, urbanecm: Continuing with sync
  • 21:54 urbanecm@deploy2002: nmw03, urbanecm: Backport for Update uzwiktionary project namespace and site name (T362620) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:50 urbanecm@deploy2002: Started scap sync-world: Backport for Update uzwiktionary project namespace and site name (T362620)
  • 21:43 urbanecm@deploy2002: Finished scap sync-world: Backport for Add simplewiki to mobile-anon-talk (T383161), beta: Enable $wgCampaignEventsEnableEventTopics (T380817) (duration: 13m 21s)
  • 21:36 urbanecm@deploy2002: pppery, daimona, urbanecm: Continuing with sync
  • 21:35 urbanecm@deploy2002: pppery, daimona, urbanecm: Backport for Add simplewiki to mobile-anon-talk (T383161), beta: Enable $wgCampaignEventsEnableEventTopics (T380817) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:30 urbanecm@deploy2002: Started scap sync-world: Backport for Add simplewiki to mobile-anon-talk (T383161), beta: Enable $wgCampaignEventsEnableEventTopics (T380817)
  • 19:45 Amir1: dropping blobs table where it's empty (T376627)
  • 18:00 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 17:59 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 17:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2024.codfw.wmnet
  • 17:02 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 02m 42s)
  • 16:59 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 13m 40s)
  • 16:21 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1117-1122].eqiad.wmnet
  • 16:20 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1117-1122].eqiad.wmnet
  • 16:05 brouberol@deploy2002: Finished deploy [airflow-dags/search@8c96899]: (no justification provided) (duration: 00m 31s)
  • 16:04 brouberol@deploy2002: Started deploy [airflow-dags/search@8c96899]: (no justification provided)
  • 16:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
  • 16:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 16:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 16:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 16:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 16:01 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 16:01 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 16:01 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 15:53 dcausse@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 15:53 dcausse@deploy2002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 15:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 15:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 15:45 moritzm: installing python-tornado security updates
  • 15:43 brouberol@deploy2002: Finished deploy [airflow-dags/analytics_test@516e8f2]: (no justification provided) (duration: 00m 21s)
  • 15:43 dcausse@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 15:43 brouberol@deploy2002: Started deploy [airflow-dags/analytics_test@516e8f2]: (no justification provided)
  • 15:43 dcausse@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 15:39 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Make known-good regex check strict (T380751) (duration: 10m 35s)
  • 15:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2282.codfw.wmnet
  • 15:37 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:37 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mw2282.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jelto@cumin1002"
  • 15:36 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mw2282.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jelto@cumin1002"
  • 15:32 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Continuing with sync
  • 15:32 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Make known-good regex check strict (T380751) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:29 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Make known-good regex check strict (T380751)
  • 15:26 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 15:22 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 15:21 jelto@cumin1002: START - Cookbook sre.hosts.decommission for hosts mw2282.codfw.wmnet
  • 15:21 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 15:18 jelto@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host mw2282.codfw.wmnet
  • 15:18 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2282.codfw.wmnet
  • 15:18 jelto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2282.codfw.wmnet with reason: decommissioning host
  • 15:18 volans: issued power off via mgmt UI for db2131 (failed to power off during decommissioning)
  • 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'db2212 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72162 and previous config saved to /var/cache/conftool/dbconfig/20250120-151402-root.json
  • 15:04 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:02 taavi@deploy2002: Finished scap sync-world: Backport for wikitech: Drop obsolete oauthadmin group (T384122), wikitech: Drop oathauth group (T384123) (duration: 12m 21s)
  • 14:58 marostegui@cumin1002: dbctl commit (dc=all): 'db2212 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72161 and previous config saved to /var/cache/conftool/dbconfig/20250120-145856-root.json
  • 14:56 taavi@deploy2002: taavi: Continuing with sync
  • 14:55 taavi@deploy2002: taavi: Backport for wikitech: Drop obsolete oauthadmin group (T384122), wikitech: Drop oathauth group (T384123) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2024.codfw.wmnet
  • 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2024.codfw.wmnet
  • 14:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2024.codfw.wmnet
  • 14:50 taavi@deploy2002: Started scap sync-world: Backport for wikitech: Drop obsolete oauthadmin group (T384122), wikitech: Drop oathauth group (T384123)
  • 14:48 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Add dedicated experimentation lab test module (T373715) (duration: 12m 12s)
  • 14:45 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1120.eqiad.wmnet with OS bookworm
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db2212 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72160 and previous config saved to /var/cache/conftool/dbconfig/20250120-144351-root.json
  • 14:42 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1119.eqiad.wmnet with OS bookworm
  • 14:41 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, cjming: Continuing with sync
  • 14:40 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, cjming: Backport for Add dedicated experimentation lab test module (T373715) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1122.eqiad.wmnet with OS bookworm
  • 14:38 vgutierrez@dns1004: END - running authdns-update
  • 14:36 vgutierrez@dns1004: START - running authdns-update
  • 14:36 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Add dedicated experimentation lab test module (T373715)
  • 14:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1118.eqiad.wmnet with OS bookworm
  • 14:30 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1121.eqiad.wmnet with OS bookworm
  • 14:30 oblivian@deploy2002: Finished scap sync-world: Backport for Use a bespoke database configuration for dumps (T382947) (duration: 18m 47s)
  • 14:29 volans@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on sretest1002.eqiad.wmnet with reason: testing cumin
  • 14:28 marostegui@cumin1002: dbctl commit (dc=all): 'db2212 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72159 and previous config saved to /var/cache/conftool/dbconfig/20250120-142846-root.json
  • 14:28 volans: upgraded cumin to v5.0.0 on cumin2002
  • 14:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1117.eqiad.wmnet with OS bookworm
  • 14:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1120.eqiad.wmnet with reason: host reimage
  • 14:23 oblivian@deploy2002: oblivian: Continuing with sync
  • 14:22 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2023.codfw.wmnet to cluster codfw and group A
  • 14:22 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1119.eqiad.wmnet with reason: host reimage
  • 14:22 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2023.codfw.wmnet to cluster codfw and group A
  • 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2023.codfw.wmnet
  • 14:18 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1122.eqiad.wmnet with reason: host reimage
  • 14:16 oblivian@deploy2002: oblivian: Backport for Use a bespoke database configuration for dumps (T382947) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:14 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1118.eqiad.wmnet with reason: host reimage
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'db2212 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72158 and previous config saved to /var/cache/conftool/dbconfig/20250120-141340-root.json
  • 14:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2023.codfw.wmnet
  • 14:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1121.eqiad.wmnet with reason: host reimage
  • 14:11 oblivian@deploy2002: Started scap sync-world: Backport for Use a bespoke database configuration for dumps (T382947)
  • 14:09 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1117.eqiad.wmnet with reason: host reimage
  • 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2023.codfw.wmnet with OS bookworm
  • 14:06 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1120.eqiad.wmnet with reason: host reimage
  • 14:06 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1121.eqiad.wmnet with reason: host reimage
  • 14:06 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1119.eqiad.wmnet with reason: host reimage
  • 14:06 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1122.eqiad.wmnet with reason: host reimage
  • 14:06 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1118.eqiad.wmnet with reason: host reimage
  • 14:06 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1117.eqiad.wmnet with reason: host reimage
  • 14:05 kevinbazira@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
  • 14:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
  • 13:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 13:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 13:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:56 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:56 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 13:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 13:55 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
  • 13:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 13:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 13:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1122
  • 13:50 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1122
  • 13:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1122.eqiad.wmnet with OS bookworm
  • 13:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1121
  • 13:50 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1121
  • 13:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1121.eqiad.wmnet with OS bookworm
  • 13:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1120
  • 13:50 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1120
  • 13:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1120.eqiad.wmnet with OS bookworm
  • 13:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1119
  • 13:50 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1119
  • 13:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1119.eqiad.wmnet with OS bookworm
  • 13:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1118
  • 13:50 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1118
  • 13:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1118.eqiad.wmnet with OS bookworm
  • 13:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1117
  • 13:50 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1117
  • 13:49 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1117.eqiad.wmnet with OS bookworm
  • 13:47 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1117.eqiad.wmnet wikikube-worker1118.eqiad.wmnet wikikube-worker1119.eqiad.wmnet wikikube-worker1120.eqiad.wmnet wikikube-worker1121.eqiad.wmnet wikikube-worker1122.eqiad.wmnet on all recursors
  • 13:47 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1117.eqiad.wmnet wikikube-worker1118.eqiad.wmnet wikikube-worker1119.eqiad.wmnet wikikube-worker1120.eqiad.wmnet wikikube-worker1121.eqiad.wmnet wikikube-worker1122.eqiad.wmnet on all recursors
  • 13:47 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1469 to wikikube-worker1122
  • 13:47 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1122
  • 13:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2023.codfw.wmnet with reason: host reimage
  • 13:44 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1122
  • 13:44 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:44 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1469 to wikikube-worker1122 - kamila@cumin1002"
  • 13:44 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1469 to wikikube-worker1122 - kamila@cumin1002"
  • 13:43 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2023.codfw.wmnet with reason: host reimage
  • 13:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:42 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1468 to wikikube-worker1121
  • 13:41 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1121
  • 13:40 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:40 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1121
  • 13:40 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:40 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1468 to wikikube-worker1121 - kamila@cumin1002"
  • 13:40 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1469 to wikikube-worker1122
  • 13:39 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1468 to wikikube-worker1121 - kamila@cumin1002"
  • 13:38 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1467 to wikikube-worker1120
  • 13:37 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1120
  • 13:36 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:36 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1120
  • 13:36 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:36 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1467 to wikikube-worker1120 - kamila@cumin1002"
  • 13:36 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1468 to wikikube-worker1121
  • 13:35 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1467 to wikikube-worker1120 - kamila@cumin1002"
  • 13:32 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1466 to wikikube-worker1119
  • 13:31 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1119
  • 13:31 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:30 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1467 to wikikube-worker1120
  • 13:30 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1119
  • 13:30 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:30 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1466 to wikikube-worker1119 - kamila@cumin1002"
  • 13:30 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1466 to wikikube-worker1119 - kamila@cumin1002"
  • 13:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1465 to wikikube-worker1118
  • 13:27 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1118
  • 13:26 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1118
  • 13:26 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:26 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:26 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1465 to wikikube-worker1118 - kamila@cumin1002"
  • 13:25 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1465 to wikikube-worker1118 - kamila@cumin1002"
  • 13:25 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1466 to wikikube-worker1119
  • 13:23 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1464 to wikikube-worker1117
  • 13:22 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1117
  • 13:21 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:21 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1117
  • 13:21 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:21 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1464 to wikikube-worker1117 - kamila@cumin1002"
  • 13:21 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1465 to wikikube-worker1118
  • 13:19 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1464 to wikikube-worker1117 - kamila@cumin1002"
  • 13:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:15 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:15 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1464 to wikikube-worker1117
  • 13:14 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: rebuilding index
  • 13:13 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] Remove tybanner campaigns (T380405), [Growth] Add fundraising- as a prefix for fundraising campaign (T380405) (duration: 12m 52s)
  • 13:12 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2189.codfw.wmnet with reason: rebuilding
  • 13:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P72157 and previous config saved to /var/cache/conftool/dbconfig/20250120-131157-marostegui.json
  • 13:11 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1464-1469].eqiad.wmnet
  • 13:07 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1464-1469].eqiad.wmnet
  • 13:06 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 13:05 urbanecm@deploy2002: urbanecm: Backport for [Growth] Remove tybanner campaigns (T380405), [Growth] Add fundraising- as a prefix for fundraising campaign (T380405) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:00 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] Remove tybanner campaigns (T380405), [Growth] Add fundraising- as a prefix for fundraising campaign (T380405)
  • 12:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 12:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 12:55 federico3: Removing db2131 from zarcillo T384001
  • 12:49 fceratto@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts db2131.codfw.wmnet
  • 12:49 fceratto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:49 fceratto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2131.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1002"
  • 12:48 fceratto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2131.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1002"
  • 12:47 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2023.codfw.wmnet with OS bookworm
  • 12:39 fceratto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:37 marostegui@cumin1002: dbctl commit (dc=all): 'Deploy pc7 in both eqiad and codfw T383235', diff saved to https://phabricator.wikimedia.org/P72155 and previous config saved to /var/cache/conftool/dbconfig/20250120-123741-marostegui.json
  • 12:34 fceratto@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2131.codfw.wmnet
  • 12:13 fceratto@cumin1002: dbctl commit (dc=all): 'Remove db2131 from dbctl T384001', diff saved to https://phabricator.wikimedia.org/P72154 and previous config saved to /var/cache/conftool/dbconfig/20250120-121318-fceratto.json
  • 11:48 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2131.codfw.wmnet with reason: Downtime db2131
  • 11:46 aqu@deploy2002: Finished deploy [airflow-dags/analytics@663a2f0]: 202412 Backfill: Fix ExternalTaskSensor missing filters (duration: 00m 35s)
  • 11:45 aqu@deploy2002: Started deploy [airflow-dags/analytics@663a2f0]: 202412 Backfill: Fix ExternalTaskSensor missing filters
  • 11:43 zabe@deploy2002: Finished scap sync-world: Backport for Fix logo issues (duration: 12m 25s)
  • 11:43 fceratto@cumin1002: dbctl commit (dc=all): 'Depool db2131 T384001', diff saved to https://phabricator.wikimedia.org/P72153 and previous config saved to /var/cache/conftool/dbconfig/20250120-114306-fceratto.json
  • 11:41 federico3: depooling db2131 as per T384001
  • 11:36 zabe@deploy2002: zabe: Continuing with sync
  • 11:36 zabe@deploy2002: zabe: Backport for Fix logo issues synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:31 zabe@deploy2002: Started scap sync-world: Backport for Fix logo issues
  • 11:23 marostegui@dns1006: END - running authdns-update
  • 11:22 marostegui@dns1006: START - running authdns-update
  • 11:16 root@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1245.eqiad.wmnet with OS bookworm
  • 11:14 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ganeti2023.codfw.wmnet with reason: remove from cluster for reimage
  • 11:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2023.codfw.wmnet
  • 10:54 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1245.eqiad.wmnet with reason: host reimage
  • 10:50 root@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1245.eqiad.wmnet with reason: host reimage
  • 10:34 root@cumin1002: START - Cookbook sre.hosts.reimage for host db1245.eqiad.wmnet with OS bookworm
  • 10:27 jynus@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1245.eqiad.wmnet with reason: os upgrade
  • 09:51 moritzm: installing intel-microcode security updates
  • 09:31 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Revert^2 "Pin wgCheckUserEnableTempAccountsOnboardingDialog as false" (duration: 11m 45s)
  • 09:24 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 09:24 dreamyjazz@deploy2002: dreamyjazz: Backport for Revert^2 "Pin wgCheckUserEnableTempAccountsOnboardingDialog as false" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to plain
  • 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to plain
  • 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2023.codfw.wmnet
  • 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2023.codfw.wmnet
  • 09:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2023.codfw.wmnet
  • 09:19 dreamyjazz@deploy2002: Started scap sync-world: Backport for Revert^2 "Pin wgCheckUserEnableTempAccountsOnboardingDialog as false"
  • 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
  • 09:16 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2212.codfw.wmnet with reason: rebuilding
  • 09:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2212', diff saved to https://phabricator.wikimedia.org/P72151 and previous config saved to /var/cache/conftool/dbconfig/20250120-091545-marostegui.json
  • 09:14 cjming: end of UTC morning backport window
  • 09:13 cjming@deploy2002: Finished scap sync-world: Backport for Enable the text experiment on testwiki only (T373715) (duration: 14m 32s)
  • 09:05 cjming@deploy2002: cjming: Continuing with sync
  • 09:03 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
  • 09:03 cjming@deploy2002: cjming: Backport for Enable the text experiment on testwiki only (T373715) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:58 cjming@deploy2002: Started scap sync-world: Backport for Enable the text experiment on testwiki only (T373715)
  • 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2023.codfw.wmnet
  • 08:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2023.codfw.wmnet
  • 08:55 awight@deploy2002: Finished scap sync-world: Backport for Switch to explicit numbering for Parsoid footnote markers (T382310) (duration: 23m 38s)
  • 08:45 awight@deploy2002: awight: Continuing with sync
  • 08:44 awight@deploy2002: awight: Backport for Switch to explicit numbering for Parsoid footnote markers (T382310) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:34 moritzm: installing Linux 6.1.124 on Bookworm hosts
  • 08:31 awight@deploy2002: Started scap sync-world: Backport for Switch to explicit numbering for Parsoid footnote markers (T382310)
  • 08:30 moritzm: installing python-aiohttp security updates
  • 08:15 marostegui: Deploy schema change on x1 codfw master db2196 with replication dbmaint T384176
  • 08:04 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2027.codfw.wmnet to cluster codfw and group A
  • 08:03 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2027.codfw.wmnet to cluster codfw and group A
  • 07:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2027.codfw.wmnet
  • 07:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2027.codfw.wmnet
  • 07:36 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on pc2017.codfw.wmnet,pc1017.eqiad.wmnet with reason: reorganizing pc7

2025-01-18

  • 07:40 Ammar: T384109 Ran mwscript-k8s --comment="T384109" -f -- extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=hywiktionary --logwiki=metawiki 'Մարիամ11' 'GrigorYAN'
  • 07:40 Ammar: T384109 Ran mwscript-k8s --comment="T384109" -f -- extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=loginwiki --logwiki=metawiki 'Ruhaina Sheikh' 'Lee Ailiseu'
  • 01:07 mstyles@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 01:07 mstyles@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 01:07 mstyles@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 01:07 mstyles@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 01:06 mstyles@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 01:06 mstyles@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 01:05 mstyles@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 01:04 mstyles@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 01:02 mstyles@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 01:01 mstyles@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply

2025-01-17

  • 18:18 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:18 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for newly assigned wmcs private ipv6 entries - cmooney@cumin1002"
  • 18:18 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for newly assigned wmcs private ipv6 entries - cmooney@cumin1002"
  • 18:13 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 18:12 cmooney@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 18:08 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 18:05 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:05 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for newly assigned wmcs private ipv6 entries - cmooney@cumin1002"
  • 18:05 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for newly assigned wmcs private ipv6 entries - cmooney@cumin1002"
  • 17:59 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 17:44 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw[2259,2263-2266].codfw.wmnet
  • 17:44 hnowlan@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:44 hnowlan@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mw[2259,2263-2266].codfw.wmnet decommissioned, removing all IPs except the asset tag one - hnowlan@cumin2002"
  • 17:42 denisse@deploy2002: Finished deploy [librenms/librenms@f049593]: Upgrade LibreNMS to 24.12.0 - T384036 (duration: 00m 14s)
  • 17:41 denisse@deploy2002: Started deploy [librenms/librenms@f049593]: Upgrade LibreNMS to 24.12.0 - T384036
  • 17:40 denisse: Upgrading LibreNMS in production - T384036
  • 17:40 hnowlan@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mw[2259,2263-2266].codfw.wmnet decommissioned, removing all IPs except the asset tag one - hnowlan@cumin2002"
  • 17:36 hnowlan@cumin2002: START - Cookbook sre.dns.netbox
  • 17:25 hnowlan@cumin2002: START - Cookbook sre.hosts.decommission for hosts mw[2259,2263-2266].codfw.wmnet
  • 17:12 xcollazo@deploy2002: Finished deploy [airflow-dags/analytics@b0cd4df]: Deploy latest DAGs for 'analytics' Airflow instance. T366542. (duration: 00m 32s)
  • 17:11 xcollazo@deploy2002: Started deploy [airflow-dags/analytics@b0cd4df]: Deploy latest DAGs for 'analytics' Airflow instance. T366542.
  • 15:40 topranks: manually restarting netbox service on netbox1003
  • 15:39 cmooney@cumin1002: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox
  • 15:26 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 15:20 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 15:20 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 15:13 moritzm: powercycle prometheus1006
  • 14:48 marostegui@cumin1002: dbctl commit (dc=all): 'es1046 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P72147 and previous config saved to /var/cache/conftool/dbconfig/20250117-144824-root.json
  • 14:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2027.codfw.wmnet with OS bookworm
  • 14:33 marostegui@cumin1002: dbctl commit (dc=all): 'es1046 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P72146 and previous config saved to /var/cache/conftool/dbconfig/20250117-143318-root.json
  • 14:18 marostegui@cumin1002: dbctl commit (dc=all): 'es1046 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P72145 and previous config saved to /var/cache/conftool/dbconfig/20250117-141813-root.json
  • 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2027.codfw.wmnet with reason: host reimage
  • 14:12 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2027.codfw.wmnet with reason: host reimage
  • 14:03 marostegui@cumin1002: dbctl commit (dc=all): 'es1046 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P72144 and previous config saved to /var/cache/conftool/dbconfig/20250117-140308-root.json
  • 14:01 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:01 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 14:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:48 marostegui@cumin1002: dbctl commit (dc=all): 'es1046 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P72143 and previous config saved to /var/cache/conftool/dbconfig/20250117-134801-root.json
  • 13:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:32 marostegui@cumin1002: dbctl commit (dc=all): 'es1046 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P72142 and previous config saved to /var/cache/conftool/dbconfig/20250117-133256-root.json
  • 13:17 marostegui@cumin1002: dbctl commit (dc=all): 'es1046 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P72140 and previous config saved to /var/cache/conftool/dbconfig/20250117-131751-root.json
  • 13:02 marostegui@cumin1002: dbctl commit (dc=all): 'es1046 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P72139 and previous config saved to /var/cache/conftool/dbconfig/20250117-130245-root.json
  • 12:47 marostegui@cumin1002: dbctl commit (dc=all): 'es1046 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P72138 and previous config saved to /var/cache/conftool/dbconfig/20250117-124740-root.json
  • 12:32 marostegui@cumin1002: dbctl commit (dc=all): 'es1046 (re)pooling @ 1%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P72137 and previous config saved to /var/cache/conftool/dbconfig/20250117-123235-root.json
  • 12:31 marostegui@cumin1002: dbctl commit (dc=all): 'Add es1046 to dbctl depooled T382569', diff saved to https://phabricator.wikimedia.org/P72136 and previous config saved to /var/cache/conftool/dbconfig/20250117-123153-marostegui.json
  • 12:25 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2141.codfw.wmnet with OS bookworm
  • 12:07 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1111-1116].eqiad.wmnet
  • 12:07 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1111-1116].eqiad.wmnet
  • 12:02 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 12:02 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 12:01 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2141.codfw.wmnet with reason: host reimage
  • 12:00 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 12:00 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 11:58 moritzm: installing Linux 6.1.124 on Bookworm hosts
  • 11:57 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2141.codfw.wmnet with reason: host reimage
  • 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2027.codfw.wmnet with OS bookworm
  • 11:41 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host db2141.codfw.wmnet with OS bookworm
  • 11:37 jynus@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: reimage
  • 11:31 kamila@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1111-1116].eqiad.wmnet
  • 11:31 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1111-1116].eqiad.wmnet
  • 10:58 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2240-2241].codfw.wmnet
  • 10:58 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2240-2241].codfw.wmnet
  • 10:55 jelto: homer 'cr*codfw*' commit 'T377877'
  • 10:54 jelto: homer 'lsw1-b3-codfw*' commit 'T377877'
  • 10:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2241.codfw.wmnet with OS bookworm
  • 10:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2241.codfw.wmnet with reason: host reimage
  • 10:26 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2241.codfw.wmnet with reason: host reimage
  • 10:19 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2240.codfw.wmnet with OS bookworm
  • 10:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2241
  • 10:08 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2241
  • 10:08 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2241.codfw.wmnet with OS bookworm
  • 10:08 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2241.codfw.wmnet with OS bookworm
  • 09:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2240.codfw.wmnet with reason: host reimage
  • 09:55 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2240.codfw.wmnet with reason: host reimage
  • 09:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2241
  • 09:42 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2241
  • 09:42 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2241
  • 09:42 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2241.codfw.wmnet 158.16.192.10.in-addr.arpa 8.5.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:42 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2241.codfw.wmnet 158.16.192.10.in-addr.arpa 8.5.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:42 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:42 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2241 - jelto@cumin1002"
  • 09:42 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2241 - jelto@cumin1002"
  • 09:39 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:39 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2241
  • 09:38 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2241.codfw.wmnet with OS bookworm
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2240
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2240
  • 09:37 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2240
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2240.codfw.wmnet 157.16.192.10.in-addr.arpa 7.5.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:37 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2240.codfw.wmnet 157.16.192.10.in-addr.arpa 7.5.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2240 - jelto@cumin1002"
  • 09:37 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2240 - jelto@cumin1002"
  • 09:33 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:33 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2240
  • 09:33 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2240.codfw.wmnet with OS bookworm
  • 09:14 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2240.codfw.wmnet wikikube-worker2241.codfw.wmnet on all recursors
  • 09:14 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2240.codfw.wmnet wikikube-worker2241.codfw.wmnet on all recursors
  • 09:13 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2240 wikikube-worker2241 on all recursors
  • 09:13 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2240 wikikube-worker2241 on all recursors
  • 09:08 Emperor: depool / restart / repool ms-fe2010 T360913
  • 09:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2311 to wikikube-worker2241
  • 09:07 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2241
  • 09:07 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2241
  • 09:07 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:07 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2311 to wikikube-worker2241 - jelto@cumin1002"
  • 09:06 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2311 to wikikube-worker2241 - jelto@cumin1002"
  • 09:03 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:03 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2311 to wikikube-worker2241
  • 09:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2310 to wikikube-worker2240
  • 09:01 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2240
  • 09:00 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2240
  • 09:00 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:00 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2310 to wikikube-worker2240 - jelto@cumin1002"
  • 08:59 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2310 to wikikube-worker2240 - jelto@cumin1002"
  • 08:56 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:55 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2310 to wikikube-worker2240
  • 08:35 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2282,2310-2311].codfw.wmnet
  • 08:33 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2282,2310-2311].codfw.wmnet
  • 07:48 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72130 and previous config saved to /var/cache/conftool/dbconfig/20250117-074806-root.json
  • 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72129 and previous config saved to /var/cache/conftool/dbconfig/20250117-073301-root.json
  • 07:17 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72128 and previous config saved to /var/cache/conftool/dbconfig/20250117-071755-root.json
  • 07:02 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72127 and previous config saved to /var/cache/conftool/dbconfig/20250117-070250-root.json
  • 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72125 and previous config saved to /var/cache/conftool/dbconfig/20250117-064745-root.json
  • 06:27 moritzm: installing rsync security regression updates
  • 05:33 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:32 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:32 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:31 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:27 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:26 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 00:23 zabe: zabe@deploy2002:/srv/mediawiki-staging$ mwscript-k8s -f -- createAndPromote.php --wiki=arbcom_zhwiki --sysop --bureaucrat --force ZhaoFJx
  • 00:21 zabe: zabe@deploy2002:/srv/mediawiki-staging$ mwscript-k8s -f -- createAndPromote.php --wiki=arbcom_zhwiki ZhaoFJx REDACTED
  • 00:18 zabe@deploy2002: Finished scap sync-world: Backport for Update interwiki cache, Update composer.lock, Activate arbcom_zhwiki (T380119) (duration: 11m 46s)
  • 00:11 zabe@deploy2002: zabe: Continuing with sync
  • 00:11 zabe@deploy2002: zabe: Backport for Update interwiki cache, Update composer.lock, Activate arbcom_zhwiki (T380119) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 00:07 zabe@deploy2002: Started scap sync-world: Backport for Update interwiki cache, Update composer.lock, Activate arbcom_zhwiki (T380119)

2025-01-16

  • 23:58 zabe@deploy2002: Finished scap sync-world: Creating arbcom_zhwiki (T380119) (duration: 12m 43s)
  • 23:51 zabe@deploy2002: zabe: Continuing with sync
  • 23:50 zabe@deploy2002: zabe: Creating arbcom_zhwiki (T380119) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:45 zabe@deploy2002: Started scap sync-world: Creating arbcom_zhwiki (T380119)
  • 23:28 toyofuku@deploy2002: Finished scap sync-world: Backport for Beta: Update schemas in InitialiseSettings-labs.php (T382080) (duration: 11m 23s)
  • 23:21 toyofuku@deploy2002: toyofuku, jdlrobson: Continuing with sync
  • 23:21 toyofuku@deploy2002: toyofuku, jdlrobson: Backport for Beta: Update schemas in InitialiseSettings-labs.php (T382080) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:17 toyofuku@deploy2002: Started scap sync-world: Backport for Beta: Update schemas in InitialiseSettings-labs.php (T382080)
  • 23:14 toyofuku@deploy2002: Finished scap sync-world: Backport for Stop expanding sections by default on Wiktionary (T376446) (duration: 11m 32s)
  • 23:07 toyofuku@deploy2002: jdlrobson, toyofuku: Continuing with sync
  • 23:07 toyofuku@deploy2002: jdlrobson, toyofuku: Backport for Stop expanding sections by default on Wiktionary (T376446) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:02 toyofuku@deploy2002: Started scap sync-world: Backport for Stop expanding sections by default on Wiktionary (T376446)
  • 23:01 toyofuku@deploy2002: Finished scap sync-world: Backport for Remove `wgVectorStickyHeader` from InitialiseSettings.php (T332728) (duration: 12m 13s)
  • 22:53 toyofuku@deploy2002: ksarabia, toyofuku: Continuing with sync
  • 22:53 toyofuku@deploy2002: ksarabia, toyofuku: Backport for Remove `wgVectorStickyHeader` from InitialiseSettings.php (T332728) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:49 toyofuku@deploy2002: Started scap sync-world: Backport for Remove `wgVectorStickyHeader` from InitialiseSettings.php (T332728)
  • 22:45 toyofuku@deploy2002: Finished scap sync-world: Backport for Configure streams for web empty search AB test (T380926) (duration: 16m 47s)
  • 22:38 toyofuku@deploy2002: bwang, toyofuku: Continuing with sync
  • 22:33 toyofuku@deploy2002: bwang, toyofuku: Backport for Configure streams for web empty search AB test (T380926) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:28 toyofuku@deploy2002: Started scap sync-world: Backport for Configure streams for web empty search AB test (T380926)
  • 22:15 cjming@deploy2002: Finished scap sync-world: Backport for Analytics - Set parameters to Refine content history reconcile (T369845) (duration: 12m 37s)
  • 22:08 cjming@deploy2002: aqu, cjming: Continuing with sync
  • 22:07 cjming@deploy2002: aqu, cjming: Backport for Analytics - Set parameters to Refine content history reconcile (T369845) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:03 cjming@deploy2002: Started scap sync-world: Backport for Analytics - Set parameters to Refine content history reconcile (T369845)
  • 22:01 cjming@deploy2002: Finished scap sync-world: Backport for testwiki: Enable MetricsPlatform stream config fetching and merging (T381964) (duration: 13m 10s)
  • 22:01 volans: uploaded cumin_5.0.0 to apt.wikimedia.org bullseye-wikimedia
  • 21:54 cjming@deploy2002: phuedx, cjming: Continuing with sync
  • 21:54 cjming@deploy2002: phuedx, cjming: Backport for testwiki: Enable MetricsPlatform stream config fetching and merging (T381964) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:49 cdanis@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 21:49 cdanis@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 21:48 cjming@deploy2002: Started scap sync-world: Backport for testwiki: Enable MetricsPlatform stream config fetching and merging (T381964)
  • 21:48 cdanis@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 21:47 cdanis@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 21:47 cjming@deploy2002: Finished scap sync-world: Backport for Enable MetricsPlatform extension everywhere (T381964) (duration: 11m 43s)
  • 21:40 cjming@deploy2002: cjming, phuedx: Continuing with sync
  • 21:40 cjming@deploy2002: cjming, phuedx: Backport for Enable MetricsPlatform extension everywhere (T381964) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:35 cjming@deploy2002: Started scap sync-world: Backport for Enable MetricsPlatform extension everywhere (T381964)
  • 21:33 cjming@deploy2002: Finished scap sync-world: Backport for Beta Cluster: Update MetricsPlatform extension config (T381964) (duration: 12m 55s)
  • 21:26 cjming@deploy2002: cjming, phuedx: Continuing with sync
  • 21:26 cjming@deploy2002: cjming, phuedx: Backport for Beta Cluster: Update MetricsPlatform extension config (T381964) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:20 cjming@deploy2002: Started scap sync-world: Backport for Beta Cluster: Update MetricsPlatform extension config (T381964)
  • 21:17 jsn@deploy2002: Finished scap sync-world: Backport for Increase Nuke max age to 90 days (attempt 2) (T380846) (duration: 13m 12s)
  • 21:09 jsn@deploy2002: jsn, chlod: Continuing with sync
  • 21:08 jsn@deploy2002: jsn, chlod: Backport for Increase Nuke max age to 90 days (attempt 2) (T380846) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:04 jsn@deploy2002: Started scap sync-world: Backport for Increase Nuke max age to 90 days (attempt 2) (T380846)
  • 20:59 catrope@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
  • 20:58 catrope@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
  • 20:58 catrope@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
  • 20:57 catrope@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
  • 20:51 catrope@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 20:50 catrope@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 20:28 kamila@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1111-1116].eqiad.wmnet
  • 20:28 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1111-1116].eqiad.wmnet
  • 20:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1114.eqiad.wmnet with OS bookworm
  • 20:08 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1114.eqiad.wmnet with reason: host reimage
  • 20:04 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1114.eqiad.wmnet with reason: host reimage
  • 19:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1114
  • 19:48 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1114
  • 19:48 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1114.eqiad.wmnet with OS bookworm
  • 19:19 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.44.0-wmf.12 refs T382363
  • 19:06 brennen: 1.44.0-wmf.12 train (T382363): no current blockers and logs calm, rolling to all wikis.
  • 18:40 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 18:39 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 18:38 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 18:38 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 18:37 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 18:37 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 18:28 swfrench@deploy2002: Finished scap sync-world: Deployment to switch next release files to 8.1 - T377040 (duration: 03m 50s)
  • 18:24 swfrench@deploy2002: Started scap sync-world: Deployment to switch next release files to 8.1 - T377040
  • 18:22 cmooney@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 138398
  • 18:22 cmooney@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 138398
  • 17:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1113.eqiad.wmnet with OS bookworm
  • 17:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1112.eqiad.wmnet with OS bookworm
  • 17:47 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1116.eqiad.wmnet with OS bookworm
  • 17:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1115.eqiad.wmnet with OS bookworm
  • 17:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1111.eqiad.wmnet with OS bookworm
  • 17:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1113.eqiad.wmnet with reason: host reimage
  • 17:34 zabe@deploy2002: Finished scap sync-world: T380119 (duration: 10m 59s)
  • 17:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1112.eqiad.wmnet with reason: host reimage
  • 17:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1116.eqiad.wmnet with reason: host reimage
  • 17:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1115.eqiad.wmnet with reason: host reimage
  • 17:23 zabe@deploy2002: Started scap sync-world: T380119
  • 17:23 zabe@deploy2002: sync-world aborted: T380119 (duration: 01m 16s)
  • 17:22 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1111.eqiad.wmnet with reason: host reimage
  • 17:22 zabe@deploy2002: Started scap sync-world: T380119
  • 17:19 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1116.eqiad.wmnet with reason: host reimage
  • 17:19 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1113.eqiad.wmnet with reason: host reimage
  • 17:19 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1115.eqiad.wmnet with reason: host reimage
  • 17:18 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1112.eqiad.wmnet with reason: host reimage
  • 17:18 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1111.eqiad.wmnet with reason: host reimage
  • 17:14 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1114.eqiad.wmnet on all recursors
  • 17:14 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1114.eqiad.wmnet on all recursors
  • 17:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1116
  • 17:03 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1116
  • 17:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1116.eqiad.wmnet with OS bookworm
  • 17:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1115
  • 17:03 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1115
  • 17:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1115.eqiad.wmnet with OS bookworm
  • 17:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1113
  • 17:03 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1113
  • 17:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1113.eqiad.wmnet with OS bookworm
  • 17:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1112
  • 17:02 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1112
  • 17:02 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1112.eqiad.wmnet with OS bookworm
  • 17:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1111
  • 17:02 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1111
  • 17:02 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1111.eqiad.wmnet with OS bookworm
  • 16:59 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1111.eqiad.wmnet wikikube-worker1112.eqiad.wmnet wikikube-worker1113.eqiad.wmnet wikikube-worker1114.eqiad.wmnet wikikube-worker1115.eqiad.wmnet wikikube-worker1116.eqiad.wmnet on all recursors
  • 16:59 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1111.eqiad.wmnet wikikube-worker1112.eqiad.wmnet wikikube-worker1113.eqiad.wmnet wikikube-worker1114.eqiad.wmnet wikikube-worker1115.eqiad.wmnet wikikube-worker1116.eqiad.wmnet on all recursors
  • 16:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1463 to wikikube-worker1116
  • 16:58 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1116
  • 16:56 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1116
  • 16:56 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:56 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1463 to wikikube-worker1116 - kamila@cumin1002"
  • 16:56 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1463 to wikikube-worker1116 - kamila@cumin1002"
  • 16:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1462 to wikikube-worker1115
  • 16:53 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1115
  • 16:52 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:52 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1115
  • 16:52 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:52 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1462 to wikikube-worker1115 - kamila@cumin1002"
  • 16:52 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1462 to wikikube-worker1115 - kamila@cumin1002"
  • 16:49 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1463 to wikikube-worker1116
  • 16:49 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1461 to wikikube-worker1114
  • 16:48 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1114
  • 16:48 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:47 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1114
  • 16:47 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:47 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1461 to wikikube-worker1114 - kamila@cumin1002"
  • 16:47 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1461 to wikikube-worker1114 - kamila@cumin1002"
  • 16:45 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1462 to wikikube-worker1115
  • 16:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1460 to wikikube-worker1113
  • 16:44 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1113
  • 16:43 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:43 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1113
  • 16:43 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:43 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1460 to wikikube-worker1113 - kamila@cumin1002"
  • 16:42 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1460 to wikikube-worker1113 - kamila@cumin1002"
  • 16:39 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1449 to wikikube-worker1112
  • 16:39 topranks: manually restarting netbox service on netbox1003 to update interface validator
  • 16:39 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1112
  • 16:38 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:38 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1461 to wikikube-worker1114
  • 16:38 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1112
  • 16:38 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:38 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1449 to wikikube-worker1112 - kamila@cumin1002"
  • 16:38 cmooney@cumin1002: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox
  • 16:37 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 16:37 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 16:37 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1449 to wikikube-worker1112 - kamila@cumin1002"
  • 16:37 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 16:37 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1460 to wikikube-worker1113
  • 16:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1448 to wikikube-worker1111
  • 16:36 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 16:35 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 16:35 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1111
  • 16:35 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 16:34 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1111
  • 16:34 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:34 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1448 to wikikube-worker1111 - kamila@cumin1002"
  • 16:34 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:34 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1448 to wikikube-worker1111 - kamila@cumin1002"
  • 16:33 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1449 to wikikube-worker1112
  • 16:29 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:28 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1448 to wikikube-worker1111
  • 16:25 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1448-1449,1460-1463].eqiad.wmnet
  • 16:22 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1448-1449,1460-1463].eqiad.wmnet
  • 16:21 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 16:21 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 16:19 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 16:19 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 16:05 ladsgroup@deploy2002: Finished scap sync-world: Backport for dbconfig: Order json output entries (duration: 18m 52s)
  • 15:58 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 15:54 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1107-1110].eqiad.wmnet
  • 15:54 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1107-1110].eqiad.wmnet
  • 15:53 ladsgroup@deploy2002: ladsgroup: Backport for dbconfig: Order json output entries synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1107.eqiad.wmnet with OS bookworm
  • 15:51 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
  • 15:51 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
  • 15:46 ladsgroup@deploy2002: Started scap sync-world: Backport for dbconfig: Order json output entries
  • 15:44 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Check known-good regex patterns directly (T380751) (duration: 14m 32s)
  • 15:37 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Continuing with sync
  • 15:36 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Check known-good regex patterns directly (T380751) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1107.eqiad.wmnet with reason: host reimage
  • 15:33 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2236-2239].codfw.wmnet
  • 15:33 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2236-2239].codfw.wmnet
  • 15:32 jelto: homer 'cr*codfw*' commit 'T377877'
  • 15:31 jelto: homer 'lsw1-c3-codfw*' commit 'T377877'
  • 15:30 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1107.eqiad.wmnet with reason: host reimage
  • 15:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2238.codfw.wmnet with OS bookworm
  • 15:29 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Check known-good regex patterns directly (T380751)
  • 15:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1110.eqiad.wmnet with OS bookworm
  • 15:20 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1109.eqiad.wmnet with OS bookworm
  • 15:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72124 and previous config saved to /var/cache/conftool/dbconfig/20250116-151950-root.json
  • 15:18 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Disable distinct-values constraint checks on Commons (T369079) (duration: 10m 51s)
  • 15:14 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1107
  • 15:14 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1107
  • 15:14 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1107.eqiad.wmnet with OS bookworm
  • 15:13 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Continuing with sync
  • 15:13 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Disable distinct-values constraint checks on Commons (T369079) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1108.eqiad.wmnet with OS bookworm
  • 15:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2238.codfw.wmnet with reason: host reimage
  • 15:07 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Disable distinct-values constraint checks on Commons (T369079)
  • 15:05 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1110.eqiad.wmnet with reason: host reimage
  • 15:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72123 and previous config saved to /var/cache/conftool/dbconfig/20250116-150446-root.json
  • 15:03 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2238.codfw.wmnet with reason: host reimage
  • 15:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1109.eqiad.wmnet with reason: host reimage
  • 14:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2239.codfw.wmnet with OS bookworm
  • 14:57 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1110.eqiad.wmnet with reason: host reimage
  • 14:57 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1109.eqiad.wmnet with reason: host reimage
  • 14:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1108.eqiad.wmnet with reason: host reimage
  • 14:51 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Add z-index to `.tux-more-notices` (T383669) (duration: 22m 26s)
  • 14:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1108.eqiad.wmnet with reason: host reimage
  • 14:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72122 and previous config saved to /var/cache/conftool/dbconfig/20250116-144940-root.json
  • 14:46 lucaswerkmeister-wmde@deploy2002: abi, lucaswerkmeister-wmde: Continuing with sync
  • 14:45 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2238
  • 14:45 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2238
  • 14:45 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2238.codfw.wmnet with OS bookworm
  • 14:45 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2238.codfw.wmnet with OS bookworm
  • 14:41 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1110
  • 14:41 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1110
  • 14:41 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1110.eqiad.wmnet with OS bookworm
  • 14:41 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1109
  • 14:41 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1109
  • 14:41 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1109.eqiad.wmnet with OS bookworm
  • 14:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2239.codfw.wmnet with reason: host reimage
  • 14:36 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2239.codfw.wmnet with reason: host reimage
  • 14:36 lucaswerkmeister-wmde@deploy2002: abi, lucaswerkmeister-wmde: Backport for Add z-index to `.tux-more-notices` (T383669) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2237.codfw.wmnet with OS bookworm
  • 14:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1108
  • 14:34 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1108
  • 14:34 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1108.eqiad.wmnet with OS bookworm
  • 14:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72121 and previous config saved to /var/cache/conftool/dbconfig/20250116-143435-root.json
  • 14:34 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1110.eqiad.wmnet with OS bookworm
  • 14:33 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1109.eqiad.wmnet with OS bookworm
  • 14:33 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1108.eqiad.wmnet with OS bookworm
  • 14:30 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1107.eqiad.wmnet on all recursors
  • 14:29 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1107.eqiad.wmnet on all recursors
  • 14:29 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Add z-index to `.tux-more-notices` (T383669)
  • 14:28 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp
  • 14:27 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Turn on Parsoid Read Views on test2wiki (T378645) (duration: 20m 22s)
  • 14:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1110
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1110
  • 14:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1109
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1109
  • 14:27 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1110.eqiad.wmnet with OS bookworm
  • 14:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1108
  • 14:26 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1108
  • 14:26 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1109.eqiad.wmnet with OS bookworm
  • 14:26 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1108.eqiad.wmnet with OS bookworm
  • 14:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2236.codfw.wmnet with OS bookworm
  • 14:23 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1107.eqiad.wmnet wikikube-worker1108.eqiad.wmnet wikikube-worker1109.eqiad.wmnet wikikube-worker1110.eqiad.wmnet on all recursors
  • 14:23 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1107.eqiad.wmnet wikikube-worker1108.eqiad.wmnet wikikube-worker1109.eqiad.wmnet wikikube-worker1110.eqiad.wmnet on all recursors
  • 14:22 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1441 to wikikube-worker1109
  • 14:22 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1109
  • 14:21 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1109
  • 14:21 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:20 lucaswerkmeister-wmde@deploy2002: ssastry, lucaswerkmeister-wmde: Continuing with sync
  • 14:18 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 14:18 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1442 to wikikube-worker1110
  • 14:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2239
  • 14:18 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2239
  • 14:18 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2239
  • 14:18 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2239.codfw.wmnet 116.32.192.10.in-addr.arpa 6.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:18 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2239.codfw.wmnet 116.32.192.10.in-addr.arpa 6.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:18 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:17 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1110
  • 14:17 lucaswerkmeister-wmde@deploy2002: ssastry, lucaswerkmeister-wmde: Backport for Turn on Parsoid Read Views on test2wiki (T378645) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:16 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1110
  • 14:16 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:16 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1442 to wikikube-worker1110 - kamila@cumin1002"
  • 14:15 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:15 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1442 to wikikube-worker1110 - kamila@cumin1002"
  • 14:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72120 and previous config saved to /var/cache/conftool/dbconfig/20250116-141530-root.json
  • 14:15 jelto@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2237.codfw.wmnet with reason: host reimage
  • 14:13 marostegui@dns1006: END - running authdns-update
  • 14:11 marostegui@dns1006: START - running authdns-update
  • 14:10 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2237.codfw.wmnet with reason: host reimage
  • 14:10 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 14:09 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp
  • 14:08 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp
  • 14:07 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1442 to wikikube-worker1110
  • 14:07 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1441 to wikikube-worker1109
  • 14:07 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1440 to wikikube-worker1108
  • 14:06 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Turn on Parsoid Read Views on test2wiki (T378645)
  • 14:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2236.codfw.wmnet with reason: host reimage
  • 14:06 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1108
  • 14:05 marostegui@cumin1002: dbctl commit (dc=all): 'Pool pc6 eqiad and codfw dbmaint T383234', diff saved to https://phabricator.wikimedia.org/P72119 and previous config saved to /var/cache/conftool/dbconfig/20250116-140523-marostegui.json
  • 14:05 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:05 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1108
  • 14:05 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:04 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2239
  • 14:04 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2239.codfw.wmnet with OS bookworm
  • 14:02 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 14:02 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2236.codfw.wmnet with reason: host reimage
  • 14:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2238
  • 14:02 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2238
  • 14:01 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2238
  • 14:01 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2238.codfw.wmnet 115.32.192.10.in-addr.arpa 5.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:01 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2238.codfw.wmnet 115.32.192.10.in-addr.arpa 5.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:01 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:01 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2238 - jelto@cumin1002"
  • 14:01 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2238 - jelto@cumin1002"
  • 13:57 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:57 jynus@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts dbprov2002.codfw.wmnet
  • 13:57 jynus@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:57 jynus@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1002"
  • 13:57 jynus@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1002"
  • 13:56 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2238
  • 13:56 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2238.codfw.wmnet with OS bookworm
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2237
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2237
  • 13:53 jynus@cumin1002: START - Cookbook sre.dns.netbox
  • 13:52 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2237
  • 13:52 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2237.codfw.wmnet 113.32.192.10.in-addr.arpa 3.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:52 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2237.codfw.wmnet 113.32.192.10.in-addr.arpa 3.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:52 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:52 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2237 - jelto@cumin1002"
  • 13:52 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2237 - jelto@cumin1002"
  • 13:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1439 to wikikube-worker1107
  • 13:49 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp
  • 13:49 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1440 to wikikube-worker1108
  • 13:49 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1107
  • 13:48 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:48 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1107
  • 13:48 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:48 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1439 to wikikube-worker1107 - kamila@cumin1002"
  • 13:48 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1439 to wikikube-worker1107 - kamila@cumin1002"
  • 13:45 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2237
  • 13:45 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2237.codfw.wmnet with OS bookworm
  • 13:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2236
  • 13:44 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2236
  • 13:44 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:44 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2236
  • 13:44 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2236.codfw.wmnet 112.32.192.10.in-addr.arpa 2.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:44 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2236.codfw.wmnet 112.32.192.10.in-addr.arpa 2.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:44 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:44 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2236 - jelto@cumin1002"
  • 13:43 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2236 - jelto@cumin1002"
  • 13:43 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1439 to wikikube-worker1107
  • 13:42 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1439-1442].eqiad.wmnet
  • 13:40 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:40 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2236
  • 13:40 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2236.codfw.wmnet with OS bookworm
  • 13:39 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1439-1442].eqiad.wmnet
  • 13:38 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2236.codfw.wmnet wikikube-worker2237.codfw.wmnet wikikube-worker2238.codfw.wmnet wikikube-worker2239.codfw.wmnet on all recursors
  • 13:38 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2236.codfw.wmnet wikikube-worker2237.codfw.wmnet wikikube-worker2238.codfw.wmnet wikikube-worker2239.codfw.wmnet on all recursors
  • 13:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2338 to wikikube-worker2239
  • 13:37 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2239
  • 13:36 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2239
  • 13:36 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:36 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2338 to wikikube-worker2239 - jelto@cumin1002"
  • 13:36 jynus@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbprov2002.codfw.wmnet
  • 13:36 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2338 to wikikube-worker2239 - jelto@cumin1002"
  • 13:35 jynus@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts dbprov2002.codfw.wmnet
  • 13:35 jynus@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 13:32 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:32 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2338 to wikikube-worker2239
  • 13:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2337 to wikikube-worker2238
  • 13:30 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2238
  • 13:29 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2238
  • 13:29 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:29 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2337 to wikikube-worker2238 - jelto@cumin1002"
  • 13:29 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2337 to wikikube-worker2238 - jelto@cumin1002"
  • 13:26 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:25 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2337 to wikikube-worker2238
  • 13:24 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2336 to wikikube-worker2237
  • 13:23 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2237
  • 13:23 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2237
  • 13:23 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:23 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2336 to wikikube-worker2237 - jelto@cumin1002"
  • 13:23 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2336 to wikikube-worker2237 - jelto@cumin1002"
  • 13:18 hashar: Upgraded Gerrit to 3.10.4
  • 13:12 hashar@deploy2002: Finished deploy [gerrit/gerrit@5c2347d]: Gerrit to 3.10.4 - T383597 (duration: 00m 10s)
  • 13:12 hashar@deploy2002: Started deploy [gerrit/gerrit@5c2347d]: Gerrit to 3.10.4 - T383597
  • 13:09 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:08 hashar@deploy2002: Finished deploy [gerrit/gerrit@5c2347d]: Gerrit to 3.10.4 - T383597 (duration: 00m 08s)
  • 13:08 hashar@deploy2002: Started deploy [gerrit/gerrit@5c2347d]: Gerrit to 3.10.4 - T383597
  • 13:08 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2336 to wikikube-worker2237
  • 13:04 jynus@cumin1002: START - Cookbook sre.dns.netbox
  • 12:56 jynus@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbprov2002.codfw.wmnet
  • 12:52 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbprov2001.codfw.wmnet
  • 12:52 jynus@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2335 to wikikube-worker2236
  • 12:50 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2236
  • 12:50 jynus@cumin1002: START - Cookbook sre.dns.netbox
  • 12:50 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2236
  • 12:50 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:50 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2335 to wikikube-worker2236 - jelto@cumin1002"
  • 12:49 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2335 to wikikube-worker2236 - jelto@cumin1002"
  • 12:45 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:45 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2335 to wikikube-worker2236
  • 12:44 marostegui@dns1006: END - running authdns-update
  • 12:42 marostegui@dns1006: START - running authdns-update
  • 12:42 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2335-2338].codfw.wmnet
  • 12:40 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2335-2338].codfw.wmnet
  • 12:39 jynus@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbprov2001.codfw.wmnet
  • 12:38 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbprov1002.eqiad.wmnet
  • 12:38 jynus@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:38 jynus@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1002"
  • 12:36 marostegui@cumin1002: dbctl commit (dc=all): 'Promote pc1014 to pc4 eqiad master dbmaint and enable pc4 back in eqiad and codfw T383398', diff saved to https://phabricator.wikimedia.org/P72118 and previous config saved to /var/cache/conftool/dbconfig/20250116-123656-marostegui.json
  • 12:33 jynus@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1002"
  • 12:32 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on pc2014.codfw.wmnet,pc[1014,1016].eqiad.wmnet with reason: reorganizing pc4
  • 12:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depool pc4 eqiad and codfw dbmaint T383398', diff saved to https://phabricator.wikimedia.org/P72117 and previous config saved to /var/cache/conftool/dbconfig/20250116-123129-marostegui.json
  • 12:29 jynus@cumin1002: START - Cookbook sre.dns.netbox
  • 12:27 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2232-2235].codfw.wmnet
  • 12:27 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2232-2235].codfw.wmnet
  • 12:24 jelto: homer 'cr*codfw*' commit 'T377877'
  • 12:23 jynus@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbprov1002.eqiad.wmnet
  • 12:23 jelto: homer 'lsw1-c6-codfw*' commit 'T377877'
  • 12:23 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbprov1001.eqiad.wmnet
  • 12:23 jynus@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:23 jynus@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1002"
  • 12:22 jynus@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1002"
  • 12:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2235.codfw.wmnet with OS bookworm
  • 12:20 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru and A:cp
  • 12:17 jynus@cumin1002: START - Cookbook sre.dns.netbox
  • 12:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2234.codfw.wmnet with OS bookworm
  • 12:09 jynus@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbprov1001.eqiad.wmnet
  • 12:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2233.codfw.wmnet with OS bookworm
  • 12:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2235.codfw.wmnet with reason: host reimage
  • 12:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2232.codfw.wmnet with OS bookworm
  • 11:56 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2235.codfw.wmnet with reason: host reimage
  • 11:54 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru and A:cp
  • 11:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2234.codfw.wmnet with reason: host reimage
  • 11:50 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2234.codfw.wmnet with reason: host reimage
  • 11:49 kart_: Updated cxserver to 2025-01-16-103443-production (T383854, T377966)
  • 11:47 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 11:46 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 11:46 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 11:46 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 11:45 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2233.codfw.wmnet with reason: host reimage
  • 11:42 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 11:42 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 11:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2232.codfw.wmnet with reason: host reimage
  • 11:40 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2233.codfw.wmnet with reason: host reimage
  • 11:38 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2232.codfw.wmnet with reason: host reimage
  • 11:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2235
  • 11:38 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2235
  • 11:38 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2235.codfw.wmnet with OS bookworm
  • 11:37 jelto@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2235.codfw.wmnet
  • 11:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2235.codfw.wmnet with OS bullseye
  • 11:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2234
  • 11:32 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2234
  • 11:32 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2234.codfw.wmnet with OS bookworm
  • 11:30 jelto@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2234.codfw.wmnet
  • 11:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2234.codfw.wmnet with OS bullseye
  • 11:29 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2149.codfw.wmnet
  • 11:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2233
  • 11:22 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2233
  • 11:22 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2149', diff saved to https://phabricator.wikimedia.org/P72114 and previous config saved to /var/cache/conftool/dbconfig/20250116-112235-marostegui.json
  • 11:22 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2233.codfw.wmnet with OS bookworm
  • 11:22 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2149.codfw.wmnet
  • 11:21 jelto@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2233.codfw.wmnet
  • 11:21 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2233.codfw.wmnet with OS bullseye
  • 11:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2232
  • 11:20 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2232
  • 11:20 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2232.codfw.wmnet with OS bookworm
  • 11:19 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2232.codfw.wmnet with OS bookworm
  • 11:19 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2232.codfw.wmnet with OS bookworm
  • 11:17 jelto@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2232.codfw.wmnet
  • 11:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2235.codfw.wmnet with reason: host reimage
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2232.codfw.wmnet with OS bullseye
  • 11:13 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2235.codfw.wmnet with reason: host reimage
  • 11:10 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2234.codfw.wmnet with reason: host reimage
  • 11:06 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2234.codfw.wmnet with reason: host reimage
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'es1045 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72113 and previous config saved to /var/cache/conftool/dbconfig/20250116-110538-root.json
  • 11:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2233.codfw.wmnet with reason: host reimage
  • 11:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2132.codfw.wmnet
  • 11:01 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:58 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 10:58 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2235
  • 10:58 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2235
  • 10:58 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2235
  • 10:57 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2235.codfw.wmnet 203.32.192.10.in-addr.arpa 3.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:57 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2235.codfw.wmnet 203.32.192.10.in-addr.arpa 3.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:57 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:57 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2235 - jelto@cumin1002"
  • 10:57 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2233.codfw.wmnet with reason: host reimage
  • 10:57 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2235 - jelto@cumin1002"
  • 10:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2232.codfw.wmnet with reason: host reimage
  • 10:53 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2132.codfw.wmnet
  • 10:53 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:53 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2235
  • 10:53 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2235.codfw.wmnet with OS bullseye
  • 10:53 jelto@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2235.codfw.wmnet
  • 10:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2234
  • 10:50 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2234
  • 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'es1045 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72112 and previous config saved to /var/cache/conftool/dbconfig/20250116-105032-root.json
  • 10:50 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2232.codfw.wmnet with reason: host reimage
  • 10:50 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2234
  • 10:50 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2234.codfw.wmnet 202.32.192.10.in-addr.arpa 2.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:50 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2234.codfw.wmnet 202.32.192.10.in-addr.arpa 2.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:50 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:49 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2234 - jelto@cumin1002"
  • 10:49 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2234 - jelto@cumin1002"
  • 10:46 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:46 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2234
  • 10:46 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2234.codfw.wmnet with OS bullseye
  • 10:45 jelto@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2234.codfw.wmnet
  • 10:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2233
  • 10:42 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2233
  • 10:42 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2233
  • 10:42 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2233.codfw.wmnet 201.32.192.10.in-addr.arpa 1.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:42 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2233.codfw.wmnet 201.32.192.10.in-addr.arpa 1.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:42 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:42 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2233 - jelto@cumin1002"
  • 10:41 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2233 - jelto@cumin1002"
  • 10:38 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:38 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2233
  • 10:38 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2233.codfw.wmnet with OS bullseye
  • 10:37 jelto@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2233.codfw.wmnet
  • 10:35 marostegui@cumin1002: dbctl commit (dc=all): 'es1045 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72111 and previous config saved to /var/cache/conftool/dbconfig/20250116-103527-root.json
  • 10:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2232
  • 10:34 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2232
  • 10:34 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2232
  • 10:34 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2232.codfw.wmnet 200.32.192.10.in-addr.arpa 0.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:34 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2232.codfw.wmnet 200.32.192.10.in-addr.arpa 0.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:34 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:34 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2232 - jelto@cumin1002"
  • 10:34 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2232 - jelto@cumin1002"
  • 10:31 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:30 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2232
  • 10:30 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2232.codfw.wmnet with OS bullseye
  • 10:30 jelto@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2232.codfw.wmnet
  • 10:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 10:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 10:26 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2232.codfw.wmnet wikikube-worker2233.codfw.wmnet wikikube-worker2234.codfw.wmnet wikikube-worker2235.codfw.wmnet on all recursors
  • 10:26 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2232.codfw.wmnet wikikube-worker2233.codfw.wmnet wikikube-worker2234.codfw.wmnet wikikube-worker2235.codfw.wmnet on all recursors
  • 10:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2353 to wikikube-worker2235
  • 10:25 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2235
  • 10:25 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2235
  • 10:25 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:25 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2353 to wikikube-worker2235 - jelto@cumin1002"
  • 10:25 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2353 to wikikube-worker2235 - jelto@cumin1002"
  • 10:21 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:20 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2353 to wikikube-worker2235
  • 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'es1045 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72110 and previous config saved to /var/cache/conftool/dbconfig/20250116-102021-root.json
  • 10:19 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=99) from mw2353 to wikikube-worker2235
  • 10:19 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2353 to wikikube-worker2235
  • 10:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2352 to wikikube-worker2234
  • 10:17 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2234
  • 10:17 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2234
  • 10:17 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:17 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2352 to wikikube-worker2234 - jelto@cumin1002"
  • 10:17 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2352 to wikikube-worker2234 - jelto@cumin1002"
  • 10:13 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:13 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2352 to wikikube-worker2234
  • 10:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2351 to wikikube-worker2233
  • 10:11 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2233
  • 10:11 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2233
  • 10:11 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:11 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2351 to wikikube-worker2233 - jelto@cumin1002"
  • 10:10 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2351 to wikikube-worker2233 - jelto@cumin1002"
  • 10:07 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:06 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2351 to wikikube-worker2233
  • 10:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2350 to wikikube-worker2232
  • 10:05 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2232
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'es1045 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72109 and previous config saved to /var/cache/conftool/dbconfig/20250116-100516-root.json
  • 10:05 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2232
  • 10:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:05 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2350 to wikikube-worker2232 - jelto@cumin1002"
  • 10:04 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2350 to wikikube-worker2232 - jelto@cumin1002"
  • 10:01 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 10:01 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:00 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2350 to wikikube-worker2232
  • 10:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 09:58 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2350-2353].codfw.wmnet
  • 09:56 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2350-2353].codfw.wmnet
  • 09:54 marostegui@cumin1002: dbctl commit (dc=all): 'Promote pc2015 to pc5 codfw master dbmaint and enable pc5 back in eqiad and codfw T383398', diff saved to https://phabricator.wikimedia.org/P72107 and previous config saved to /var/cache/conftool/dbconfig/20250116-095431-marostegui.json
  • 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'es1045 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P72106 and previous config saved to /var/cache/conftool/dbconfig/20250116-095011-root.json
  • 09:45 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on pc[2015,2017].codfw.wmnet,pc[1015,1017].eqiad.wmnet with reason: reorganizing pc5
  • 09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depool pc5 eqiad and codfw dbmaint T383398', diff saved to https://phabricator.wikimedia.org/P72104 and previous config saved to /var/cache/conftool/dbconfig/20250116-094439-marostegui.json
  • 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'es1045 (re)pooling @ 4%: Repooling', diff saved to https://phabricator.wikimedia.org/P72103 and previous config saved to /var/cache/conftool/dbconfig/20250116-093505-root.json
  • 09:31 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72102 and previous config saved to /var/cache/conftool/dbconfig/20250116-093113-root.json
  • 09:29 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2228-2231].codfw.wmnet
  • 09:29 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2228-2231].codfw.wmnet
  • 09:27 jelto: homer 'cr*codfw*' commit 'T377877'
  • 09:26 jelto: homer 'lsw1-c6-codfw*' commit 'T377877'
  • 09:25 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2231.codfw.wmnet with OS bookworm
  • 09:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2230.codfw.wmnet with OS bookworm
  • 09:20 marostegui@cumin1002: dbctl commit (dc=all): 'es1045 (re)pooling @ 3%: Repooling', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20250116-091956-root.json
  • 09:16 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72100 and previous config saved to /var/cache/conftool/dbconfig/20250116-091607-root.json
  • 09:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2229.codfw.wmnet with OS bookworm
  • 09:12 hashar: UTC morning backport window completed.
  • 09:08 hashar@deploy2002: Finished scap sync-world: Backport for Add dso and thq to wmgExtraLanguageNames (T383785) (duration: 12m 46s)
  • 09:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2231.codfw.wmnet with reason: host reimage
  • 09:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2228.codfw.wmnet with OS bookworm
  • 09:04 marostegui@cumin1002: dbctl commit (dc=all): 'es1045 (re)pooling @ 2%: Repooling', diff saved to https://phabricator.wikimedia.org/P72099 and previous config saved to /var/cache/conftool/dbconfig/20250116-090451-root.json
  • 09:03 hashar@deploy2002: anzx, hashar: Continuing with sync
  • 09:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2230.codfw.wmnet with reason: host reimage
  • 09:03 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2231.codfw.wmnet with reason: host reimage
  • 09:01 hashar@deploy2002: anzx, hashar: Backport for Add dso and thq to wmgExtraLanguageNames (T383785) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:01 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72098 and previous config saved to /var/cache/conftool/dbconfig/20250116-090102-root.json
  • 08:58 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2230.codfw.wmnet with reason: host reimage
  • 08:55 hashar@deploy2002: Started scap sync-world: Backport for Add dso and thq to wmgExtraLanguageNames (T383785)
  • 08:55 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es1025.eqiad.wmnet with reason: cloning
  • 08:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1025', diff saved to https://phabricator.wikimedia.org/P72097 and previous config saved to /var/cache/conftool/dbconfig/20250116-085439-marostegui.json
  • 08:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2229.codfw.wmnet with reason: host reimage
  • 08:53 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1023 to es5 primary T382569', diff saved to https://phabricator.wikimedia.org/P72096 and previous config saved to /var/cache/conftool/dbconfig/20250116-085305-root.json
  • 08:51 hashar@deploy2002: Finished scap sync-world: Backport for mediawiki.widgets: Fix aborting TitleWidget request breaking it permanently (T383497), mediawiki.widgets: Fix aborting TitleWidget request breaking it permanently (T383497) (duration: 14m 53s)
  • 08:50 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2229.codfw.wmnet with reason: host reimage
  • 08:49 marostegui@cumin1002: dbctl commit (dc=all): 'es1045 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P72095 and previous config saved to /var/cache/conftool/dbconfig/20250116-084946-root.json
  • 08:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2231
  • 08:45 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2231
  • 08:45 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72094 and previous config saved to /var/cache/conftool/dbconfig/20250116-084557-root.json
  • 08:45 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2231
  • 08:45 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2231.codfw.wmnet 207.32.192.10.in-addr.arpa 7.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:45 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2228.codfw.wmnet with reason: host reimage
  • 08:45 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2231.codfw.wmnet 207.32.192.10.in-addr.arpa 7.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:45 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:45 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2231 - jelto@cumin1002"
  • 08:45 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2231 - jelto@cumin1002"
  • 08:42 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:41 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2231
  • 08:41 hashar@deploy2002: matmarex, hashar: Continuing with sync
  • 08:41 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2231.codfw.wmnet with OS bookworm
  • 08:41 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2228.codfw.wmnet with reason: host reimage
  • 08:40 hashar@deploy2002: matmarex, hashar: Backport for mediawiki.widgets: Fix aborting TitleWidget request breaking it permanently (T383497), mediawiki.widgets: Fix aborting TitleWidget request breaking it permanently (T383497) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:40 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2230
  • 08:40 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2230
  • 08:39 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2230
  • 08:39 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2230.codfw.wmnet 206.32.192.10.in-addr.arpa 6.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:39 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2230.codfw.wmnet 206.32.192.10.in-addr.arpa 6.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:39 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:39 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2230 - jelto@cumin1002"
  • 08:39 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2230 - jelto@cumin1002"
  • 08:36 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:36 hashar@deploy2002: Started scap sync-world: Backport for mediawiki.widgets: Fix aborting TitleWidget request breaking it permanently (T383497), mediawiki.widgets: Fix aborting TitleWidget request breaking it permanently (T383497)
  • 08:36 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2230
  • 08:36 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2230.codfw.wmnet with OS bookworm
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Add es1045 to dbctl depooled T382569', diff saved to https://phabricator.wikimedia.org/P72093 and previous config saved to /var/cache/conftool/dbconfig/20250116-083559-marostegui.json
  • 08:32 oblivian@deploy2002: Finished scap sync-world: Backport for ClusterConfig: add support for dumps trait (T382947) (duration: 12m 30s)
  • 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2229
  • 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2229
  • 08:32 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2229
  • 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2229.codfw.wmnet 205.32.192.10.in-addr.arpa 5.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:32 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2229.codfw.wmnet 205.32.192.10.in-addr.arpa 5.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2229 - jelto@cumin1002"
  • 08:32 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2229 - jelto@cumin1002"
  • 08:30 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72092 and previous config saved to /var/cache/conftool/dbconfig/20250116-083051-root.json
  • 08:28 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:28 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2229
  • 08:28 oblivian@deploy2002: oblivian: Continuing with sync
  • 08:28 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2229.codfw.wmnet with OS bookworm
  • 08:26 oblivian@deploy2002: oblivian: Backport for ClusterConfig: add support for dumps trait (T382947) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2228
  • 08:23 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2228
  • 08:23 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2228
  • 08:23 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2228.codfw.wmnet 204.32.192.10.in-addr.arpa 4.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:23 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2228.codfw.wmnet 204.32.192.10.in-addr.arpa 4.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:23 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:23 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2228 - jelto@cumin1002"
  • 08:23 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2228 - jelto@cumin1002"
  • 08:20 oblivian@deploy2002: Started scap sync-world: Backport for ClusterConfig: add support for dumps trait (T382947)
  • 08:20 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:19 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2228
  • 08:19 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2228.codfw.wmnet with OS bookworm
  • 08:17 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2228.codfw.wmnet wikikube-worker2229.codfw.wmnet wikikube-worker2230.codfw.wmnet wikikube-worker2231.codfw.wmnet on all recursors
  • 08:17 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2228.codfw.wmnet wikikube-worker2229.codfw.wmnet wikikube-worker2230.codfw.wmnet wikikube-worker2231.codfw.wmnet on all recursors
  • 08:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2357 to wikikube-worker2231
  • 08:16 oblivian@deploy2002: Finished scap sync-world: Backport for Explicitly disable all local imagescaling on k8s (T352515) (duration: 12m 35s)
  • 08:16 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2231
  • 08:16 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2231
  • 08:16 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:16 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2357 to wikikube-worker2231 - jelto@cumin1002"
  • 08:15 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2357 to wikikube-worker2231 - jelto@cumin1002"
  • 08:12 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:12 oblivian@deploy2002: oblivian: Continuing with sync
  • 08:12 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2357 to wikikube-worker2231
  • 08:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2356 to wikikube-worker2230
  • 08:10 oblivian@deploy2002: oblivian: Backport for Explicitly disable all local imagescaling on k8s (T352515) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:10 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2230
  • 08:10 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2230
  • 08:10 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:10 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2356 to wikikube-worker2230 - jelto@cumin1002"
  • 08:10 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2356 to wikikube-worker2230 - jelto@cumin1002"
  • 08:06 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:06 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2356 to wikikube-worker2230
  • 08:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2355 to wikikube-worker2229
  • 08:04 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2229
  • 08:04 oblivian@deploy2002: Started scap sync-world: Backport for Explicitly disable all local imagescaling on k8s (T352515)
  • 08:04 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2229
  • 08:04 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:04 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2355 to wikikube-worker2229 - jelto@cumin1002"
  • 08:03 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2355 to wikikube-worker2229 - jelto@cumin1002"
  • 08:00 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 07:59 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2355 to wikikube-worker2229
  • 07:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2354 to wikikube-worker2228
  • 07:58 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2228
  • 07:58 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2228
  • 07:58 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:58 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2354 to wikikube-worker2228 - jelto@cumin1002"
  • 07:57 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2354 to wikikube-worker2228 - jelto@cumin1002"
  • 07:54 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 07:53 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2354 to wikikube-worker2228
  • 07:47 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2354-2357].codfw.wmnet
  • 07:45 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2354-2357].codfw.wmnet
  • 07:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 07:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 03:21 eileen: civicrm upgraded from 4206a253 to c671fd23
  • 01:01 eileen: config revision changed from 0f227a32 to d7fbb972
  • 00:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T370903)', diff saved to https://phabricator.wikimedia.org/P72091 and previous config saved to /var/cache/conftool/dbconfig/20250116-002137-ladsgroup.json
  • 00:13 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 00:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P72090 and previous config saved to /var/cache/conftool/dbconfig/20250116-000630-ladsgroup.json

2025-01-15

  • 23:52 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
  • 23:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P72089 and previous config saved to /var/cache/conftool/dbconfig/20250115-235123-ladsgroup.json
  • 23:38 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 23:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T370903)', diff saved to https://phabricator.wikimedia.org/P72088 and previous config saved to /var/cache/conftool/dbconfig/20250115-233617-ladsgroup.json
  • 23:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T370903)', diff saved to https://phabricator.wikimedia.org/P72087 and previous config saved to /var/cache/conftool/dbconfig/20250115-232737-ladsgroup.json
  • 23:27 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 22:49 cwhite: restarted thanos-query-frontend on titan100[12]
  • 22:45 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
  • 22:41 ryankemper@cumin2002: END (ERROR) - Cookbook sre.wdqs.restart (exit_code=97)
  • 22:36 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
  • 22:35 ryankemper@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 22:35 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
  • 22:18 dmartin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 22:17 dmartin@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 22:17 dmartin@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 22:15 dmartin@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 22:12 dmartin@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 22:10 dmartin@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 21:40 tzatziki: Removing 10 files for legal compliance
  • 21:22 samtar@deploy2002: Finished scap sync-world: Backport for do not allow temp users to edit on loginwiki (T383747) (duration: 09m 34s)
  • 21:19 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: sync
  • 21:19 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: sync
  • 21:18 otto@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: sync
  • 21:18 samtar@deploy2002: samtar, jjmc89: Continuing with sync
  • 21:17 samtar@deploy2002: samtar, jjmc89: Backport for do not allow temp users to edit on loginwiki (T383747) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:17 otto@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: sync
  • 21:17 tzatziki: Removing 11 files for legal compliance
  • 21:16 otto@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: sync
  • 21:16 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: sync
  • 21:16 ottomata: roll restarting eventgate-analytics to pick up new stream configuration for haproxy_requestctl
  • 21:13 samtar@deploy2002: Started scap sync-world: Backport for do not allow temp users to edit on loginwiki (T383747)
  • 20:54 tzatziki: Removing 1 file for legal compliance
  • 20:31 tzatziki: Removing 1 file for legal compliance
  • 20:28 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 20:20 tzatziki: Removing 2 files for legal compliance
  • 20:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T371742)', diff saved to https://phabricator.wikimedia.org/P72085 and previous config saved to /var/cache/conftool/dbconfig/20250115-201152-ladsgroup.json
  • 20:01 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.12 refs T382363
  • 19:59 tzatziki: Removing 1 file for legal compliance
  • 19:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P72083 and previous config saved to /var/cache/conftool/dbconfig/20250115-195645-ladsgroup.json
  • 19:55 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 19:54 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 19:46 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 19:46 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 19:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P72082 and previous config saved to /var/cache/conftool/dbconfig/20250115-194138-ladsgroup.json
  • 19:39 brennen@deploy2002: Finished scap sync-world: Backport for RevisionStore: No first revision of non-existing page (T380677) (duration: 17m 00s)
  • 19:34 brennen@deploy2002: bpirkle, brennen: Continuing with sync
  • 19:29 brennen@deploy2002: bpirkle, brennen: Backport for RevisionStore: No first revision of non-existing page (T380677) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T371742)', diff saved to https://phabricator.wikimedia.org/P72081 and previous config saved to /var/cache/conftool/dbconfig/20250115-192631-ladsgroup.json
  • 19:22 brennen@deploy2002: Started scap sync-world: Backport for RevisionStore: No first revision of non-existing page (T380677)
  • 19:21 aqu@deploy2002: Finished deploy [airflow-dags/analytics@cd03eb7]: Cascading backfill under projectview hourly (duration: 01m 06s)
  • 19:20 aqu@deploy2002: Started deploy [airflow-dags/analytics@cd03eb7]: Cascading backfill under projectview hourly
  • 19:15 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 19:15 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 19:08 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 19:08 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 19:07 sukhe@cumin1002: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: depool site ulsfo [reason: this is a test, not actual depool, no task ID specified]
  • 19:07 sukhe@cumin1002: START - Cookbook sre.dns.admin DNS admin: depool site ulsfo [reason: this is a test, not actual depool, no task ID specified]
  • 18:51 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 18:50 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 18:22 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 18:22 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 18:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T371742)', diff saved to https://phabricator.wikimedia.org/P72080 and previous config saved to /var/cache/conftool/dbconfig/20250115-181629-ladsgroup.json
  • 18:16 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 18:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 18:09 volans@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on sretest1002.eqiad.wmnet with reason: testing cookbook
  • 18:06 volans@cumin2002: START - Cookbook sre.hosts.downtime for 0:05:00 on sretest1002.eqiad.wmnet with reason: testing cookbook
  • 18:05 volans@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on sretest1001.eqiad.wmnet with reason: testing cookbook
  • 17:59 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Added new stream config for haproxy_requestctl (T383392) (duration: 13m 38s)
  • 17:58 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 17:56 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host cloudcephosd1012.eqiad.wmnet
  • 17:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72079 and previous config saved to /var/cache/conftool/dbconfig/20250115-175456-root.json
  • 17:53 cmooney@cumin1002: START - Cookbook sre.hosts.dhcp for host cloudcephosd1012.eqiad.wmnet
  • 17:53 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1102-1106].eqiad.wmnet
  • 17:53 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1102-1106].eqiad.wmnet
  • 17:52 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, fabfur: Continuing with sync
  • 17:52 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, fabfur: Backport for Added new stream config for haproxy_requestctl (T383392) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1103.eqiad.wmnet with OS bookworm
  • 17:51 kamila_: homer cr*eqiad* commit T377876
  • 17:49 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1102.eqiad.wmnet with OS bookworm
  • 17:47 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1105.eqiad.wmnet with OS bookworm
  • 17:45 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Added new stream config for haproxy_requestctl (T383392)
  • 17:43 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1106.eqiad.wmnet with OS bookworm
  • 17:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72077 and previous config saved to /var/cache/conftool/dbconfig/20250115-173951-root.json
  • 17:36 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 17:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1103.eqiad.wmnet with reason: host reimage
  • 17:30 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1102.eqiad.wmnet with reason: host reimage
  • 17:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1105.eqiad.wmnet with reason: host reimage
  • 17:26 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 17:26 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 17:25 root@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for dbprov1006.eqiad.wmnet: Renew puppet certificate - root@cumin1002
  • 17:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72076 and previous config saved to /var/cache/conftool/dbconfig/20250115-172445-root.json
  • 17:24 hnowlan: running `decommission` for 5 codfw jobrunners
  • 17:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1106.eqiad.wmnet with reason: host reimage
  • 17:22 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1103.eqiad.wmnet with reason: host reimage
  • 17:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1105.eqiad.wmnet with reason: host reimage
  • 17:21 root@cumin1002: START - Cookbook sre.puppet.renew-cert for dbprov1006.eqiad.wmnet: Renew puppet certificate - root@cumin1002
  • 17:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1102.eqiad.wmnet with reason: host reimage
  • 17:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1106.eqiad.wmnet with reason: host reimage
  • 17:18 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 17:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72075 and previous config saved to /var/cache/conftool/dbconfig/20250115-170940-root.json
  • 17:06 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1103
  • 17:06 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1103
  • 17:06 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1103.eqiad.wmnet with OS bookworm
  • 17:05 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 17:05 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1102
  • 17:05 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1102
  • 17:05 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1102.eqiad.wmnet with OS bookworm
  • 17:05 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1105
  • 17:05 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1105
  • 17:05 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1106
  • 17:05 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1106
  • 17:05 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1106.eqiad.wmnet with OS bookworm
  • 17:05 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1105.eqiad.wmnet with OS bookworm
  • 17:04 andrew@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 17:02 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1102.eqiad.wmnet with OS bookworm
  • 17:02 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1106.eqiad.wmnet with OS bookworm
  • 17:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1105.eqiad.wmnet with OS bookworm
  • 17:01 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1103.eqiad.wmnet with OS bookworm
  • 16:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1104.eqiad.wmnet with OS bookworm
  • 16:58 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 16:56 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bookworm
  • 16:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72074 and previous config saved to /var/cache/conftool/dbconfig/20250115-165434-root.json
  • 16:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1106
  • 16:53 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1106
  • 16:53 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1106.eqiad.wmnet with OS bookworm
  • 16:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1103
  • 16:53 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1103
  • 16:53 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1106.eqiad.wmnet with OS bookworm
  • 16:53 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1103.eqiad.wmnet with OS bookworm
  • 16:53 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1103.eqiad.wmnet with OS bookworm
  • 16:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1102
  • 16:52 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1102
  • 16:52 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1102.eqiad.wmnet with OS bookworm
  • 16:52 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1102.eqiad.wmnet with OS bookworm
  • 16:43 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1105.eqiad.wmnet with reason: host reimage
  • 16:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1104.eqiad.wmnet with reason: host reimage
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1105.eqiad.wmnet with reason: host reimage
  • 16:36 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1104.eqiad.wmnet with reason: host reimage
  • 16:21 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1106
  • 16:21 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1106
  • 16:21 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1106.eqiad.wmnet with OS bookworm
  • 16:21 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1105
  • 16:21 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1105
  • 16:21 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1105.eqiad.wmnet with OS bookworm
  • 16:21 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1104
  • 16:21 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1104
  • 16:20 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1104.eqiad.wmnet with OS bookworm
  • 16:20 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1103
  • 16:20 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1103
  • 16:20 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1103.eqiad.wmnet with OS bookworm
  • 16:20 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1102
  • 16:20 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1102
  • 16:20 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1102.eqiad.wmnet with OS bookworm
  • 16:20 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2192.codfw.wmnet
  • 16:20 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2192.codfw.wmnet
  • 16:17 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1102.eqiad.wmnet wikikube-worker1103.eqiad.wmnet wikikube-worker1104.eqiad.wmnet wikikube-worker1105.eqiad.wmnet wikikube-worker1106.eqiad.wmnet on all recursors
  • 16:17 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1102.eqiad.wmnet wikikube-worker1103.eqiad.wmnet wikikube-worker1104.eqiad.wmnet wikikube-worker1105.eqiad.wmnet wikikube-worker1106.eqiad.wmnet on all recursors
  • 16:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1425 to wikikube-worker1106
  • 16:17 jelto: homer 'lsw1-d8-codfw*' commit 'T377877'
  • 16:16 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1106
  • 16:16 volans@cumin2002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
  • 16:16 volans@cumin2002: START - Cookbook sre.dns.netbox
  • 16:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2192.codfw.wmnet with OS bookworm
  • 16:15 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jelto@cumin1002"
  • 16:14 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jelto@cumin1002"
  • 16:14 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1106
  • 16:14 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:14 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1425 to wikikube-worker1106 - kamila@cumin1002"
  • 16:13 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1425 to wikikube-worker1106 - kamila@cumin1002"
  • 16:09 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:09 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1424 to wikikube-worker1105
  • 16:08 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1425 to wikikube-worker1106
  • 16:08 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1105
  • 16:07 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1105
  • 16:07 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:07 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1424 to wikikube-worker1105 - kamila@cumin1002"
  • 16:06 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1424 to wikikube-worker1105 - kamila@cumin1002"
  • 16:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1423 to wikikube-worker1104
  • 16:04 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1104
  • 16:02 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:02 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1104
  • 16:02 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:02 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1423 to wikikube-worker1104 - kamila@cumin1002"
  • 16:02 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1423 to wikikube-worker1104 - kamila@cumin1002"
  • 16:01 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1424 to wikikube-worker1105
  • 16:00 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd2004-dev.codfw.wmnet with OS bullseye
  • 16:00 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1422 to wikikube-worker1103
  • 16:00 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov1006.eqiad.wmnet with reason: os upgrade
  • 15:59 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov1006.eqiad.wmnet with reason: os upgrade
  • 15:59 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1103
  • 15:58 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:58 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1103
  • 15:58 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:58 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1422 to wikikube-worker1103 - kamila@cumin1002"
  • 15:58 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1422 to wikikube-worker1103 - kamila@cumin1002"
  • 15:57 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1423 to wikikube-worker1104
  • 15:54 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2130.codfw.wmnet
  • 15:54 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2192.codfw.wmnet with reason: host reimage
  • 15:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1421 to wikikube-worker1102
  • 15:53 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1102
  • 15:53 moritzm: installing libsoup2.4 security updates
  • 15:52 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1422 to wikikube-worker1103
  • 15:51 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 15:51 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1102
  • 15:51 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:51 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1421 to wikikube-worker1102 - kamila@cumin1002"
  • 15:51 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1421 to wikikube-worker1102 - kamila@cumin1002"
  • 15:50 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2192.codfw.wmnet with reason: host reimage
  • 15:48 root@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for dbprov1005.eqiad.wmnet: Renew puppet certificate - root@cumin1002
  • 15:47 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2130.codfw.wmnet
  • 15:46 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:46 root@cumin1002: START - Cookbook sre.puppet.renew-cert for dbprov1005.eqiad.wmnet: Renew puppet certificate - root@cumin1002
  • 15:46 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1421 to wikikube-worker1102
  • 15:46 root@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for dbprov1004.eqiad.wmnet: Renew puppet certificate - root@cumin1002
  • 15:43 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1421-1425].eqiad.wmnet
  • 15:43 root@cumin1002: START - Cookbook sre.puppet.renew-cert for dbprov1004.eqiad.wmnet: Renew puppet certificate - root@cumin1002
  • 15:40 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1421-1425].eqiad.wmnet
  • 15:36 moritzm: installing python-django security updates
  • 15:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 15:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 15:30 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2224-2227].codfw.wmnet
  • 15:30 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2224-2227].codfw.wmnet
  • 15:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2192
  • 15:30 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2192
  • 15:30 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2192.codfw.wmnet with OS bookworm
  • 15:28 volans: uploaded spicerack_9.1.0 to apt.wikimedia.org bullseye-wikimedia
  • 15:26 jelto: homer 'cr*codfw*' commit 'T377877'
  • 15:25 jelto: homer 'lsw1-c6-codfw*' commit 'T377877'
  • 15:23 jelto: homer 'lsw1-d3-codfw*' commit 'T377877'
  • 15:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2225.codfw.wmnet with OS bookworm
  • away: UTC afternoon deploys done
  • 15:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2226.codfw.wmnet with OS bookworm
  • 15:13 tgr@deploy2002: Finished scap sync-world: Backport for Enable SUL3 on test wikis, second attempt (T383729) (duration: 27m 56s)
  • 15:13 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov1005.eqiad.wmnet with reason: os upgrade
  • 15:12 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov1005.eqiad.wmnet with reason: os upgrade
  • 15:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2224.codfw.wmnet with OS bookworm
  • 15:09 tgr@deploy2002: tgr: Continuing with sync
  • 15:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2225.codfw.wmnet with reason: host reimage
  • 14:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2226.codfw.wmnet with reason: host reimage
  • 14:52 tgr@deploy2002: tgr: Backport for Enable SUL3 on test wikis, second attempt (T383729) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2224.codfw.wmnet with reason: host reimage
  • 14:48 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2225.codfw.wmnet with reason: host reimage
  • 14:48 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2226.codfw.wmnet with reason: host reimage
  • 14:48 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2224.codfw.wmnet with reason: host reimage
  • 14:45 tgr@deploy2002: Started scap sync-world: Backport for Enable SUL3 on test wikis, second attempt (T383729)
  • 14:43 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Yet more authentication domain overrides (T383729 T373737), Move Beta Cluster favicons to this repository, Replace favicon.php with static.php (T374997) (duration: 15m 52s)
  • 14:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2227.codfw.wmnet with OS bookworm
  • 14:40 aqu: Deployed refinery using scap, then deployed onto hdfs
  • 14:38 lucaswerkmeister-wmde@deploy2002: tgr, matmarex, lucaswerkmeister-wmde: Continuing with sync
  • 14:36 aqu@deploy2002: Finished deploy [analytics/refinery@eee2443] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eee24435] (duration: 00m 28s)
  • 14:35 aqu@deploy2002: Started deploy [analytics/refinery@eee2443] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eee24435]
  • 14:35 aqu@deploy2002: Finished deploy [analytics/refinery@eee2443] (thin): Regular analytics weekly train THIN [analytics/refinery@eee24435] (duration: 00m 33s)
  • 14:35 lucaswerkmeister-wmde@deploy2002: tgr, matmarex, lucaswerkmeister-wmde: Backport for Yet more authentication domain overrides (T383729 T373737), Move Beta Cluster favicons to this repository, Replace favicon.php with static.php (T374997) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:34 aqu@deploy2002: Started deploy [analytics/refinery@eee2443] (thin): Regular analytics weekly train THIN [analytics/refinery@eee24435]
  • 14:34 aqu@deploy2002: Finished deploy [analytics/refinery@eee2443]: Regular analytics weekly train [analytics/refinery@eee24435] (duration: 01m 58s)
  • 14:32 aqu@deploy2002: Started deploy [analytics/refinery@eee2443]: Regular analytics weekly train [analytics/refinery@eee24435]
  • 14:32 aqu: Deploying analytics/refinery
  • 14:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2226
  • 14:31 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2226
  • 14:31 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2226.codfw.wmnet with OS bookworm
  • 14:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2225
  • 14:30 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2225
  • 14:30 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2225.codfw.wmnet with OS bookworm
  • 14:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2224
  • 14:30 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2224
  • 14:30 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2224.codfw.wmnet with OS bookworm
  • 14:30 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2226.codfw.wmnet with OS bookworm
  • 14:30 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2225.codfw.wmnet with OS bookworm
  • 14:29 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2224.codfw.wmnet with OS bookworm
  • 14:27 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Yet more authentication domain overrides (T383729 T373737), Move Beta Cluster favicons to this repository, Replace favicon.php with static.php (T374997)
  • 14:24 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Add entry point names to all entry points under w/ (T383729) (duration: 16m 48s)
  • 14:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2227.codfw.wmnet with reason: host reimage
  • 14:18 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2227.codfw.wmnet with reason: host reimage
  • 14:18 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, tgr: Continuing with sync
  • 14:17 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 14:16 jiji@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 14:14 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, tgr: Backport for Add entry point names to all entry points under w/ (T383729) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:14 dcausse: repooling wdqs1013
  • 14:08 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Add entry point names to all entry points under w/ (T383729)
  • 14:02 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 14:02 jiji@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 14:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2227
  • 14:00 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2227
  • 14:00 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2227
  • 14:00 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2227.codfw.wmnet 18.48.192.10.in-addr.arpa 8.1.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:00 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2227.codfw.wmnet 18.48.192.10.in-addr.arpa 8.1.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:00 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:00 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2227 - jelto@cumin1002"
  • 14:00 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2227 - jelto@cumin1002"
  • 13:56 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:56 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2227
  • 13:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2226
  • 13:56 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2226
  • 13:55 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2226
  • 13:55 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2226.codfw.wmnet 17.48.192.10.in-addr.arpa 7.1.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:55 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2226.codfw.wmnet 17.48.192.10.in-addr.arpa 7.1.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:55 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:55 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2226 - jelto@cumin1002"
  • 13:55 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2226 - jelto@cumin1002"
  • 13:53 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:53 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:52 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2227.codfw.wmnet with OS bookworm
  • 13:51 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:51 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2226
  • 13:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2225
  • 13:51 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2225
  • 13:51 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2225
  • 13:51 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2225.codfw.wmnet 15.48.192.10.in-addr.arpa 5.1.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:51 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2225.codfw.wmnet 15.48.192.10.in-addr.arpa 5.1.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:51 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:51 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2225 - jelto@cumin1002"
  • 13:51 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2225 - jelto@cumin1002"
  • 13:48 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2226.codfw.wmnet with OS bookworm
  • 13:47 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2224
  • 13:47 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2224
  • 13:47 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:47 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2224
  • 13:47 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2224.codfw.wmnet 209.32.192.10.in-addr.arpa 9.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:47 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2224.codfw.wmnet 209.32.192.10.in-addr.arpa 9.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:47 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:47 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2224 - jelto@cumin1002"
  • 13:47 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2224 - jelto@cumin1002"
  • 13:47 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2225
  • 13:46 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2225.codfw.wmnet with OS bookworm
  • 13:44 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:43 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2224
  • 13:43 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2224.codfw.wmnet with OS bookworm
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2224.codfw.wmnet wikikube-worker2225.codfw.wmnet wikikube-worker2226.codfw.wmnet wikikube-worker2227.codfw.wmnet on all recursors
  • 13:40 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2224.codfw.wmnet wikikube-worker2225.codfw.wmnet wikikube-worker2226.codfw.wmnet wikikube-worker2227.codfw.wmnet on all recursors
  • 13:39 dcausse: errata: depooling wdqs1013 (high lag)
  • 13:39 dcausse: depooling wdqs1003 (high lag)
  • 13:36 moritzm: imported firmware-amd-graphics 20230210-5~bpo11+1 into component/amd-gpu-firmware for bullseye-wikimedia T383557
  • 13:31 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov1004.eqiad.wmnet with reason: os upgrade
  • 13:30 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov1004.eqiad.wmnet with reason: os upgrade
  • 13:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2368 to wikikube-worker2227
  • 13:30 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2227
  • 13:29 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2227
  • 13:29 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:29 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2368 to wikikube-worker2227 - jelto@cumin1002"
  • 13:22 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 13:22 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 13:21 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 13:21 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 13:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1163.eqiad.wmnet with reason: maintenance
  • 13:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1163.eqiad.wmnet with reason: maintenance
  • 13:09 elukey: restart wdqs-blazegraph.service on wdqs1013 - failed http probes
  • 13:06 elukey: `elukey@puppetmaster1001:~$ sudo puppet cert destroy graphite.discovery.wmnet`
  • 13:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1163 T383689', diff saved to https://phabricator.wikimedia.org/P72073 and previous config saved to /var/cache/conftool/dbconfig/20250115-130511-marostegui.json
  • 13:03 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1184 to s1 primary T383689', diff saved to https://phabricator.wikimedia.org/P72072 and previous config saved to /var/cache/conftool/dbconfig/20250115-130354-root.json
  • 13:03 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2368 to wikikube-worker2227 - jelto@cumin1002"
  • 13:03 marostegui: Starting s1 eqiad failover from db1163 to db1184 - T383689
  • 12:59 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:58 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2368 to wikikube-worker2227
  • 12:58 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1184 from API T383689', diff saved to https://phabricator.wikimedia.org/P72071 and previous config saved to /var/cache/conftool/dbconfig/20250115-125818-root.json
  • 12:58 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2367 to wikikube-worker2226
  • 12:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 34 hosts with reason: Primary switchover s1 T383689
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2226
  • 12:57 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1184 with weight 0 T383689', diff saved to https://phabricator.wikimedia.org/P72070 and previous config saved to /var/cache/conftool/dbconfig/20250115-125732-root.json
  • 12:57 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2226
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2367 to wikikube-worker2226 - jelto@cumin1002"
  • 12:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 34 hosts with reason: Primary switchover s1 T383689
  • 12:57 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2367 to wikikube-worker2226 - jelto@cumin1002"
  • 12:53 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:53 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2367 to wikikube-worker2226
  • 12:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2366 to wikikube-worker2225
  • 12:51 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2225
  • 12:51 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2225
  • 12:51 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:51 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2366 to wikikube-worker2225 - jelto@cumin1002"
  • 12:50 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2366 to wikikube-worker2225 - jelto@cumin1002"
  • 12:46 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:46 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2366 to wikikube-worker2225
  • 12:45 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2359 to wikikube-worker2224
  • 12:44 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2224
  • 12:44 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2224
  • 12:44 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:44 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2359 to wikikube-worker2224 - jelto@cumin1002"
  • 12:43 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2359 to wikikube-worker2224 - jelto@cumin1002"
  • 12:40 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:39 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2359 to wikikube-worker2224
  • 12:35 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2130 from dbctl T383766', diff saved to https://phabricator.wikimedia.org/P72069 and previous config saved to /var/cache/conftool/dbconfig/20250115-123501-marostegui.json
  • 12:30 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2359,2366-2368].codfw.wmnet
  • 12:28 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2359,2366-2368].codfw.wmnet
  • 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2130 T374623', diff saved to https://phabricator.wikimedia.org/P72068 and previous config saved to /var/cache/conftool/dbconfig/20250115-122419-marostegui.json
  • 12:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2134,2160,2234].codfw.wmnet with reason: maintenance
  • 12:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2134,2160,2234].codfw.wmnet with reason: maintenance
  • 12:02 kart_: Updated cxserver to 2025-01-15-103159-production (T377966)
  • 11:59 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 11:59 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 11:58 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 11:58 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 11:56 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 11:56 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 11:32 marostegui@cumin1002: dbctl commit (dc=all): 'es1043 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72067 and previous config saved to /var/cache/conftool/dbconfig/20250115-113210-root.json
  • 11:17 marostegui@cumin1002: dbctl commit (dc=all): 'es1043 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72066 and previous config saved to /var/cache/conftool/dbconfig/20250115-111704-root.json
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2220-2223].codfw.wmnet
  • 11:15 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2220-2223].codfw.wmnet
  • 11:02 marostegui@cumin1002: dbctl commit (dc=all): 'es1043 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72064 and previous config saved to /var/cache/conftool/dbconfig/20250115-110159-root.json
  • 10:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2222.codfw.wmnet with OS bookworm
  • 10:46 marostegui@cumin1002: dbctl commit (dc=all): 'es1043 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72063 and previous config saved to /var/cache/conftool/dbconfig/20250115-104654-root.json
  • 10:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2223.codfw.wmnet with OS bookworm
  • 10:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2221.codfw.wmnet with OS bookworm
  • 10:36 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2220.codfw.wmnet with OS bookworm
  • 10:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2222.codfw.wmnet with reason: host reimage
  • 10:31 marostegui@cumin1002: dbctl commit (dc=all): 'es1043 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72062 and previous config saved to /var/cache/conftool/dbconfig/20250115-103149-root.json
  • 10:25 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2223.codfw.wmnet with reason: host reimage
  • 10:21 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2221.codfw.wmnet with reason: host reimage
  • 10:19 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2223.codfw.wmnet with reason: host reimage
  • 10:18 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2222.codfw.wmnet with reason: host reimage
  • 10:17 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2221.codfw.wmnet with reason: host reimage
  • 10:16 marostegui@cumin1002: dbctl commit (dc=all): 'es1043 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P72061 and previous config saved to /var/cache/conftool/dbconfig/20250115-101643-root.json
  • 10:16 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2220.codfw.wmnet with reason: host reimage
  • 10:13 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru and A:cp
  • 10:13 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2220.codfw.wmnet with reason: host reimage
  • 10:08 jayme: re-enabling puppet on nodes using k8s::package resource - T341984
  • 10:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es1024.eqiad.wmnet with reason: cloning
  • 10:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es1024.eqiad.wmnet with reason: cloning
  • 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1024 T382569', diff saved to https://phabricator.wikimedia.org/P72060 and previous config saved to /var/cache/conftool/dbconfig/20250115-100228-marostegui.json
  • 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1025 to eqiad es5 master dbmaint T382569', diff saved to https://phabricator.wikimedia.org/P72059 and previous config saved to /var/cache/conftool/dbconfig/20250115-100207-marostegui.json
  • 10:01 marostegui@cumin1002: dbctl commit (dc=all): 'es1043 (re)pooling @ 4%: Repooling', diff saved to https://phabricator.wikimedia.org/P72058 and previous config saved to /var/cache/conftool/dbconfig/20250115-100138-root.json
  • 10:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2223
  • 10:01 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2223
  • 10:01 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2223.codfw.wmnet with OS bookworm
  • 10:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2222
  • 10:00 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2222
  • 09:59 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2222.codfw.wmnet with OS bookworm
  • 09:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2221
  • 09:59 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2221
  • 09:59 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2221.codfw.wmnet with OS bookworm
  • 09:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2220
  • 09:55 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2220
  • 09:54 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2220.codfw.wmnet with OS bookworm
  • 09:54 jayme: disabling puppet on 543 nodes using k8s::package resource - T341984
  • 09:51 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2220-2223].codfw.wmnet
  • 09:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2220-2223].codfw.wmnet
  • 09:50 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru and A:cp
  • 09:47 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp
  • 09:46 marostegui@cumin1002: dbctl commit (dc=all): 'es1043 (re)pooling @ 3%: Repooling', diff saved to https://phabricator.wikimedia.org/P72057 and previous config saved to /var/cache/conftool/dbconfig/20250115-094632-root.json
  • 09:44 jelto@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2223.codfw.wmnet
  • 09:38 jelto@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2222.codfw.wmnet
  • 09:34 urbanecm@deploy2002: Finished scap sync-world: Backport for Add license messages for new Wikinews licenses (T383338), Update French wikinews license to CC-BY-SA 4.0 (T381946), htmlform: fix defaults for namespace and relative in titlesmultiselect (T383133), htmlform: fix defaults for namespace and relative in titlesmultiselect (T383133) (durat
  • 09:31 marostegui@cumin1002: dbctl commit (dc=all): 'es1043 (re)pooling @ 2%: Repooling', diff saved to https://phabricator.wikimedia.org/P72056 and previous config saved to /var/cache/conftool/dbconfig/20250115-093127-root.json
  • 09:30 jelto@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2221.codfw.wmnet
  • 09:21 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp
  • 09:20 urbanecm@deploy2002: matmarex, urbanecm, dreamrimmer: Continuing with sync
  • 09:19 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72055 and previous config saved to /var/cache/conftool/dbconfig/20250115-091943-root.json
  • 09:16 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp
  • 09:16 marostegui@cumin1002: dbctl commit (dc=all): 'es1043 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P72054 and previous config saved to /var/cache/conftool/dbconfig/20250115-091622-root.json
  • 09:15 urbanecm@deploy2002: matmarex, urbanecm, dreamrimmer: Backport for Add license messages for new Wikinews licenses (T383338), Update French wikinews license to CC-BY-SA 4.0 (T381946), htmlform: fix defaults for namespace and relative in titlesmultiselect (T383133), htmlform: fix defaults for namespace and relative in titlesmultiselect (T383133)
  • 09:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2223.codfw.wmnet with OS bullseye
  • 09:11 jelto@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2220.codfw.wmnet
  • 09:10 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2222.codfw.wmnet with OS bullseye
  • 09:04 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72053 and previous config saved to /var/cache/conftool/dbconfig/20250115-090437-root.json
  • 09:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2221.codfw.wmnet with OS bullseye
  • 08:57 jelto: homer lsw1-d3-codfw* commit 'T377877'
  • 08:54 jelto: homer cr*codfw* commit 'T377877'
  • 08:54 urbanecm@deploy2002: Started scap sync-world: Backport for Add license messages for new Wikinews licenses (T383338), Update French wikinews license to CC-BY-SA 4.0 (T381946), htmlform: fix defaults for namespace and relative in titlesmultiselect (T383133), htmlform: fix defaults for namespace and relative in titlesmultiselect (T383133)
  • 08:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2223.codfw.wmnet with reason: host reimage
  • 08:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2220.codfw.wmnet with OS bullseye
  • 08:53 urbanecm@deploy2002: sync-world aborted: Backport for Add license messages for new Wikinews licenses (T383338), Update French wikinews license to CC-BY-SA 4.0 (T381946) (duration: 05m 25s)
  • 08:53 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp
  • 08:51 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2223.codfw.wmnet with reason: host reimage
  • 08:49 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72052 and previous config saved to /var/cache/conftool/dbconfig/20250115-084932-root.json
  • 08:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2222.codfw.wmnet with reason: host reimage
  • 08:48 urbanecm@deploy2002: Started scap sync-world: Backport for Add license messages for new Wikinews licenses (T383338), Update French wikinews license to CC-BY-SA 4.0 (T381946)
  • 08:45 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2222.codfw.wmnet with reason: host reimage
  • 08:44 urbanecm@deploy2002: Finished scap sync-world: Backport for Enable abusefilter-log-detail for autoconfirmed users on en.wikibooks (T383332), Growth: Remove temporary config for clearing link recommendations (T379522) (duration: 16m 09s)
  • 08:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2221.codfw.wmnet with reason: host reimage
  • 08:38 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2221.codfw.wmnet with reason: host reimage
  • 08:37 urbanecm@deploy2002: dreamrimmer, urbanecm, migr: Continuing with sync
  • 08:35 urbanecm@deploy2002: dreamrimmer, urbanecm, migr: Backport for Enable abusefilter-log-detail for autoconfirmed users on en.wikibooks (T383332), Growth: Remove temporary config for clearing link recommendations (T379522) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2223
  • 08:35 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2223
  • 08:35 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2223
  • 08:35 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2223.codfw.wmnet 22.48.192.10.in-addr.arpa 2.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:35 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2223.codfw.wmnet 22.48.192.10.in-addr.arpa 2.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:35 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:35 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2223 - jelto@cumin1002"
  • 08:34 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2223 - jelto@cumin1002"
  • 08:34 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72051 and previous config saved to /var/cache/conftool/dbconfig/20250115-083427-root.json
  • 08:33 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2220.codfw.wmnet with reason: host reimage
  • 08:31 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:31 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2223
  • 08:31 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2223.codfw.wmnet with OS bullseye
  • 08:31 jelto@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2223.codfw.wmnet
  • 08:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2220.codfw.wmnet with reason: host reimage
  • 08:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2222
  • 08:30 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2222
  • 08:29 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2222
  • 08:29 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2222.codfw.wmnet 21.48.192.10.in-addr.arpa 1.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:29 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2222.codfw.wmnet 21.48.192.10.in-addr.arpa 1.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:29 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:29 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2222 - jelto@cumin1002"
  • 08:29 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2222 - jelto@cumin1002"
  • 08:28 urbanecm@deploy2002: Started scap sync-world: Backport for Enable abusefilter-log-detail for autoconfirmed users on en.wikibooks (T383332), Growth: Remove temporary config for clearing link recommendations (T379522)
  • 08:26 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:25 marostegui@cumin1002: dbctl commit (dc=all): 'Add es1043 to dbctl depooled T382569', diff saved to https://phabricator.wikimedia.org/P72050 and previous config saved to /var/cache/conftool/dbconfig/20250115-082554-marostegui.json
  • 08:25 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2222
  • 08:25 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2222.codfw.wmnet with OS bullseye
  • 08:25 jelto@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2222.codfw.wmnet
  • 08:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2221
  • 08:22 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2221
  • 08:22 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2221
  • 08:22 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2221.codfw.wmnet 20.48.192.10.in-addr.arpa 0.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:22 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2221.codfw.wmnet 20.48.192.10.in-addr.arpa 0.2.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:22 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:22 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2221 - jelto@cumin1002"
  • 08:22 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2221 - jelto@cumin1002"
  • 08:19 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72049 and previous config saved to /var/cache/conftool/dbconfig/20250115-081922-root.json
  • 08:19 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:18 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2221
  • 08:18 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2221.codfw.wmnet with OS bullseye
  • 08:18 jelto@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2221.codfw.wmnet
  • 08:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2220
  • 08:14 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2220
  • 08:14 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2220
  • 08:14 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2220.codfw.wmnet 19.48.192.10.in-addr.arpa 9.1.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:14 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2220.codfw.wmnet 19.48.192.10.in-addr.arpa 9.1.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:14 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:14 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2220 - jelto@cumin1002"
  • 08:13 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2220 - jelto@cumin1002"
  • 08:10 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:10 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2220
  • 08:10 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2220.codfw.wmnet with OS bullseye
  • 08:10 jelto@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2220.codfw.wmnet
  • 08:08 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2220.codfw.wmnet wikikube-worker2221.codfw.wmnet wikikube-worker2222.codfw.wmnet wikikube-worker2223.codfw.wmnet on all recursors
  • 08:08 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2220.codfw.wmnet wikikube-worker2221.codfw.wmnet wikikube-worker2222.codfw.wmnet wikikube-worker2223.codfw.wmnet on all recursors
  • 08:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2372 to wikikube-worker2223
  • 07:59 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2223
  • 07:59 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2223
  • 07:59 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:59 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2372 to wikikube-worker2223 - jelto@cumin1002"
  • 07:58 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2372 to wikikube-worker2223 - jelto@cumin1002"
  • 07:55 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 07:55 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2372 to wikikube-worker2223
  • 07:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2371 to wikikube-worker2222
  • 07:50 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2222
  • 07:49 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2222
  • 07:49 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:49 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2371 to wikikube-worker2222 - jelto@cumin1002"
  • 07:43 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2371 to wikikube-worker2222 - jelto@cumin1002"
  • 07:39 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 07:39 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2371 to wikikube-worker2222
  • 07:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2370 to wikikube-worker2221
  • 07:37 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2221
  • 07:37 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2221
  • 07:37 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:37 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2370 to wikikube-worker2221 - jelto@cumin1002"
  • 07:36 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2370 to wikikube-worker2221 - jelto@cumin1002"
  • 07:30 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 07:29 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2370 to wikikube-worker2221
  • 07:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2369 to wikikube-worker2220
  • 07:05 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2220
  • 07:04 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2220
  • 07:04 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:04 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2369 to wikikube-worker2220 - jelto@cumin1002"
  • 07:04 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2369 to wikikube-worker2220 - jelto@cumin1002"
  • 06:59 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 06:58 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2369 to wikikube-worker2220
  • 06:52 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2369-2372].codfw.wmnet
  • 06:50 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2369-2372].codfw.wmnet
  • 05:50 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 05:30 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 05:23 kart_: Updated cxserver to 2025-01-13-044601-production (T382294)
  • 05:22 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:22 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:20 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:20 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:15 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:14 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 00:09 tzatziki: removing 1 file for legal compliance

2025-01-14

  • 23:54 tzatziki: removing 2 files for legal compliance
  • 23:41 eileen: civicrm upgraded from 1ea537d3 to 4206a253
  • 23:14 tzatziki: removing 2 files for legal compliance
  • away: UTC late deploys done
  • 22:58 tgr@deploy2002: Finished scap sync-world: Backport for Revert "Enable SUL3 on test wikis" (duration: 12m 59s)
  • 22:50 tgr@deploy2002: tgr, trainbranchbot: Continuing with sync
  • 22:49 tzatziki: removing 5 files for legal compliance
  • 22:49 tgr@deploy2002: tgr, trainbranchbot: Backport for Revert "Enable SUL3 on test wikis" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:45 tgr@deploy2002: Started scap sync-world: Backport for Revert "Enable SUL3 on test wikis"
  • 22:31 tzatziki: removing 8 files for legal compliance
  • 22:24 tgr@deploy2002: Sync cancelled.
  • 22:14 tgr@deploy2002: tgr: Backport for Enable SUL3 on test wikis (T383729) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:07 tgr@deploy2002: Started scap sync-world: Backport for Enable SUL3 on test wikis (T383729)
  • 22:02 cjming@deploy2002: Finished scap sync-world: Backport for Add azwiki to mobile-anon-talk dblist (T383394) (duration: 17m 03s)
  • 21:55 cjming@deploy2002: nmw03, cjming: Continuing with sync
  • 21:52 cjming@deploy2002: nmw03, cjming: Backport for Add azwiki to mobile-anon-talk dblist (T383394) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:48 swfrench-wmf: deployed conftool 4.2.0 fleet-wide as of ~ 20:00 UTC (previously 4.1.0)
  • 21:45 cjming@deploy2002: Started scap sync-world: Backport for Add azwiki to mobile-anon-talk dblist (T383394)
  • 21:43 cjming@deploy2002: Finished scap sync-world: Backport for Fix links pointing to m:Help:Export (duration: 13m 00s)
  • 21:36 cjming@deploy2002: tacsipacsi, cjming: Continuing with sync
  • 21:35 cjming@deploy2002: tacsipacsi, cjming: Backport for Fix links pointing to m:Help:Export synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:30 cjming@deploy2002: Started scap sync-world: Backport for Fix links pointing to m:Help:Export
  • 21:28 cjming@deploy2002: Finished scap sync-world: Backport for SUL3: Add auth domain to URL tests (T380574) (duration: 18m 00s)
  • 21:19 cjming@deploy2002: cjming, tgr: Continuing with sync
  • 21:17 cjming@deploy2002: cjming, tgr: Backport for SUL3: Add auth domain to URL tests (T380574) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:10 cjming@deploy2002: Started scap sync-world: Backport for SUL3: Add auth domain to URL tests (T380574)
  • 21:05 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1098-1101].eqiad.wmnet
  • 21:05 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1098-1101].eqiad.wmnet
  • 21:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudelastic[1005-1006].eqiad.wmnet
  • 21:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic[1005-1006].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 21:03 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic[1005-1006].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 20:56 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 20:48 bking@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic[1005-1006].eqiad.wmnet
  • 20:32 bking@cumin2002: conftool action : set/pooled=yes:weight=10; selector: name=cloudelastic1012.eqiad.wmnet
  • 20:32 bking@cumin2002: conftool action : set/pooled=yes:weight=10; selector: name=cloudelastic1011.eqiad.wmnet
  • 20:15 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
  • 20:15 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
  • 20:15 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:14 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:01 cdanis@cumin2002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "[not really into teleological thinking] - cdanis@cumin2002"
  • 20:01 cdanis@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: [not really into teleological thinking] - cdanis@cumin2002
  • 20:01 cdanis@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: [not really into teleological thinking] - cdanis@cumin2002
  • 20:01 cdanis@cumin2002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "[not really into teleological thinking] - cdanis@cumin2002"
  • 19:42 moritzm: installing rsync security updates on bullseye
  • 19:29 brennen@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.44.0-wmf.12 refs T382363
  • 19:12 brennen: 1.44.0-wmf.12 train (T382363): no current blockers, rolling to group0
  • 18:57 swfrench@deploy2002: Finished scap sync-world: k8s-only deploy to clear noop chart version diffs (duration: 02m 15s)
  • 18:55 swfrench@deploy2002: Started scap sync-world: k8s-only deploy to clear noop chart version diffs
  • 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 18:49 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 18:41 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 18:41 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 18:26 moritzm: installing rsync security updates on bookworm
  • 18:22 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1101.eqiad.wmnet with OS bookworm
  • 18:06 kamila@deploy2002: Finished scap sync-world: enable auth.wikimedia.org (duration: 17m 55s)
  • 18:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1101.eqiad.wmnet with reason: host reimage
  • 18:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1100.eqiad.wmnet with OS bookworm
  • 18:00 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1101.eqiad.wmnet with reason: host reimage
  • 17:51 kamila@deploy2002: Started scap sync-world: enable auth.wikimedia.org
  • 17:46 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1100.eqiad.wmnet with reason: host reimage
  • 17:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1101
  • 17:44 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1101
  • 17:44 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1101.eqiad.wmnet with OS bookworm
  • 17:42 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1100.eqiad.wmnet with reason: host reimage
  • 17:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:38 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:36 swfrench-wmf: continuing shellbox-syntaxhighlight pilot on PHP 8.1 (all eqiad replicas) - T377038
  • 17:36 kamila@dns1004: END - running authdns-update
  • 17:34 kamila@dns1004: START - running authdns-update
  • 17:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1100
  • 17:27 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1100
  • 17:27 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1100.eqiad.wmnet with OS bookworm
  • 17:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:15 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:15 swfrench-wmf: starting shellbox-syntaxhighlight pilot on PHP 8.1 (1 replica, eqiad only) - T377038
  • 17:13 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1100.eqiad.wmnet with OS bookworm
  • 17:13 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1101.eqiad.wmnet with OS bookworm
  • 17:04 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 17:03 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 16:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1098.eqiad.wmnet with OS bookworm
  • 16:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1099.eqiad.wmnet with OS bookworm
  • 16:39 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1098.eqiad.wmnet with reason: host reimage
  • 16:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1099.eqiad.wmnet with reason: host reimage
  • 16:32 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1098.eqiad.wmnet with reason: host reimage
  • 16:31 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1099.eqiad.wmnet with reason: host reimage
  • 16:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1098
  • 16:16 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1098
  • 16:16 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1098.eqiad.wmnet with OS bookworm
  • 16:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1101
  • 16:16 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1101
  • 16:16 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1101.eqiad.wmnet with OS bookworm
  • 16:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1100
  • 16:16 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1100
  • 16:16 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1100.eqiad.wmnet with OS bookworm
  • 16:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1099
  • 16:15 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1099
  • 16:15 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1099.eqiad.wmnet with OS bookworm
  • 16:14 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1098.eqiad.wmnet wikikube-worker1099.eqiad.wmnet wikikube-worker1100.eqiad.wmnet wikikube-worker1101.eqiad.wmnet on all recursors
  • 16:14 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1098.eqiad.wmnet wikikube-worker1099.eqiad.wmnet wikikube-worker1100.eqiad.wmnet wikikube-worker1101.eqiad.wmnet on all recursors
  • 16:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1419 to wikikube-worker1101
  • 16:13 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1101
  • 16:12 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1101
  • 16:12 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1416 to wikikube-worker1100
  • 16:10 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1100
  • 16:09 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:09 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1100
  • 16:09 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:07 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1414 to wikikube-worker1098
  • 16:07 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1098
  • 16:06 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:05 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1098
  • 16:05 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:05 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1414 to wikikube-worker1098 - kamila@cumin1002"
  • 16:05 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1414 to wikikube-worker1098 - kamila@cumin1002"
  • 16:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1415 to wikikube-worker1099
  • 16:02 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1099
  • 16:01 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:01 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1099
  • 16:01 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:01 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1415 to wikikube-worker1099 - kamila@cumin1002"
  • 16:00 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1415 to wikikube-worker1099 - kamila@cumin1002"
  • 15:58 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1414 to wikikube-worker1098
  • 15:56 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=99) from mw1414 to wikikube-worker1098
  • 15:56 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:56 kamila@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 15:56 kamila@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1414 to wikikube-worker1098 - kamila@cumin1002"
  • 15:56 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1414 to wikikube-worker1098 - kamila@cumin1002"
  • 15:55 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1419 to wikikube-worker1101
  • 15:54 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1416 to wikikube-worker1100
  • 15:54 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1415 to wikikube-worker1099
  • 15:53 moritzm: import prometheus-mysqld-exporter 0.13.0-1~bpo11+1 to the main component of bullseye-wikimedia (import from bullseye-backports which is going away) T383557
  • 15:52 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:52 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1414 to wikikube-worker1098
  • 15:50 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1414-1416,1419].eqiad.wmnet
  • 15:48 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1414-1416,1419].eqiad.wmnet
  • 15:47 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2216-2219].codfw.wmnet
  • 15:47 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2216-2219].codfw.wmnet
  • 15:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 15:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 15:45 jelto: homer 'cr*codfw*' commit 'T377877'
  • 15:44 jelto: homer 'lsw1-d3-codfw*' commit 'T377877'
  • 15:34 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:33 Lucas_WMDE: previous deployment also included fix(tracking): TimingMetric:observe records milliseconds (T383208)
  • 15:31 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Revert "Make WikibaseQualityConstraints use split-graph query service" (T374021), Make WikimediaCampaignEvents use split-graph query service (T377956) (duration: 16m 03s)
  • 15:27 marostegui: Stop in sync db2133 db2233 m2 codfw dbmaint T373579
  • 15:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db[2133,2160,2233].codfw.wmnet with reason: cloning
  • 15:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 10:00:00 on db[2133,2160,2233].codfw.wmnet with reason: cloning
  • 15:24 lucaswerkmeister-wmde@deploy2002: stevemunene, lucaswerkmeister-wmde: Continuing with sync
  • 15:21 lucaswerkmeister-wmde@deploy2002: stevemunene, lucaswerkmeister-wmde: Backport for Revert "Make WikibaseQualityConstraints use split-graph query service" (T374021), Make WikimediaCampaignEvents use split-graph query service (T377956) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:15 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Revert "Make WikibaseQualityConstraints use split-graph query service" (T374021), Make WikimediaCampaignEvents use split-graph query service (T377956)
  • 15:12 lucaswerkmeister-wmde@deploy2002: Sync cancelled.
  • 15:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 15:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 15:01 lucaswerkmeister-wmde@deploy2002: stevemunene, lucaswerkmeister-wmde: Backport for Make WikibaseQualityConstraints use split-graph query service (T374021), Make WikimediaCampaignEvents use split-graph query service (T377956) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2219.codfw.wmnet with OS bookworm
  • 14:58 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 14:55 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Make WikibaseQualityConstraints use split-graph query service (T374021), Make WikimediaCampaignEvents use split-graph query service (T377956)
  • 14:53 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for test(2)wiki: Explicitly assign event organizer rights to all users (T376822) (duration: 22m 26s)
  • 14:47 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp
  • 14:46 lucaswerkmeister-wmde@deploy2002: daimona, lucaswerkmeister-wmde: Continuing with sync
  • 14:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 14:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2219.codfw.wmnet with reason: host reimage
  • 14:37 lucaswerkmeister-wmde@deploy2002: daimona, lucaswerkmeister-wmde: Backport for test(2)wiki: Explicitly assign event organizer rights to all users (T376822) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:35 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2219.codfw.wmnet with reason: host reimage
  • 14:31 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for test(2)wiki: Explicitly assign event organizer rights to all users (T376822)
  • 14:29 root@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for dbprov1003.eqiad.wmnet: Renew puppet certificate - root@cumin1002
  • 14:29 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Enable CampaignEvents extension on idwiki, itwiki, mswiki, and plwiki (T383154) (duration: 23m 59s)
  • 14:28 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2218.codfw.wmnet with OS bookworm
  • 14:27 root@cumin1002: START - Cookbook sre.puppet.renew-cert for dbprov1003.eqiad.wmnet: Renew puppet certificate - root@cumin1002
  • 14:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2217.codfw.wmnet with OS bookworm
  • 14:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2216.codfw.wmnet with OS bookworm
  • 14:21 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, daimona: Continuing with sync
  • 14:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2219
  • 14:18 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2219
  • 14:18 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2219.codfw.wmnet with OS bookworm
  • 14:17 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2219.codfw.wmnet with OS bookworm
  • 14:15 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, daimona: Backport for Enable CampaignEvents extension on idwiki, itwiki, mswiki, and plwiki (T383154) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2218.codfw.wmnet with reason: host reimage
  • 14:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2217.codfw.wmnet with reason: host reimage
  • 14:05 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2218.codfw.wmnet with reason: host reimage
  • 14:05 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Enable CampaignEvents extension on idwiki, itwiki, mswiki, and plwiki (T383154)
  • 14:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2216.codfw.wmnet with reason: host reimage
  • 14:02 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp
  • 14:01 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 14:01 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 14:01 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2217.codfw.wmnet with reason: host reimage
  • 13:57 jayme: imported kubernetes 1.31.4-1 to bookworm-wikimedia - T341984
  • 13:56 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2216.codfw.wmnet with reason: host reimage
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2219
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2219
  • 13:53 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2219
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2219.codfw.wmnet 148.48.192.10.in-addr.arpa 8.4.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:53 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2219.codfw.wmnet 148.48.192.10.in-addr.arpa 8.4.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:53 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2219 - jelto@cumin1002"
  • 13:53 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2219 - jelto@cumin1002"
  • 13:50 jayme: imported calico 3.29.1-1 to bookworm-wikimedia - T341984
  • 13:48 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:48 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2219
  • 13:48 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2219.codfw.wmnet with OS bookworm
  • 13:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2218
  • 13:48 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2218
  • 13:48 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2218
  • 13:48 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2218.codfw.wmnet 147.48.192.10.in-addr.arpa 7.4.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:48 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2218.codfw.wmnet 147.48.192.10.in-addr.arpa 7.4.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:48 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:48 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2218 - jelto@cumin1002"
  • 13:47 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2218 - jelto@cumin1002"
  • 13:44 jayme: imported kubernetes 1.23.14-5 to bullseye/bookworm-wikimedia - T341984
  • 13:44 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:44 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:44 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2218
  • 13:44 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2217
  • 13:44 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2217
  • 13:44 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2217
  • 13:43 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2217.codfw.wmnet 146.48.192.10.in-addr.arpa 6.4.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:43 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2217.codfw.wmnet 146.48.192.10.in-addr.arpa 6.4.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:43 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:43 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2217 - jelto@cumin1002"
  • 13:43 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2217 - jelto@cumin1002"
  • 13:42 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2218.codfw.wmnet with OS bookworm
  • 13:40 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:40 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2217
  • 13:39 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2217.codfw.wmnet with OS bookworm
  • 13:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2216
  • 13:39 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2216
  • 13:37 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2216
  • 13:37 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2216.codfw.wmnet 145.48.192.10.in-addr.arpa 5.4.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:37 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2216.codfw.wmnet 145.48.192.10.in-addr.arpa 5.4.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:37 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:37 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2216 - jelto@cumin1002"
  • 13:37 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2216 - jelto@cumin1002"
  • 13:34 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:32 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2216
  • 13:32 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2216.codfw.wmnet with OS bookworm
  • 13:29 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2216.codfw.wmnet wikikube-worker2217.codfw.wmnet wikikube-worker2218.codfw.wmnet wikikube-worker2219.codfw.wmnet on all recursors
  • 13:29 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2216.codfw.wmnet wikikube-worker2217.codfw.wmnet wikikube-worker2218.codfw.wmnet wikikube-worker2219.codfw.wmnet on all recursors
  • 13:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2376 to wikikube-worker2219
  • 13:28 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2219
  • 13:28 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2219
  • 13:28 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:28 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2376 to wikikube-worker2219 - jelto@cumin1002"
  • 13:28 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2376 to wikikube-worker2219 - jelto@cumin1002"
  • 13:24 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:24 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2376 to wikikube-worker2219
  • 13:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2375 to wikikube-worker2218
  • 13:22 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2218
  • 13:22 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2218
  • 13:22 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:22 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2375 to wikikube-worker2218 - jelto@cumin1002"
  • 13:22 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2375 to wikikube-worker2218 - jelto@cumin1002"
  • 13:17 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:17 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2375 to wikikube-worker2218
  • 13:16 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2374 to wikikube-worker2217
  • 13:16 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2217
  • 13:16 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2217
  • 13:16 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:15 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2374 to wikikube-worker2217 - jelto@cumin1002"
  • 13:15 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2374 to wikikube-worker2217 - jelto@cumin1002"
  • 13:11 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:11 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2374 to wikikube-worker2217
  • 13:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2373 to wikikube-worker2216
  • 13:10 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2216
  • 13:10 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2216
  • 13:10 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:10 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2373 to wikikube-worker2216 - jelto@cumin1002"
  • 13:09 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2373 to wikikube-worker2216 - jelto@cumin1002"
  • 13:06 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:06 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2373 to wikikube-worker2216
  • 13:01 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2373-2376].codfw.wmnet
  • 12:59 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2373-2376].codfw.wmnet
  • 12:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2128.codfw.wmnet
  • 12:52 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:52 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2128.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 12:52 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2128.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 12:49 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 12:44 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2128.codfw.wmnet
  • 12:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2212 gradually with 4 steps - Maint over
  • 12:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repool pc5 T383398', diff saved to https://phabricator.wikimedia.org/P72044 and previous config saved to /var/cache/conftool/dbconfig/20250114-120804-marostegui.json
  • 12:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depool pc5 eqiad codfw dbmaint T383398', diff saved to https://phabricator.wikimedia.org/P72043 and previous config saved to /var/cache/conftool/dbconfig/20250114-120234-marostegui.json
  • 12:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc[2015,2017].codfw.wmnet,pc[1014-1015,1017].eqiad.wmnet with reason: maintenance
  • 12:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc[2015,2017].codfw.wmnet,pc[1014-1015,1017].eqiad.wmnet with reason: maintenance
  • 11:55 ladsgroup@cumin1002: START - Cookbook sre.mysql.pool db2212 gradually with 4 steps - Maint over
  • 11:54 vgutierrez@dns1004: END - running authdns-update
  • 11:52 vgutierrez@dns1004: START - running authdns-update
  • 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'es1044 (re)pooling @ 100%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P72041 and previous config saved to /var/cache/conftool/dbconfig/20250114-114804-root.json
  • 11:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es1020.eqiad.wmnet
  • 11:38 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:38 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es1020.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 11:37 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es1020.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 11:34 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 11:33 marostegui@cumin1002: dbctl commit (dc=all): 'es1044 (re)pooling @ 75%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P72040 and previous config saved to /var/cache/conftool/dbconfig/20250114-113259-root.json
  • 11:28 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts es1020.eqiad.wmnet
  • 11:17 marostegui@cumin1002: dbctl commit (dc=all): 'es1044 (re)pooling @ 50%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P72039 and previous config saved to /var/cache/conftool/dbconfig/20250114-111754-root.json
  • 11:16 marostegui@cumin1002: dbctl commit (dc=all): 'Remove es1020 from dbctl for decommission T383578', diff saved to https://phabricator.wikimedia.org/P72038 and previous config saved to /var/cache/conftool/dbconfig/20250114-111647-marostegui.json
  • 11:14 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2239.codfw.wmnet with reason: reboot
  • 11:14 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2239.codfw.wmnet with reason: reboot
  • 11:13 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov1003.eqiad.wmnet with reason: os upgrade
  • 11:13 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov1003.eqiad.wmnet with reason: os upgrade
  • 11:05 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp
  • 11:02 marostegui@cumin1002: dbctl commit (dc=all): 'es1044 (re)pooling @ 25%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P72037 and previous config saved to /var/cache/conftool/dbconfig/20250114-110248-root.json
  • 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'es1044 (re)pooling @ 10%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P72036 and previous config saved to /var/cache/conftool/dbconfig/20250114-104743-root.json
  • 10:45 marostegui@dns1006: END - running authdns-update
  • 10:43 marostegui@dns1006: START - running authdns-update
  • 10:43 marostegui@dns1006: END - running authdns-update
  • 10:41 marostegui@dns1006: START - running authdns-update
  • 10:37 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp
  • 10:35 marostegui: Reboot db2235 m5 codfw master
  • 10:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2235.codfw.wmnet with reason: upgrade
  • 10:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on db2235.codfw.wmnet with reason: upgrade
  • 10:32 marostegui@cumin1002: dbctl commit (dc=all): 'es1044 (re)pooling @ 5%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P72035 and previous config saved to /var/cache/conftool/dbconfig/20250114-103238-root.json
  • 10:27 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw and A:cp
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'es1044 (re)pooling @ 4%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P72034 and previous config saved to /var/cache/conftool/dbconfig/20250114-101732-root.json
  • 10:05 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw and A:cp
  • 10:03 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp
  • 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'es1044 (re)pooling @ 3%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P72033 and previous config saved to /var/cache/conftool/dbconfig/20250114-100227-root.json
  • 10:00 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 09:59 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 09:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repool pc4 T383398', diff saved to https://phabricator.wikimedia.org/P72032 and previous config saved to /var/cache/conftool/dbconfig/20250114-095404-marostegui.json
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Promote pc2014 to codfw pc4 master dbmaint T383398', diff saved to https://phabricator.wikimedia.org/P72031 and previous config saved to /var/cache/conftool/dbconfig/20250114-095320-marostegui.json
  • 09:50 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2212-2215].codfw.wmnet
  • 09:50 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2212-2215].codfw.wmnet
  • 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'es1044 (re)pooling @ 2%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P72030 and previous config saved to /var/cache/conftool/dbconfig/20250114-094722-root.json
  • 09:47 root@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for dbprov2006.codfw.wmnet: Renew puppet certificate - root@cumin1002
  • 09:47 jelto: homer 'cr*codfw*' commit 'T377877'
  • 09:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on pc[2014-2016].codfw.wmnet,pc1016.eqiad.wmnet with reason: reorganizing pc4
  • 09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on pc[2014-2016].codfw.wmnet,pc1016.eqiad.wmnet with reason: reorganizing pc4
  • 09:44 root@cumin1002: START - Cookbook sre.puppet.renew-cert for dbprov2006.codfw.wmnet: Renew puppet certificate - root@cumin1002
  • 09:43 jelto: homer 'lsw1-c3-codfw*' commit 'T377877'
  • 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depool pc4 T383398', diff saved to https://phabricator.wikimedia.org/P72028 and previous config saved to /var/cache/conftool/dbconfig/20250114-094350-marostegui.json
  • 09:43 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp
  • 09:43 hashar@deploy2002: Finished scap sync-world: Backport for knwiki, knwikisource, knwiktionary, knwikiquote: update logo, wordmark (T382802), hiwikisource: logo fix (T310961) (duration: 16m 21s)
  • 09:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2215.codfw.wmnet with OS bookworm
  • 09:35 hashar@deploy2002: anzx, hashar: Continuing with sync
  • 09:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es[1022,1043].eqiad.wmnet with reason: cloning
  • 09:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es[1022,1043].eqiad.wmnet with reason: cloning
  • 09:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1022 T382569', diff saved to https://phabricator.wikimedia.org/P72027 and previous config saved to /var/cache/conftool/dbconfig/20250114-093315-marostegui.json
  • 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'es1044 (re)pooling @ 1%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P72026 and previous config saved to /var/cache/conftool/dbconfig/20250114-093216-root.json
  • 09:31 marostegui@cumin1002: dbctl commit (dc=all): 'Add es1044 to dbctl depooled T382569', diff saved to https://phabricator.wikimedia.org/P72025 and previous config saved to /var/cache/conftool/dbconfig/20250114-093147-marostegui.json
  • 09:31 hashar@deploy2002: anzx, hashar: Backport for knwiki, knwikisource, knwiktionary, knwikiquote: update logo, wordmark (T382802), hiwikisource: logo fix (T310961) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:26 hashar@deploy2002: Started scap sync-world: Backport for knwiki, knwikisource, knwiktionary, knwikiquote: update logo, wordmark (T382802), hiwikisource: logo fix (T310961)
  • 09:19 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2215.codfw.wmnet with reason: host reimage
  • 09:15 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2215.codfw.wmnet with reason: host reimage
  • 09:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 09:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 09:10 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 09:10 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 09:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
  • 09:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 09:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 09:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2214.codfw.wmnet with OS bookworm
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2215
  • 08:59 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2215
  • 08:59 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2215.codfw.wmnet with OS bookworm
  • 08:58 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2215.codfw.wmnet with OS bookworm
  • 08:58 marostegui@cumin1002: dbctl commit (dc=all): 'es1023 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72024 and previous config saved to /var/cache/conftool/dbconfig/20250114-085802-root.json
  • 08:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2213.codfw.wmnet with OS bookworm
  • 08:55 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbprov2006.codfw.wmnet with reason: os upgrade
  • 08:55 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on dbprov2006.codfw.wmnet with reason: os upgrade
  • 08:53 gmodena@deploy2002: Finished scap sync-world: Backport for Revert^2 "config: remove eventbus instrumentation setting" (duration: 21m 52s)
  • 08:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2212.codfw.wmnet with OS bookworm
  • 08:43 gmodena@deploy2002: otto, gmodena: Continuing with sync
  • 08:42 marostegui@cumin1002: dbctl commit (dc=all): 'es1023 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72023 and previous config saved to /var/cache/conftool/dbconfig/20250114-084256-root.json
  • 08:40 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2214.codfw.wmnet with reason: host reimage
  • 08:38 gmodena@deploy2002: otto, gmodena: Backport for Revert^2 "config: remove eventbus instrumentation setting" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:36 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2213.codfw.wmnet with reason: host reimage
  • 08:34 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2214.codfw.wmnet with reason: host reimage
  • 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2212.codfw.wmnet with reason: host reimage
  • 08:31 gmodena@deploy2002: Started scap sync-world: Backport for Revert^2 "config: remove eventbus instrumentation setting"
  • 08:31 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2213.codfw.wmnet with reason: host reimage
  • 08:27 marostegui@cumin1002: dbctl commit (dc=all): 'es1023 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72022 and previous config saved to /var/cache/conftool/dbconfig/20250114-082751-root.json
  • 08:26 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2212.codfw.wmnet with reason: host reimage
  • 08:25 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2215
  • 08:25 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2215
  • 08:25 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2215
  • 08:25 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2215.codfw.wmnet 62.32.192.10.in-addr.arpa 2.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:25 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2215.codfw.wmnet 62.32.192.10.in-addr.arpa 2.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:25 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:25 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2215 - jelto@cumin1002"
  • 08:25 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2215 - jelto@cumin1002"
  • 08:22 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:21 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2215
  • 08:21 moritzm: installing perl security updates
  • 08:21 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2215.codfw.wmnet with OS bookworm
  • 08:19 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2214
  • 08:19 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2214
  • 08:19 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2214
  • 08:19 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2214.codfw.wmnet 61.32.192.10.in-addr.arpa 1.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:19 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2214.codfw.wmnet 61.32.192.10.in-addr.arpa 1.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:19 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:19 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2214 - jelto@cumin1002"
  • 08:19 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2214 - jelto@cumin1002"
  • 08:15 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:15 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2214
  • 08:15 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2214.codfw.wmnet with OS bookworm
  • 08:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2213
  • 08:14 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2213
  • 08:14 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2213
  • 08:14 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2213.codfw.wmnet 60.32.192.10.in-addr.arpa 0.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:14 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2213.codfw.wmnet 60.32.192.10.in-addr.arpa 0.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:14 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:14 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2213 - jelto@cumin1002"
  • 08:14 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2213 - jelto@cumin1002"
  • 08:12 marostegui@cumin1002: dbctl commit (dc=all): 'es1023 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72021 and previous config saved to /var/cache/conftool/dbconfig/20250114-081246-root.json
  • 08:10 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:10 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2213
  • 08:10 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2213.codfw.wmnet with OS bookworm
  • 08:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2212
  • 08:09 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2212
  • 08:08 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2212
  • 08:08 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2212.codfw.wmnet 59.32.192.10.in-addr.arpa 9.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:08 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2212.codfw.wmnet 59.32.192.10.in-addr.arpa 9.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:08 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:08 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2212 - jelto@cumin1002"
  • 08:08 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2212 - jelto@cumin1002"
  • 08:05 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:02 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2212
  • 08:02 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2212.codfw.wmnet with OS bookworm
  • 07:57 marostegui@cumin1002: dbctl commit (dc=all): 'es1023 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72020 and previous config saved to /var/cache/conftool/dbconfig/20250114-075741-root.json
  • 07:54 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2212.codfw.wmnet wikikube-worker2213.codfw.wmnet wikikube-worker2214.codfw.wmnet wikikube-worker2215.codfw.wmnet on all recursors
  • 07:54 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2212.codfw.wmnet wikikube-worker2213.codfw.wmnet wikikube-worker2214.codfw.wmnet wikikube-worker2215.codfw.wmnet on all recursors
  • 07:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2415 to wikikube-worker2215
  • 07:53 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2215
  • 07:53 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2215
  • 07:53 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:53 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2415 to wikikube-worker2215 - jelto@cumin1002"
  • 07:52 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2415 to wikikube-worker2215 - jelto@cumin1002"
  • 07:49 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 07:49 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2415 to wikikube-worker2215
  • 07:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2414 to wikikube-worker2214
  • 07:48 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2214
  • 07:47 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2214
  • 07:47 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:47 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2414 to wikikube-worker2214 - jelto@cumin1002"
  • 07:47 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2414 to wikikube-worker2214 - jelto@cumin1002"
  • 07:43 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 07:43 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2414 to wikikube-worker2214
  • 07:40 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2413 to wikikube-worker2213
  • 07:39 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2213
  • 07:39 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2213
  • 07:39 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:39 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2413 to wikikube-worker2213 - jelto@cumin1002"
  • 07:39 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2413 to wikikube-worker2213 - jelto@cumin1002"
  • 07:36 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 07:35 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2413 to wikikube-worker2213
  • 07:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2412 to wikikube-worker2212
  • 07:34 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2212
  • 07:33 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2212
  • 07:33 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:33 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2412 to wikikube-worker2212 - jelto@cumin1002"
  • 07:33 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2412 to wikikube-worker2212 - jelto@cumin1002"
  • 07:29 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 07:29 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2412 to wikikube-worker2212
  • 07:24 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2412-2415].codfw.wmnet
  • 07:21 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2412-2415].codfw.wmnet
  • 05:03 mwpresync@deploy2002: Pruned MediaWiki: 1.44.0-wmf.6 (duration: 03m 06s)
  • 04:54 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.44.0-wmf.12 refs T382363 (duration: 50m 57s)
  • 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.44.0-wmf.12 refs T382363

2025-01-13

  • 22:26 cwhite@deploy2002: Finished deploy [statsv/statsv@42a4331]: T382729 (duration: 00m 08s)
  • 22:26 cwhite@deploy2002: Started deploy [statsv/statsv@42a4331]: T382729
  • 21:53 bking@cumin2002: conftool action : set/pooled=no; selector: service=(cloudelastic-chi-ssl|cloudelastic-psi-ssl|cloudelastic-omega-ssl|cloudelastic-chi-ssl-public|cloudelastic-psi-ssl-public|cloudelastic-omega-ssl-public),name=cloudelastic1006.eqiad.wmnet
  • 21:53 bking@cumin2002: conftool action : set/pooled=no; selector: service=(cloudelastic-chi-ssl|cloudelastic-psi-ssl|cloudelastic-omega-ssl|cloudelastic-chi-ssl-public|cloudelastic-psi-ssl-public|cloudelastic-omega-ssl-public),name=cloudelastic1005.eqiad.wmnet
  • 21:50 cdanis@deploy2002: Finished scap sync-world: Backport for Add missing parsoid settings for new wikis (T381379 T381080 T378463) (duration: 13m 12s)
  • 21:42 cdanis@deploy2002: cdanis, pppery: Continuing with sync
  • 21:42 cdanis@deploy2002: cdanis, pppery: Backport for Add missing parsoid settings for new wikis (T381379 T381080 T378463) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:37 cdanis@deploy2002: Started scap sync-world: Backport for Add missing parsoid settings for new wikis (T381379 T381080 T378463)
  • 21:37 cdanis@deploy2002: Finished scap sync-world: Backport for Configure new wikis (T381379 T381080 T378463) (duration: 13m 23s)
  • 21:29 cdanis@deploy2002: cdanis, pppery: Continuing with sync
  • 21:29 cdanis@deploy2002: cdanis, pppery: Backport for Configure new wikis (T381379 T381080 T378463) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:24 cdanis@deploy2002: Started scap sync-world: Backport for Configure new wikis (T381379 T381080 T378463)
  • 21:23 cdanis@deploy2002: Finished scap sync-world: Backport for Set Flow to read-only on phase 2a wikis (T378834) (duration: 14m 02s)
  • 21:14 cdanis@deploy2002: kemayo, cdanis: Continuing with sync
  • 21:14 cdanis@deploy2002: kemayo, cdanis: Backport for Set Flow to read-only on phase 2a wikis (T378834) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:09 cdanis@deploy2002: Started scap sync-world: Backport for Set Flow to read-only on phase 2a wikis (T378834)
  • 19:08 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1097.eqiad.wmnet with OS bookworm
  • 19:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1096.eqiad.wmnet with OS bookworm
  • 18:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1097.eqiad.wmnet with reason: host reimage
  • 18:47 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1096.eqiad.wmnet with reason: host reimage
  • 18:43 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1097.eqiad.wmnet with reason: host reimage
  • 18:43 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1096.eqiad.wmnet with reason: host reimage
  • 18:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1097
  • 18:28 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1097
  • 18:27 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1097.eqiad.wmnet with OS bookworm
  • 18:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1096
  • 18:27 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1096
  • 18:27 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1096.eqiad.wmnet with OS bookworm
  • 18:26 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1096.eqiad.wmnet wikikube-worker1097.eqiad.wmnet on all recursors
  • 18:26 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1096.eqiad.wmnet wikikube-worker1097.eqiad.wmnet on all recursors
  • 18:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1446 to wikikube-worker1097
  • 18:26 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1097
  • 18:25 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1097
  • 18:25 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:25 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1446 to wikikube-worker1097 - kamila@cumin1002"
  • 18:25 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1446 to wikikube-worker1097 - kamila@cumin1002"
  • 18:23 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1445 to wikikube-worker1096
  • 18:23 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1096
  • 18:22 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1096
  • 18:22 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:22 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1445 to wikikube-worker1096 - kamila@cumin1002"
  • 18:21 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1445 to wikikube-worker1096 - kamila@cumin1002"
  • 18:21 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 18:21 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1446 to wikikube-worker1097
  • 18:15 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 18:14 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1445 to wikikube-worker1096
  • 17:46 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2208-2211].codfw.wmnet
  • 17:46 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2208-2211].codfw.wmnet
  • 17:45 jelto: sudo homer 'cr*codfw*' commit 'T377877'
  • 17:44 jelto: homer 'lsw1-c3-codfw*' commit 'T377877'
  • 17:40 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2211.codfw.wmnet with OS bookworm
  • 17:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2210.codfw.wmnet with OS bookworm
  • 17:25 cdanis@deploy2002: Finished scap sync-world: Backport for OpenTelemetry tracing to all wikis (T340552) (duration: 14m 00s)
  • 17:19 dzahn@dns1006: END - running authdns-update
  • 17:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2211.codfw.wmnet with reason: host reimage
  • 17:18 mutante: DNS - removing za.wikimedia.org and za.m.wikimedia.org - wiki was not created (T382730, T195926)
  • 17:17 dzahn@dns1006: START - running authdns-update
  • 17:16 cdanis@deploy2002: cdanis: Continuing with sync
  • 17:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2209.codfw.wmnet with OS bookworm
  • 17:15 cdanis@deploy2002: cdanis: Backport for OpenTelemetry tracing to all wikis (T340552) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:15 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2211.codfw.wmnet with reason: host reimage
  • 17:11 cdanis@deploy2002: Started scap sync-world: Backport for OpenTelemetry tracing to all wikis (T340552)
  • 17:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2208.codfw.wmnet with OS bookworm
  • 17:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2210.codfw.wmnet with reason: host reimage
  • 17:03 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2210.codfw.wmnet with reason: host reimage
  • 16:59 root@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for dbprov2005.codfw.wmnet: Renew puppet certificate - root@cumin1002
  • 16:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2211
  • 16:58 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2211
  • 16:58 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2211
  • 16:58 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2211.codfw.wmnet 66.32.192.10.in-addr.arpa 6.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:58 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2211.codfw.wmnet 66.32.192.10.in-addr.arpa 6.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:58 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:58 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2211 - jelto@cumin1002"
  • 16:58 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2211 - jelto@cumin1002"
  • 16:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2209.codfw.wmnet with reason: host reimage
  • 16:56 root@cumin1002: START - Cookbook sre.puppet.renew-cert for dbprov2005.codfw.wmnet: Renew puppet certificate - root@cumin1002
  • 16:54 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 16:54 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2211
  • 16:53 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2211.codfw.wmnet with OS bookworm
  • 16:52 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2209.codfw.wmnet with reason: host reimage
  • 16:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2208.codfw.wmnet with reason: host reimage
  • 16:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2210
  • 16:46 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2210
  • 16:44 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2208.codfw.wmnet with reason: host reimage
  • 16:43 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2210
  • 16:43 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2210.codfw.wmnet 65.32.192.10.in-addr.arpa 5.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:43 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2210.codfw.wmnet 65.32.192.10.in-addr.arpa 5.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:43 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:43 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2210 - jelto@cumin1002"
  • 16:43 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2210 - jelto@cumin1002"
  • 16:38 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 16:38 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2210
  • 16:38 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2210.codfw.wmnet with OS bookworm
  • 16:36 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2209
  • 16:36 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2209
  • 16:35 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2209
  • 16:35 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2209.codfw.wmnet 64.32.192.10.in-addr.arpa 4.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:35 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2209.codfw.wmnet 64.32.192.10.in-addr.arpa 4.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:35 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:35 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2209 - jelto@cumin1002"
  • 16:35 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2209 - jelto@cumin1002"
  • 16:32 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 16:32 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2209
  • 16:32 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2209.codfw.wmnet with OS bookworm
  • 16:27 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2208
  • 16:27 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2208
  • 16:27 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2208
  • 16:27 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2208.codfw.wmnet 63.32.192.10.in-addr.arpa 3.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:27 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2208.codfw.wmnet 63.32.192.10.in-addr.arpa 3.6.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:27 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:27 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2208 - jelto@cumin1002"
  • 16:27 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2208 - jelto@cumin1002"
  • 16:23 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 16:23 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2208
  • 16:23 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2208.codfw.wmnet with OS bookworm
  • 16:20 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2208.codfw.wmnet wikikube-worker2209.codfw.wmnet wikikube-worker2210.codfw.wmnet wikikube-worker2211.codfw.wmnet on all recursors
  • 16:20 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2208.codfw.wmnet wikikube-worker2209.codfw.wmnet wikikube-worker2210.codfw.wmnet wikikube-worker2211.codfw.wmnet on all recursors
  • 16:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2419 to wikikube-worker2211
  • 16:19 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2211
  • 16:19 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2211
  • 16:19 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:19 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2419 to wikikube-worker2211 - jelto@cumin1002"
  • 16:18 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2419 to wikikube-worker2211 - jelto@cumin1002"
  • 16:14 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 16:14 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2419 to wikikube-worker2211
  • 16:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2418 to wikikube-worker2210
  • 16:13 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2210
  • 16:13 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2210
  • 16:13 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:13 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2418 to wikikube-worker2210 - jelto@cumin1002"
  • 16:12 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2418 to wikikube-worker2210 - jelto@cumin1002"
  • 16:09 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 16:09 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2418 to wikikube-worker2210
  • 16:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2417 to wikikube-worker2209
  • 16:05 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2209
  • 16:05 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2209
  • 16:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:05 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2417 to wikikube-worker2209 - jelto@cumin1002"
  • 16:04 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2417 to wikikube-worker2209 - jelto@cumin1002"
  • 16:01 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 16:01 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2417 to wikikube-worker2209
  • 16:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2416 to wikikube-worker2208
  • 15:59 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2208
  • 15:59 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2208
  • 15:59 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:59 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2416 to wikikube-worker2208 - jelto@cumin1002"
  • 15:58 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2416 to wikikube-worker2208 - jelto@cumin1002"
  • 15:55 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 15:55 jelto@cumin1002: START - Cookbook sre.hosts.rename from mw2416 to wikikube-worker2208
  • 15:53 mutante: DNS - removing uz.wikimedia.org - wiki was never created (T270987)
  • 15:53 dzahn@dns1006: END - running authdns-update
  • 15:52 mutante: DNS - removing uz.wikimedia.org - wiki was never created (T382730)
  • 15:52 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2416-2419].codfw.wmnet
  • 15:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1020 T382569', diff saved to https://phabricator.wikimedia.org/P72015 and previous config saved to /var/cache/conftool/dbconfig/20250113-155153-marostegui.json
  • 15:51 dzahn@dns1006: START - running authdns-update
  • 15:51 marostegui@cumin1002: dbctl commit (dc=all): 'Set es1041 as es4 eqiad master dbmaint T382569', diff saved to https://phabricator.wikimedia.org/P72014 and previous config saved to /var/cache/conftool/dbconfig/20250113-155135-marostegui.json
  • 15:47 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2416-2419].codfw.wmnet
  • 15:44 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbprov2005.codfw.wmnet with reason: os upgrade
  • 15:44 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on dbprov2005.codfw.wmnet with reason: os upgrade
  • 15:33 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2206-2207].codfw.wmnet
  • 15:32 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2206-2207].codfw.wmnet
  • 15:32 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbprov2004.codfw.wmnet with reason: reboot
  • 15:32 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on dbprov2004.codfw.wmnet with reason: reboot
  • 15:31 jelto: homer 'cr*codfw*' commit 'T377877'
  • 15:30 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2128 from dbctl T383572', diff saved to https://phabricator.wikimedia.org/P72013 and previous config saved to /var/cache/conftool/dbconfig/20250113-153046-marostegui.json
  • 15:30 jelto: homer 'lsw1-c5-codfw*' commit 'T377877'
  • 15:28 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2207.codfw.wmnet with OS bookworm
  • 15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2128 T383572', diff saved to https://phabricator.wikimedia.org/P72012 and previous config saved to /var/cache/conftool/dbconfig/20250113-152828-marostegui.json
  • 15:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2234.codfw.wmnet with reason: maintenance
  • 15:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on db2234.codfw.wmnet with reason: maintenance
  • 15:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2206.codfw.wmnet with OS bookworm
  • 15:16 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 15:15 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 15:14 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 15:12 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 15:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2207.codfw.wmnet with reason: host reimage
  • 15:06 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2207.codfw.wmnet with reason: host reimage
  • 15:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2206.codfw.wmnet with reason: host reimage
  • 14:59 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2206.codfw.wmnet with reason: host reimage
  • 14:59 btullis@deploy2002: Finished deploy [airflow-dags/search@8c96899]: (no justification provided) (duration: 00m 24s)
  • 14:58 btullis@deploy2002: Started deploy [airflow-dags/search@8c96899]: (no justification provided)
  • 14:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72011 and previous config saved to /var/cache/conftool/dbconfig/20250113-145432-root.json
  • 14:52 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:51 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for prod: Enable $wgCampaignEventsEnableEventWikis (T380078) (duration: 32m 04s)
  • 14:49 moritzm: installing glibc bugfix updates for Bookworm
  • 14:48 jayme: re-enabling puppet on all hosts running kubelet - T383413
  • 14:47 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2207
  • 14:47 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2207
  • 14:46 otto@deploy2002: Finished deploy [analytics/refinery@f3945ee]: gobblin eventlogging_legacy - use EventStreamConfig to pull topics (duration: 01m 27s)
  • 14:45 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2207
  • 14:45 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2207.codfw.wmnet 166.32.192.10.in-addr.arpa 6.6.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:45 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2207.codfw.wmnet 166.32.192.10.in-addr.arpa 6.6.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:45 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:45 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2207 - jelto@cumin1002"
  • 14:45 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2207 - jelto@cumin1002"
  • 14:44 otto@deploy2002: Started deploy [analytics/refinery@f3945ee]: gobblin eventlogging_legacy - use EventStreamConfig to pull topics
  • 14:43 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, daimona: Continuing with sync
  • 14:42 dcausse@deploy2002: Finished deploy [airflow-dags/search@8c96899]: search: fix glent, import_cirrus_indexes and transfer_to_es (duration: 01m 44s)
  • 14:42 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:42 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2207
  • 14:41 jayme: disabling puppet on all hosts running kubelet - T383413
  • 14:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2206
  • 14:41 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2206
  • 14:41 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2206
  • 14:41 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2206.codfw.wmnet 167.32.192.10.in-addr.arpa 7.6.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:41 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2206.codfw.wmnet 167.32.192.10.in-addr.arpa 7.6.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:41 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:41 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2206 - jelto@cumin1002"
  • 14:41 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2206 - jelto@cumin1002"
  • 14:41 otto@deploy2002: Finished deploy [analytics/refinery@f3945ee] (hadoop-test): gobblin eventlogging_legacy - use EventStreamConfig to pull topics (duration: 01m 37s)
  • 14:40 dcausse@deploy2002: Started deploy [airflow-dags/search@8c96899]: search: fix glent, import_cirrus_indexes and transfer_to_es
  • 14:40 otto@deploy2002: Started deploy [analytics/refinery@f3945ee] (hadoop-test): gobblin eventlogging_legacy - use EventStreamConfig to pull topics
  • 14:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72010 and previous config saved to /var/cache/conftool/dbconfig/20250113-143926-root.json
  • 14:37 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:37 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2206
  • 14:37 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2207.codfw.wmnet with OS bookworm
  • 14:37 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2206.codfw.wmnet with OS bookworm
  • 14:31 btullis@dns1004: END - running authdns-update
  • 14:30 btullis@dns1004: START - running authdns-update
  • 14:26 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2206.codfw.wmnet wikikube-worker2207.codfw.wmnet on all recursors
  • 14:26 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2206.codfw.wmnet wikikube-worker2207.codfw.wmnet on all recursors
  • 14:25 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts eventlog1003.eqiad.wmnet
  • 14:25 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:24 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, daimona: Backport for prod: Enable $wgCampaignEventsEnableEventWikis (T380078) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72009 and previous config saved to /var/cache/conftool/dbconfig/20250113-142421-root.json
  • 14:24 root@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for dbprov2003.codfw.wmnet: Renew puppet certificate - root@cumin1002
  • 14:24 klausman@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 14:23 klausman@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 14:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2041 to wikikube-worker2207
  • 14:22 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 14:22 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2207
  • 14:21 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2207
  • 14:21 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:21 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2041 to wikikube-worker2207 - jelto@cumin1002"
  • 14:21 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2041 to wikikube-worker2207 - jelto@cumin1002"
  • 14:21 root@cumin1002: START - Cookbook sre.puppet.renew-cert for dbprov2003.codfw.wmnet: Renew puppet certificate - root@cumin1002
  • 14:19 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for prod: Enable $wgCampaignEventsEnableEventWikis (T380078)
  • 14:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2040 to wikikube-worker2206
  • 14:18 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2206
  • 14:18 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:18 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2206
  • 14:18 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:17 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2040 to wikikube-worker2206 - jelto@cumin1002"
  • 14:17 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2040 to wikikube-worker2206 - jelto@cumin1002"
  • 14:16 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2041 to wikikube-worker2207
  • 14:14 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:13 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2040 to wikikube-worker2206
  • 14:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P72008 and previous config saved to /var/cache/conftool/dbconfig/20250113-140916-root.json
  • 14:08 klausman@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 14:07 klausman@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 14:07 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts eventlog1003.eqiad.wmnet
  • 14:06 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[2040-2041].codfw.wmnet
  • 14:05 klausman@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 14:04 klausman@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 14:04 filippo@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=thanos-query,name=eqiad
  • 14:02 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[2040-2041].codfw.wmnet
  • 13:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P72006 and previous config saved to /var/cache/conftool/dbconfig/20250113-135410-root.json
  • 13:52 kamila_: homer cr*eqiad* commit 'wikikube decoms'
  • 13:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wikikube-worker[1013-1014,1017-1018].eqiad.wmnet
  • 13:48 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:48 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1013-1014,1017-1018].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - kamila@cumin1002"
  • 13:48 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1013-1014,1017-1018].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - kamila@cumin1002"
  • 13:41 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db[2134,2160,2234].codfw.wmnet with reason: maintenance
  • 13:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 10:00:00 on db[2134,2160,2234].codfw.wmnet with reason: maintenance
  • 13:28 filippo@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=thanos-query,name=eqiad
  • 13:27 filippo@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=thanos-swift,name=eqiad
  • 13:27 filippo@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=thanos-swift,name=eqiad
  • 13:25 godog: bounce thanos-store on titan1*
  • 13:24 godog: bounce thanos-query on titan1*
  • 13:19 kamila@cumin1002: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1013-1014,1017-1018].eqiad.wmnet
  • 13:18 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wikikube-worker[1008-1010].eqiad.wmnet
  • 13:18 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:18 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1008-1010].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - kamila@cumin1002"
  • 13:18 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1008-1010].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - kamila@cumin1002"
  • 13:13 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 13:10 Daimona: Creating new DB tables for the CampaignEvents extension in x1.testwiki, x1.test2wiki, x1.officewiki, and x1.wikishared # T379294 T381424
  • 12:53 kamila@cumin1002: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1008-1010].eqiad.wmnet
  • 12:50 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts wikikube-worker[1008-1010].eqiad.wmnet
  • 12:49 kamila@cumin1002: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1008-1010].eqiad.wmnet
  • 12:27 marostegui@dns1006: END - running authdns-update
  • 12:26 marostegui@dns1006: START - running authdns-update
  • 12:25 marostegui: Switch m2-master proxy
  • 12:21 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1008-1010,1013-1014,1017-1018].eqiad.wmnet
  • 12:18 reedy@deploy2002: Synchronized php-1.44.0-wmf.11/extensions/UploadWizard/: T383415 (duration: 13m 05s)
  • 12:17 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1008-1010,1013-1014,1017-1018].eqiad.wmnet
  • 11:58 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2203-2205].codfw.wmnet
  • 11:58 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2203-2205].codfw.wmnet
  • 11:57 jayme: re-enabling puppet on all hosts running kubelet - T383413
  • 11:53 jelto: homer 'cr*codfw*' commit 'T377877'
  • 11:53 jelto: homer 'lsw1-d1-codfw*' commit 'T377877'
  • 11:51 jelto: homer 'lsw1-c6-codfw*' commit 'T377877'
  • 11:51 jayme: disabling puppet on all hosts running kubelet - T383413
  • 11:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2205.codfw.wmnet with OS bookworm
  • 11:50 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 11:49 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 11:49 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P72004 and previous config saved to /var/cache/conftool/dbconfig/20250113-114852-root.json
  • 11:48 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 11:48 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 11:46 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 11:45 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:45 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:45 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:45 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:45 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:45 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 11:45 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 11:45 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 11:45 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:44 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 11:44 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:44 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:44 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:44 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 11:44 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 11:44 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 11:44 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 11:44 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 11:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2204.codfw.wmnet with OS bookworm
  • 11:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P72002 and previous config saved to /var/cache/conftool/dbconfig/20250113-113347-root.json
  • 11:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2205.codfw.wmnet with reason: host reimage
  • 11:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2205.codfw.wmnet with reason: host reimage
  • 11:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2203.codfw.wmnet with OS bookworm
  • 11:24 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db[2132,2160,2232].codfw.wmnet with reason: maintenance
  • 11:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 10:00:00 on db[2132,2160,2232].codfw.wmnet with reason: maintenance
  • 11:24 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2204.codfw.wmnet with reason: host reimage
  • 11:20 marostegui: Move db2160:3322 under db2232 in m2 codfw dbmaint T373579
  • 11:20 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2204.codfw.wmnet with reason: host reimage
  • 11:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2132,2160,2232].codfw.wmnet with reason: maintenance
  • 11:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2132,2160,2232].codfw.wmnet with reason: maintenance
  • 11:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P72000 and previous config saved to /var/cache/conftool/dbconfig/20250113-111842-root.json
  • 11:09 moritzm: installing pymysql security updates
  • 11:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2205
  • 11:09 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2205
  • 11:07 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2203.codfw.wmnet with reason: host reimage
  • 11:07 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2205
  • 11:07 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2205.codfw.wmnet 230.48.192.10.in-addr.arpa 0.3.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:06 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2205.codfw.wmnet 230.48.192.10.in-addr.arpa 0.3.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:06 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:06 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2205 - jelto@cumin1002"
  • 11:06 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2205 - jelto@cumin1002"
  • 11:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db[2133,2160].codfw.wmnet with reason: cloning
  • 11:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 10:00:00 on db[2133,2160].codfw.wmnet with reason: cloning
  • 11:04 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2203.codfw.wmnet with reason: host reimage
  • 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P71998 and previous config saved to /var/cache/conftool/dbconfig/20250113-110336-root.json
  • 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2123 from dbctl for decommission', diff saved to https://phabricator.wikimedia.org/P71997 and previous config saved to /var/cache/conftool/dbconfig/20250113-110333-marostegui.json
  • 11:03 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:03 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2205
  • 11:03 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2205.codfw.wmnet with OS bookworm
  • 11:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2204
  • 11:01 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2204
  • 11:01 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2204
  • 11:01 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2204.codfw.wmnet 164.32.192.10.in-addr.arpa 4.6.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:01 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2204.codfw.wmnet 164.32.192.10.in-addr.arpa 4.6.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:01 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:01 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2204 - jelto@cumin1002"
  • 11:01 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2204 - jelto@cumin1002"
  • 11:00 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov2003.codfw.wmnet with reason: os upgrade
  • 11:00 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov2003.codfw.wmnet with reason: os upgrade
  • 10:58 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:57 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2204
  • 10:57 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2204.codfw.wmnet with OS bookworm
  • 10:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2123.codfw.wmnet
  • 10:55 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:55 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2123.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 10:54 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2123.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 10:50 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 10:49 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2240.codfw.wmnet
  • 10:46 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2123.codfw.wmnet
  • 10:45 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2203
  • 10:45 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2203
  • 10:45 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2203
  • 10:45 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2203.codfw.wmnet 165.32.192.10.in-addr.arpa 5.6.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:44 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2203.codfw.wmnet 165.32.192.10.in-addr.arpa 5.6.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:44 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:44 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2203 - jelto@cumin1002"
  • 10:44 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2203 - jelto@cumin1002"
  • 10:43 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2240.codfw.wmnet
  • 10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2240', diff saved to https://phabricator.wikimedia.org/P71996 and previous config saved to /var/cache/conftool/dbconfig/20250113-104310-marostegui.json
  • 10:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2240 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P71995 and previous config saved to /var/cache/conftool/dbconfig/20250113-104250-root.json
  • 10:41 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:41 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2203
  • 10:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P71994 and previous config saved to /var/cache/conftool/dbconfig/20250113-104115-root.json
  • 10:41 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2203.codfw.wmnet with OS bookworm
  • 10:36 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2203.codfw.wmnet wikikube-worker2204.codfw.wmnet wikikube-worker2205.codfw.wmnet on all recursors
  • 10:36 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2203.codfw.wmnet wikikube-worker2204.codfw.wmnet wikikube-worker2205.codfw.wmnet on all recursors
  • 10:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on pc1015.eqiad.wmnet with reason: cloning
  • 10:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on pc1015.eqiad.wmnet with reason: cloning
  • 10:28 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 10:28 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 10:26 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 10:26 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 10:25 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 10:25 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 10:25 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2044 to wikikube-worker2205
  • 10:24 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2205
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repool pc3 T383398', diff saved to https://phabricator.wikimedia.org/P71993 and previous config saved to /var/cache/conftool/dbconfig/20250113-102343-marostegui.json
  • 10:21 marostegui@cumin1002: dbctl commit (dc=all): 'Make pc1013 master in pc3 T383398', diff saved to https://phabricator.wikimedia.org/P71992 and previous config saved to /var/cache/conftool/dbconfig/20250113-102152-marostegui.json
  • 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'Remove pc1015 from pc3', diff saved to https://phabricator.wikimedia.org/P71991 and previous config saved to /var/cache/conftool/dbconfig/20250113-102047-marostegui.json
  • 10:19 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2205
  • 10:18 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:18 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2044 to wikikube-worker2205 - jelto@cumin1002"
  • 10:18 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2044 to wikikube-worker2205 - jelto@cumin1002"
  • 10:14 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:14 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2044 to wikikube-worker2205
  • 10:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2043 to wikikube-worker2204
  • 10:13 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2204
  • 10:13 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for pc1015.eqiad.wmnet
  • 10:13 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2204
  • 10:13 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:13 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2043 to wikikube-worker2204 - jelto@cumin1002"
  • 10:12 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for pc2013.codfw.wmnet
  • 10:12 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2043 to wikikube-worker2204 - jelto@cumin1002"
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'Fix weights in pc3', diff saved to https://phabricator.wikimedia.org/P71990 and previous config saved to /var/cache/conftool/dbconfig/20250113-101132-marostegui.json
  • 10:10 ladsgroup@deploy2002: Finished scap sync-world: Backport for Add wikitech.wikimedia.org to list of local vhosts (T376305) (duration: 22m 28s)
  • 10:08 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:07 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2043 to wikikube-worker2204
  • 10:07 marostegui: Upgrade pc2013 pc1015 pc3 dbmaint eqiad codfw T383398
  • 10:07 root@cumin1002: START - Cookbook sre.mysql.upgrade for pc1015.eqiad.wmnet
  • 10:06 root@cumin1002: START - Cookbook sre.mysql.upgrade for pc2013.codfw.wmnet
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depool pc3 T383398', diff saved to https://phabricator.wikimedia.org/P71989 and previous config saved to /var/cache/conftool/dbconfig/20250113-100554-marostegui.json
  • 10:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2042 to wikikube-worker2203
  • 10:04 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2203
  • 10:04 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2203
  • 10:04 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:04 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2042 to wikikube-worker2203 - jelto@cumin1002"
  • 10:03 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2042 to wikikube-worker2203 - jelto@cumin1002"
  • 10:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on pc2013.codfw.wmnet with reason: cloning
  • 10:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on pc2013.codfw.wmnet with reason: cloning
  • 10:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on pc1013.eqiad.wmnet with reason: cloning
  • 10:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on pc1013.eqiad.wmnet with reason: cloning
  • 10:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on pc1015.eqiad.wmnet with reason: cloning
  • 10:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on pc1015.eqiad.wmnet with reason: cloning
  • 10:00 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 10:00 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:00 ladsgroup@deploy2002: ladsgroup: Backport for Add wikitech.wikimedia.org to list of local vhosts (T376305) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:59 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2042 to wikikube-worker2203
  • 09:56 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and not P{cp4052.ulsfo.wmnet} and A:cp
  • 09:55 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[2042-2044].codfw.wmnet
  • 09:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[2042-2044].codfw.wmnet
  • 09:49 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Reboot
  • 09:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es1023.eqiad.wmnet with reason: cloning
  • 09:49 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Reboot
  • 09:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es1023.eqiad.wmnet with reason: cloning
  • 09:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1023', diff saved to https://phabricator.wikimedia.org/P71988 and previous config saved to /var/cache/conftool/dbconfig/20250113-094846-marostegui.json
  • 09:48 marostegui@cumin1002: dbctl commit (dc=all): 'Switchover es5 eqiad master dbmaint T382569', diff saved to https://phabricator.wikimedia.org/P71987 and previous config saved to /var/cache/conftool/dbconfig/20250113-094833-marostegui.json
  • 09:48 ladsgroup@deploy2002: Started scap sync-world: Backport for Add wikitech.wikimedia.org to list of local vhosts (T376305)
  • 09:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2212.codfw.wmnet
  • 09:42 ladsgroup@cumin1002: START - Cookbook sre.mysql.upgrade for db2212.codfw.wmnet
  • 09:41 Amir1: dbmaint on pc5@eqiad (T382948)
  • 09:38 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and not P{cp4052.ulsfo.wmnet} and A:cp
  • 09:36 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and not P{cp4044.ulsfo.wmnet} and A:cp
  • 09:17 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and not P{cp4044.ulsfo.wmnet} and A:cp
  • 08:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2135.codfw.wmnet
  • 08:57 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:57 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2135.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 08:57 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2135.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 08:53 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 08:49 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2135.codfw.wmnet
  • 08:34 hashar@deploy2002: Finished deploy [integration/docroot@a81d82c]: build: Updating mediawiki/mediawiki-phan-config to 0.15.1 (duration: 00m 09s)
  • 08:34 hashar@deploy2002: Started deploy [integration/docroot@a81d82c]: build: Updating mediawiki/mediawiki-phan-config to 0.15.1
  • 08:27 moritzm: updated netboot image for bookworm to 12.9 T383537

2025-01-12

  • 09:13 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.

2025-01-11

  • 23:02 fabfur@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2212.codfw.wmnet with reason: Replication lag
  • 23:02 fabfur@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2212.codfw.wmnet with reason: Replication lag
  • 23:02 fabfur@cumin1002: dbctl commit (dc=all): 'Depool db2212', diff saved to https://phabricator.wikimedia.org/P71985 and previous config saved to /var/cache/conftool/dbconfig/20250111-230213-fabfur.json
  • 01:39 eileen: config revision changed from fba21538 to b41ed54d
  • 00:06 eileen: config revision changed from f86e46bb to b1f34373

2025-01-10

  • 23:33 eileen: config revision changed from 4947f9bd to f86e46bb
  • 23:09 eileen: config revision changed from 5d411dbb to 4947f9bd
  • 22:59 eileen: config revision changed from 51a3e52e to 5d411dbb
  • 22:49 eileen: config revision changed from cf756e5f to 51a3e52e
  • 22:45 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic1005*,cloudelastic1006* for ban hosts prior to decom - bking@cumin2002 - T380937
  • 22:45 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1005*,cloudelastic1006* for ban hosts prior to decom - bking@cumin2002 - T380937
  • 22:45 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1005,cloudelastic1006 for ban hosts prior to decom - bking@cumin2002 - T380937
  • 22:45 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1005,cloudelastic1006 for ban hosts prior to decom - bking@cumin2002 - T380937
  • 22:21 eileen: config revision changed from 2a572b99 to cf756e5f
  • 22:11 eileen: config revision changed from e0866d2f to 2a572b99
  • 20:01 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 19:43 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 19:41 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 19:26 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on eventlog1003.eqiad.wmnet with reason: Shutting down VM in preparation for decommissioning
  • 19:26 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on eventlog1003.eqiad.wmnet with reason: Shutting down VM in preparation for decommissioning
  • 19:23 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Upgrading to Cassandra 4.1.7 — T380420 - eevans@cumin1002
  • 18:49 sukhe: sudo cumin 'P:Mediawiki::Maintenance' 'run-puppet-agent': CR 1109755
  • 16:56 cmooney@dns2005: END - running authdns-update
  • 16:55 cmooney@dns2005: START - running authdns-update
  • 16:50 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:50 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names in newly assigned wmcs private ipv6 ranges - cmooney@cumin1002"
  • 16:50 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names in newly assigned wmcs private ipv6 ranges - cmooney@cumin1002"
  • 16:47 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 16:28 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:28 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names in newly assigned wmcs private ipv6 ranges - cmooney@cumin1002"
  • 16:28 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
  • 16:28 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names in newly assigned wmcs private ipv6 ranges - cmooney@cumin1002"
  • 16:27 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
  • 16:26 eevans@deploy2002: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
  • 16:25 eevans@deploy2002: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
  • 16:23 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 16:23 cmooney@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 16:19 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 15:46 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
  • 15:45 eevans@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
  • 15:36 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2199-2202].codfw.wmnet
  • 15:36 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2199-2202].codfw.wmnet
  • 15:34 jelto: homer 'cr*codfw*' commit 'T377877'
  • 15:33 jelto: homer 'lsw1-d1-codfw*' commit 'T377877'
  • 15:33 jelto: homer 'lsw1-d5-codfw*' commit 'T377877'
  • 15:32 jelto: homer 'lsw1-d3-codfw*' commit 'T377877'
  • 15:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2202.codfw.wmnet with OS bookworm
  • 15:25 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2201.codfw.wmnet with OS bookworm
  • 15:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2240.codfw.wmnet with reason: maintenance
  • 15:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 10:00:00 on db2240.codfw.wmnet with reason: maintenance
  • 15:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2240.codfw.wmnet with reason: maintenance
  • 15:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2240.codfw.wmnet with reason: maintenance
  • 15:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2240 to make it candidate master', diff saved to https://phabricator.wikimedia.org/P71984 and previous config saved to /var/cache/conftool/dbconfig/20250110-152035-marostegui.json
  • 15:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2202.codfw.wmnet with reason: host reimage
  • 15:08 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2202.codfw.wmnet with reason: host reimage
  • 15:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2201.codfw.wmnet with reason: host reimage
  • 15:02 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2201.codfw.wmnet with reason: host reimage
  • 14:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2202
  • 14:49 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2202
  • 14:48 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2202
  • 14:48 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2202.codfw.wmnet 226.48.192.10.in-addr.arpa 6.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:48 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2202.codfw.wmnet 226.48.192.10.in-addr.arpa 6.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:48 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:46 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:46 jelto@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:46 jelto@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2202 - jelto@cumin1002"
  • 14:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2201
  • 14:42 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2201
  • 14:42 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2201
  • 14:42 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2201.codfw.wmnet 227.48.192.10.in-addr.arpa 7.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:42 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2201.codfw.wmnet 227.48.192.10.in-addr.arpa 7.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:42 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:42 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2201 - jelto@cumin1002"
  • 14:33 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2202 - jelto@cumin1002"
  • 14:29 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:27 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2201 - jelto@cumin1002"
  • 14:26 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2202
  • 14:26 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2202.codfw.wmnet with OS bookworm
  • 14:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2200.codfw.wmnet with OS bookworm
  • 14:24 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:24 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2201
  • 14:23 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2201.codfw.wmnet with OS bookworm
  • 14:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2199.codfw.wmnet with OS bookworm
  • 14:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2200.codfw.wmnet with reason: host reimage
  • 14:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2199.codfw.wmnet with reason: host reimage
  • 14:02 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2200.codfw.wmnet with reason: host reimage
  • 13:59 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2199.codfw.wmnet with reason: host reimage
  • 13:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: maintenance
  • 13:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: maintenance
  • 13:56 aqu@deploy2002: Finished deploy [airflow-dags/analytics@7a1a552]: Backfill 2024 12: cassandra_load_pageview_per_article (duration: 01m 19s)
  • 13:54 aqu@deploy2002: Started deploy [airflow-dags/analytics@7a1a552]: Backfill 2024 12: cassandra_load_pageview_per_article
  • 13:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1170.eqiad.wmnet with reason: maintenance
  • 13:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1170.eqiad.wmnet with reason: maintenance
  • 13:49 jynus@cumin1002: dbctl commit (dc=all): 'depool db1170', diff saved to https://phabricator.wikimedia.org/P71983 and previous config saved to /var/cache/conftool/dbconfig/20250110-134954-jynus.json
  • 13:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2200
  • 13:43 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2200
  • 13:43 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2200
  • 13:43 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2200.codfw.wmnet 228.48.192.10.in-addr.arpa 8.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:43 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2200.codfw.wmnet 228.48.192.10.in-addr.arpa 8.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:43 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:40 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2199
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2199
  • 13:40 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2199
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2199.codfw.wmnet 229.48.192.10.in-addr.arpa 9.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:40 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2199.codfw.wmnet 229.48.192.10.in-addr.arpa 9.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2199 - jelto@cumin1002"
  • 13:40 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2199 - jelto@cumin1002"
  • 13:38 marostegui: Move pc2013 to pc4 dbmaint codfw - T383398
  • 13:36 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:35 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2200
  • 13:35 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2199
  • 13:35 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2200.codfw.wmnet with OS bookworm
  • 13:34 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2199.codfw.wmnet with OS bookworm
  • 13:32 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2199.codfw.wmnet wikikube-worker2200.codfw.wmnet wikikube-worker2201.codfw.wmnet wikikube-worker2202.codfw.wmnet on all recursors
  • 13:32 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2199.codfw.wmnet wikikube-worker2200.codfw.wmnet wikikube-worker2201.codfw.wmnet wikikube-worker2202.codfw.wmnet on all recursors
  • 13:30 marostegui: Move pc1013 to pc3 dbmaint eqiad - T383398
  • 13:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2048 to wikikube-worker2202
  • 13:29 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2202
  • 13:29 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2202
  • 13:29 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:29 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2048 to wikikube-worker2202 - jelto@cumin1002"
  • 13:28 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2048 to wikikube-worker2202 - jelto@cumin1002"
  • 13:25 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:25 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2048 to wikikube-worker2202
  • 13:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2047 to wikikube-worker2201
  • 13:23 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2201
  • 13:22 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2201
  • 13:22 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:22 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2047 to wikikube-worker2201 - jelto@cumin1002"
  • 13:21 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2047 to wikikube-worker2201 - jelto@cumin1002"
  • 13:18 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:17 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2047 to wikikube-worker2201
  • 13:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2046 to wikikube-worker2200
  • 13:16 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2200
  • 13:15 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2200
  • 13:15 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:14 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2046 to wikikube-worker2200 - jelto@cumin1002"
  • 13:10 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:10 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2046 to wikikube-worker2200
  • 13:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2045 to wikikube-worker2199
  • 13:09 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2199
  • 13:08 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2199
  • 13:08 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:08 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2045 to wikikube-worker2199 - jelto@cumin1002"
  • 13:08 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2045 to wikikube-worker2199 - jelto@cumin1002"
  • 13:05 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 13:04 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2045 to wikikube-worker2199
  • 13:00 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[2045-2048].codfw.wmnet
  • 12:58 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[2045-2048].codfw.wmnet
  • 12:44 marostegui@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P71982 and previous config saved to /var/cache/conftool/dbconfig/20250110-124438-root.json
  • 12:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2126.codfw.wmnet
  • 12:30 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:30 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2126.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 12:30 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2126.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 12:29 marostegui@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P71981 and previous config saved to /var/cache/conftool/dbconfig/20250110-122933-root.json
  • 12:27 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 12:22 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2126.codfw.wmnet
  • 12:22 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2126 from dbctl T383395', diff saved to https://phabricator.wikimedia.org/P71980 and previous config saved to /var/cache/conftool/dbconfig/20250110-122206-marostegui.json
  • 12:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2126 T383395', diff saved to https://phabricator.wikimedia.org/P71979 and previous config saved to /var/cache/conftool/dbconfig/20250110-121657-marostegui.json
  • 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P71978 and previous config saved to /var/cache/conftool/dbconfig/20250110-121427-root.json
  • 12:07 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P71977 and previous config saved to /var/cache/conftool/dbconfig/20250110-115922-root.json
  • 11:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2195-2198].codfw.wmnet
  • 11:54 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2195-2198].codfw.wmnet
  • 11:51 jelto: homer 'cr*codfw*' commit 'T377877'
  • 11:51 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 11:50 jelto: homer 'lsw1-d6-codfw*' commit 'T377877'
  • 11:50 jelto: homer 'lsw1-d8-codfw*' commit 'T377877'
  • 11:49 jelto: homer 'lsw1-d5-codfw*' commit 'T377877'
  • 11:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2197.codfw.wmnet with OS bookworm
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P71976 and previous config saved to /var/cache/conftool/dbconfig/20250110-114417-root.json
  • 11:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2198.codfw.wmnet with OS bookworm
  • 11:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2197.codfw.wmnet with reason: host reimage
  • 11:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2198.codfw.wmnet with reason: host reimage
  • 11:21 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2197.codfw.wmnet with reason: host reimage
  • 11:19 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2198.codfw.wmnet with reason: host reimage
  • 11:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P71975 and previous config saved to /var/cache/conftool/dbconfig/20250110-110643-root.json
  • 11:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P71974 and previous config saved to /var/cache/conftool/dbconfig/20250110-110633-root.json
  • 11:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2197
  • 11:02 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2197
  • 11:01 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2197
  • 11:01 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2197.codfw.wmnet 223.48.192.10.in-addr.arpa 3.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:01 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2197.codfw.wmnet 223.48.192.10.in-addr.arpa 3.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:01 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2198
  • 10:59 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2198
  • 10:59 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:58 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2198
  • 10:58 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2198.codfw.wmnet 222.48.192.10.in-addr.arpa 2.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:58 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2198.codfw.wmnet 222.48.192.10.in-addr.arpa 2.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:58 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:58 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2198 - jelto@cumin1002"
  • 10:58 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2198 - jelto@cumin1002"
  • 10:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2192.codfw.wmnet with reason: maintenance
  • 10:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2192.codfw.wmnet with reason: maintenance
  • 10:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2192 to change binlog format', diff saved to https://phabricator.wikimedia.org/P71973 and previous config saved to /var/cache/conftool/dbconfig/20250110-105514-marostegui.json
  • 10:55 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:54 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2198
  • 10:54 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2197
  • 10:54 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2198.codfw.wmnet with OS bookworm
  • 10:54 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2197.codfw.wmnet with OS bookworm
  • 10:52 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P71972 and previous config saved to /var/cache/conftool/dbconfig/20250110-105202-root.json
  • 10:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2195.codfw.wmnet with OS bookworm
  • 10:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P71971 and previous config saved to /var/cache/conftool/dbconfig/20250110-105137-root.json
  • 10:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P71970 and previous config saved to /var/cache/conftool/dbconfig/20250110-105127-root.json
  • 10:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2196.codfw.wmnet with OS bookworm
  • 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2123 T383388', diff saved to https://phabricator.wikimedia.org/P71969 and previous config saved to /var/cache/conftool/dbconfig/20250110-104739-marostegui.json
  • 10:36 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P71968 and previous config saved to /var/cache/conftool/dbconfig/20250110-103657-root.json
  • 10:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P71967 and previous config saved to /var/cache/conftool/dbconfig/20250110-103632-root.json
  • 10:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P71966 and previous config saved to /var/cache/conftool/dbconfig/20250110-103622-root.json
  • 10:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2195.codfw.wmnet with reason: host reimage
  • 10:30 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1093-1095].eqiad.wmnet
  • 10:30 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1093-1095].eqiad.wmnet
  • 10:28 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2196.codfw.wmnet with reason: host reimage
  • 10:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2195.codfw.wmnet with reason: host reimage
  • 10:25 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2196.codfw.wmnet with reason: host reimage
  • 10:21 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P71965 and previous config saved to /var/cache/conftool/dbconfig/20250110-102152-root.json
  • 10:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P71964 and previous config saved to /var/cache/conftool/dbconfig/20250110-102126-root.json
  • 10:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P71963 and previous config saved to /var/cache/conftool/dbconfig/20250110-102116-root.json
  • 10:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbproxy1022.eqiad.wmnet with reason: maintenance
  • 10:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbproxy1022.eqiad.wmnet with reason: maintenance
  • 10:07 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2195
  • 10:07 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2195
  • 10:07 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2195
  • 10:07 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2195.codfw.wmnet 225.48.192.10.in-addr.arpa 5.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:07 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2195.codfw.wmnet 225.48.192.10.in-addr.arpa 5.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:07 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P71962 and previous config saved to /var/cache/conftool/dbconfig/20250110-100646-root.json
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P71961 and previous config saved to /var/cache/conftool/dbconfig/20250110-100621-root.json
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P71960 and previous config saved to /var/cache/conftool/dbconfig/20250110-100611-root.json
  • 10:06 elukey: restart dump_cloud_ip_ranges on puppetserver1001 - unit failed due to errors while fetching new data from upstream, trying to see if it was a temporary issue
  • 10:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2196
  • 10:05 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2196
  • 10:05 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2196
  • 10:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2196.codfw.wmnet 224.48.192.10.in-addr.arpa 4.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:05 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:05 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2196.codfw.wmnet 224.48.192.10.in-addr.arpa 4.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:05 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2196 - jelto@cumin1002"
  • 10:05 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2196 - jelto@cumin1002"
  • 10:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2128,2186,2228].codfw.wmnet with reason: maintenance
  • 10:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2128,2186,2228].codfw.wmnet with reason: maintenance
  • 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2128 db2228 T373579', diff saved to https://phabricator.wikimedia.org/P71959 and previous config saved to /var/cache/conftool/dbconfig/20250110-100248-marostegui.json
  • 10:02 elukey: elukey@cumin1002:~$ sudo cumin -b 20 'an-worker*' 'apt-get clean' (safety to free space and avoid issues on hadoop) - T383320
  • 10:01 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:00 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2196
  • 10:00 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2195
  • 10:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2196.codfw.wmnet with OS bookworm
  • 10:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2195.codfw.wmnet with OS bookworm
  • 09:57 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2195.codfw.wmnet wikikube-worker2196.codfw.wmnet wikikube-worker2197.codfw.wmnet wikikube-worker2198.codfw.wmnet on all recursors
  • 09:57 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2195.codfw.wmnet wikikube-worker2196.codfw.wmnet wikikube-worker2197.codfw.wmnet wikikube-worker2198.codfw.wmnet on all recursors
  • 09:57 elukey: kill hanging jupyterhub process on stat1009 to allow puppet to run and delete a user
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2052 to wikikube-worker2198
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2198
  • 09:56 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2198
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:56 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2052 to wikikube-worker2198 - jelto@cumin1002"
  • 09:55 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2052 to wikikube-worker2198 - jelto@cumin1002"
  • 09:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db[2132,2160,2232].codfw.wmnet with reason: maintenance
  • 09:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db[2132,2160,2232].codfw.wmnet with reason: maintenance
  • 09:52 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:51 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2052 to wikikube-worker2198
  • 09:51 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P71957 and previous config saved to /var/cache/conftool/dbconfig/20250110-095141-root.json
  • 09:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2051 to wikikube-worker2197
  • 09:50 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2197
  • 09:50 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2197
  • 09:50 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:50 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2051 to wikikube-worker2197 - jelto@cumin1002"
  • 09:49 elukey: elukey@cumin1002:~$ sudo cumin 'an-worker11[39,15,54,90,75,57,89,18,06,24]*' 'apt-get clean'
  • 09:49 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2051 to wikikube-worker2197 - jelto@cumin1002"
  • 09:46 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:45 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2051 to wikikube-worker2197
  • 09:45 elukey: elukey@cumin1002:~$ sudo cumin 'an-worker11[16,43,19,47,56,72,69]*' 'apt-get clean'
  • 09:45 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2050 to wikikube-worker2196
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2196
  • 09:44 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2196
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:44 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2050 to wikikube-worker2196 - jelto@cumin1002"
  • 09:43 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2050 to wikikube-worker2196 - jelto@cumin1002"
  • 09:40 elukey: `apt-get clean` on an-worker1147 to free space on the root partition
  • 09:39 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:39 elukey: `apt-get clean` on an-worker1117 to free space on the root partition
  • 09:39 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2050 to wikikube-worker2196
  • 09:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2049 to wikikube-worker2195
  • 09:38 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2195
  • 09:37 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2195
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:37 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2049 to wikikube-worker2195 - jelto@cumin1002"
  • 09:36 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2049 to wikikube-worker2195 - jelto@cumin1002"
  • 09:33 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 09:32 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2049 to wikikube-worker2195
  • 09:28 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[2049-2052].codfw.wmnet
  • 09:25 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[2049-2052].codfw.wmnet
  • 09:21 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2022.codfw.wmnet
  • 09:21 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2022.codfw.wmnet
  • 09:19 jelto: homer 'lsw1-c6-codfw*' commit 'T377877'
  • 09:00 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 08:39 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1243.eqiad.wmnet
  • 08:39 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1243.eqiad.wmnet
  • 08:36 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1081.eqiad.wmnet
  • 08:36 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1081.eqiad.wmnet
  • 08:34 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1073.eqiad.wmnet
  • 08:34 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1073.eqiad.wmnet
  • 08:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1069.eqiad.wmnet
  • 08:31 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1069.eqiad.wmnet
  • 08:25 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1057.eqiad.wmnet
  • 08:25 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1057.eqiad.wmnet
  • 08:21 jelto: homer 'lsw1-e3-eqiad*' commit 'T377876'
  • 08:15 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 05:02 eileen: civicrm upgraded from b357a6fd to 1ea537d3
  • 02:36 eileen: civicrm upgraded from e8326943 to b357a6fd
  • 01:32 eileen: config revision changed from fc1c1a6b to e0866d2f
  • 01:07 eileen: civicrm upgraded from 24f3b57f to e8326943
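
Cookbook entries in this log share a regular shape: a `START - Cookbook <name> <args>` line, later matched by an `END (<STATUS>) - Cookbook <name> (exit_code=<n>) <args>` line. The following is a minimal sketch, not part of any Wikimedia tooling, of how such lines could be parsed to list runs that did not end with PASS. The regular expression, the `CookbookEntry` type, and the function names are all hypothetical and based only on the entry format visible on this page; the sample lines are copied from the 2025-01-09 entries.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Hypothetical pattern inferred from the entries on this page, e.g.
#   "15:09 kamila@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)"
ENTRY_RE = re.compile(
    r"^\s*(?:•\s*)?(?P<time>\d{2}:\d{2}) (?P<who>\S+?): "
    r"(?P<phase>START|END)(?: \((?P<status>[A-Z]+)\))? - "
    r"Cookbook (?P<cookbook>\S+)"
    r"(?: \(exit_code=(?P<exit_code>\d+)\))?"
    r"(?P<rest>.*)$"
)


class CookbookEntry(NamedTuple):
    time: str
    who: str
    phase: str                 # "START" or "END"
    status: Optional[str]      # e.g. "PASS", "FAIL", "ERROR"; None on START lines
    cookbook: str
    exit_code: Optional[int]   # None on START lines
    args: str                  # free-text remainder of the entry


def parse_entry(line: str) -> Optional[CookbookEntry]:
    """Return a CookbookEntry, or None if the line is not a cookbook entry."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    code = m.group("exit_code")
    return CookbookEntry(
        time=m.group("time"),
        who=m.group("who"),
        phase=m.group("phase"),
        status=m.group("status"),
        cookbook=m.group("cookbook"),
        exit_code=int(code) if code is not None else None,
        args=m.group("rest").strip(),
    )


def failed_runs(lines: Iterable[str]) -> List[CookbookEntry]:
    """Collect END entries whose status is anything other than PASS."""
    entries = (parse_entry(line) for line in lines)
    return [e for e in entries if e and e.phase == "END" and e.status != "PASS"]


# Sample lines copied from the 2025-01-09 section of this log.
SAMPLE = [
    "  • 15:21 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1457 to wikikube-worker1093",
    "  • 15:19 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2192.codfw.wmnet with OS bookworm",
    "  • 15:09 kamila@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)",
    "  • 15:06 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1457 to wikikube-worker1093",
    "  • 15:06 Lucas_WMDE: UTC afternoon backport+config window done",
]

if __name__ == "__main__":
    for entry in failed_runs(SAMPLE):
        print(entry.time, entry.who, entry.cookbook, entry.status, entry.exit_code)
```

The sketch deliberately ignores non-cookbook entries (helmfile, scap, dbctl, free-text notes) and does not try to interpret exit codes; it only reports the status and code as logged.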

2025-01-09

  • 23:00 inflatador: bking@pcc-db1002.puppet-diffs.eqiad1.wikimedia.cloud sudo -u jenkins-deploy /usr/local/sbin/pcc_facts_processor T378368
  • 23:00 inflatador: bking@puppetserver1001:~$ sudo /usr/local/sbin/puppet-facts-upload --proxy http://webproxy.eqiad.wmnet:8080 T378368
  • 21:44 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Sync cloudelastic1011 status change after Netbox update - bking@cumin2002 - T378368"
  • 21:43 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Sync cloudelastic1011 status change after Netbox update - bking@cumin2002 - T378368"
  • 20:30 aqu@deploy2002: Finished deploy [airflow-dags/analytics@0e4370e]: Canary event fix (duration: 01m 23s)
  • 20:29 aqu@deploy2002: Started deploy [airflow-dags/analytics@0e4370e]: Canary event fix
  • 20:06 dcausse@deploy2002: Finished deploy [airflow-dags/search@718e870]: search: switch query_clicks to SparkSqlOperator (duration: 00m 27s)
  • 20:05 dcausse@deploy2002: Started deploy [airflow-dags/search@718e870]: search: switch query_clicks to SparkSqlOperator
  • 19:54 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 19:54 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 19:49 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 19:34 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.44.0-wmf.11 refs T382362
  • 18:49 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: apply
  • 18:46 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: apply
  • 18:46 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-wikifunctions: apply
  • 18:45 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/mw-wikifunctions: apply
  • 18:45 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 18:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1095.eqiad.wmnet with OS bookworm
  • 18:44 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 18:44 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 18:42 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 18:42 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 18:41 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 18:41 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 18:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1094.eqiad.wmnet with OS bookworm
  • 18:39 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 18:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 18:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1093.eqiad.wmnet with OS bookworm
  • 18:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 18:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 18:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 18:27 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 18:26 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 18:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1095.eqiad.wmnet with reason: host reimage
  • 18:24 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 18:23 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 18:20 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1094.eqiad.wmnet with reason: host reimage
  • 18:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1093.eqiad.wmnet with reason: host reimage
  • 18:14 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1094.eqiad.wmnet with reason: host reimage
  • 18:13 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1095.eqiad.wmnet with reason: host reimage
  • 18:13 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1093.eqiad.wmnet with reason: host reimage
  • 17:58 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1095
  • 17:58 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1095
  • 17:58 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1095.eqiad.wmnet with OS bookworm
  • 17:58 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1094
  • 17:58 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1094
  • 17:58 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1094.eqiad.wmnet with OS bookworm
  • 17:58 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1093
  • 17:58 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1093
  • 17:58 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1093.eqiad.wmnet with OS bookworm
  • 17:55 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1095.eqiad.wmnet with OS bookworm
  • 17:55 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1094.eqiad.wmnet with OS bookworm
  • 17:55 kamila@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1093.eqiad.wmnet with OS bookworm
  • 17:50 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: apply
  • 17:49 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: apply
  • 17:49 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-wikifunctions: apply
  • 17:48 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/mw-wikifunctions: apply
  • 17:48 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 17:47 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 17:47 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 17:45 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 17:45 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 17:44 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 17:44 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 17:42 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 17:42 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 17:41 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 17:41 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 17:41 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 17:39 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 17:39 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 17:38 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 17:38 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 17:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1095
  • 17:37 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1095
  • 17:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1095.eqiad.wmnet with OS bookworm
  • 17:37 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 17:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1094
  • 17:37 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1094
  • 17:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1094.eqiad.wmnet with OS bookworm
  • 17:26 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 17:25 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 17:25 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 17:22 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 17:22 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 17:21 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 17:21 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 17:21 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1093
  • 17:20 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1093
  • 17:20 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1093.eqiad.wmnet with OS bookworm
  • 17:20 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2022.codfw.wmnet with reason: host reimage
  • 17:19 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 17:19 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 17:18 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 17:18 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 17:17 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 17:17 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: apply
  • 17:16 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2022.codfw.wmnet with reason: host reimage
  • 17:16 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: apply
  • 17:16 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-wikifunctions: apply
  • 17:15 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/mw-wikifunctions: apply
  • 17:02 cdanis@deploy2002: Finished scap sync-world: Backport for group1: enable OpenTelemetry exports (T340552) (duration: 14m 22s)
  • 16:55 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 16:55 elukey@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 16:55 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 16:54 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 16:54 elukey@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 16:54 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 16:54 elukey@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2022
  • 16:54 elukey@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2022
  • 16:54 elukey@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2022.codfw.wmnet 212.32.192.10.in-addr.arpa 2.1.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:54 elukey@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2022.codfw.wmnet 212.32.192.10.in-addr.arpa 2.1.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:54 elukey@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:54 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2022 - elukey@cumin1002"
  • 16:54 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2022 - elukey@cumin1002"
  • 16:53 cdanis@deploy2002: cdanis: Continuing with sync
  • 16:53 cdanis@deploy2002: cdanis: Backport for group1: enable OpenTelemetry exports (T340552) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:50 elukey@cumin1002: START - Cookbook sre.dns.netbox
  • 16:50 elukey@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 16:48 cdanis@deploy2002: Started scap sync-world: Backport for group1: enable OpenTelemetry exports (T340552)
  • 16:40 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.11 refs T382362
  • 16:33 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1243.eqiad.wmnet with OS bookworm
  • 16:33 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:29 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 16:28 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 16:27 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:21 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1095.eqiad.wmnet with OS bookworm
  • 16:21 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1094.eqiad.wmnet with OS bookworm
  • 16:21 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1093.eqiad.wmnet with OS bookworm
  • 16:21 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 16:21 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1069.eqiad.wmnet with OS bookworm
  • 16:21 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:20 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 16:12 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:08 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1243.eqiad.wmnet with reason: host reimage
  • 16:05 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1243.eqiad.wmnet with reason: host reimage
  • 15:56 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 15:56 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:55 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:54 sukhe@dns1004:: END - running authdns-update
  • 15:53 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1069.eqiad.wmnet with reason: host reimage
  • 15:52 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 15:52 sukhe@dns1004:: START - running authdns-update
  • 15:52 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 15:49 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1069.eqiad.wmnet with reason: host reimage
  • 15:46 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1243.eqiad.wmnet with OS bookworm
  • 15:46 inflatador: bking@an-airflow1005 stopping airflow-search services as part of k8s migration T380615
  • 15:46 tchin@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
  • 15:45 tchin@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
  • 15:39 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1073.eqiad.wmnet with OS bookworm
  • 15:39 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:39 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:37 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1057.eqiad.wmnet with reason: host reimage
  • 15:33 tchin@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
  • 15:33 tchin@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
  • 15:33 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1057.eqiad.wmnet with reason: host reimage
  • 15:31 sukhe@dns1004:: END - running authdns-update
  • 15:31 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1069.eqiad.wmnet with OS bookworm
  • 15:30 sukhe@dns1004:: START - running authdns-update
  • 15:28 logmsgbot: testing update from dns host
  • 15:27 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2193-2194].codfw.wmnet
  • 15:27 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2193-2194].codfw.wmnet
  • 15:25 jelto: homer 'cr*codfw*' commit 'T377877'
  • 15:24 jelto: homer 'lsw1-c5-codfw*' commit 'T377877'
  • 15:23 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1095
  • 15:23 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1095
  • 15:23 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1095.eqiad.wmnet with OS bookworm
  • 15:23 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1094
  • 15:23 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1094
  • 15:23 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1094.eqiad.wmnet with OS bookworm
  • 15:23 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1093
  • 15:23 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1093
  • 15:23 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1093.eqiad.wmnet with OS bookworm
  • 15:22 tchin@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
  • 15:22 tchin@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
  • 15:21 jelto: homer 'lsw1-d3-codfw*' commit 'T377877'
  • 15:21 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1093.eqiad.wmnet wikikube-worker1094.eqiad.wmnet wikikube-worker1095.eqiad.wmnet on all recursors
  • 15:21 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1093.eqiad.wmnet wikikube-worker1094.eqiad.wmnet wikikube-worker1095.eqiad.wmnet on all recursors
  • 15:21 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1457 to wikikube-worker1093
  • 15:20 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1093
  • 15:20 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1073.eqiad.wmnet with reason: host reimage
  • 15:19 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2192.codfw.wmnet with OS bookworm
  • 15:19 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1093
  • 15:18 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:18 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1459 to wikikube-worker1095
  • 15:17 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1095
  • 15:16 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1073.eqiad.wmnet with reason: host reimage
  • 15:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1458 to wikikube-worker1094
  • 15:16 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:16 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1095
  • 15:16 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:15 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1094
  • 15:15 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1057.eqiad.wmnet with OS bookworm
  • 15:14 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1094
  • 15:14 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:14 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1458 to wikikube-worker1094 - kamila@cumin1002"
  • 15:14 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:12 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1458 to wikikube-worker1094 - kamila@cumin1002"
  • 15:09 tchin@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
  • 15:09 tchin@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
  • 15:09 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:09 kamila@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 15:08 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1459 to wikikube-worker1095
  • 15:07 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1458 to wikikube-worker1094
  • 15:06 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:06 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1457 to wikikube-worker1093
  • 15:06 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:04 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Enable Translate message bundle Scribunto library on MetaWiki (T379892) (duration: 32m 53s)
  • 14:58 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1073.eqiad.wmnet with OS bookworm
  • 14:56 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2194.codfw.wmnet with OS bookworm
  • 14:54 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1457-1459].eqiad.wmnet
  • 14:54 lucaswerkmeister-wmde@deploy2002: abi, lucaswerkmeister-wmde: Continuing with sync
  • 14:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2193.codfw.wmnet with OS bookworm
  • 14:52 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1457-1459].eqiad.wmnet
  • 14:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2192
  • 14:41 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2192
  • 14:41 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2192.codfw.wmnet with OS bookworm
  • 14:38 lucaswerkmeister-wmde@deploy2002: abi, lucaswerkmeister-wmde: Backport for Enable Translate message bundle Scribunto library on MetaWiki (T379892) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:37 dcaro@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 14:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2194.codfw.wmnet with reason: host reimage
  • 14:33 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2193.codfw.wmnet with reason: host reimage
  • 14:32 dcaro@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 14:31 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Enable Translate message bundle Scribunto library on MetaWiki (T379892)
  • 14:31 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2194.codfw.wmnet with reason: host reimage
  • 14:29 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2193.codfw.wmnet with reason: host reimage
  • 14:24 dcaro@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 14:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2194
  • 14:12 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2194
  • 14:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2194.codfw.wmnet with OS bookworm
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2193
  • 14:11 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2193
  • 14:11 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2193.codfw.wmnet with OS bookworm
  • 14:05 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2194.codfw.wmnet with OS bookworm
  • 14:00 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2193.codfw.wmnet with OS bookworm
  • 13:58 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2192.codfw.wmnet with OS bookworm
  • 13:39 moritzm: installing jinja2 security updates
  • 13:21 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1088-1092].eqiad.wmnet
  • 13:21 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1088-1092].eqiad.wmnet
  • 13:21 aqu@deploy2002: Finished deploy [airflow-dags/analytics@9073e46]: Refine refactoring (duration: 02m 51s)
  • 13:18 aqu@deploy2002: Started deploy [airflow-dags/analytics@9073e46]: Refine refactoring
  • 13:11 moritzm: installing sqlparse security updates
  • 13:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71948 and previous config saved to /var/cache/conftool/dbconfig/20250109-130818-root.json
  • 12:54 aqu@deploy2002: deploy aborted: Refine refactoring (duration: 00m 20s)
  • 12:54 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@9073e46]: Refine refactoring
  • 12:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71947 and previous config saved to /var/cache/conftool/dbconfig/20250109-125313-root.json
  • 12:45 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2194
  • 12:45 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2194
  • 12:45 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2194
  • 12:45 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2194.codfw.wmnet 224.32.192.10.in-addr.arpa 4.2.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:45 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2194.codfw.wmnet 224.32.192.10.in-addr.arpa 4.2.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:45 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:45 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2194 - jelto@cumin1002"
  • 12:44 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2194 - jelto@cumin1002"
  • 12:43 dcaro@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1012.eqiad.wmnet with OS bullseye
  • 12:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2193
  • 12:41 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2193
  • 12:41 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:41 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2193
  • 12:41 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2193.codfw.wmnet 62.48.192.10.in-addr.arpa 2.6.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:41 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2193.codfw.wmnet 62.48.192.10.in-addr.arpa 2.6.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:41 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:41 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2193 - jelto@cumin1002"
  • 12:41 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2193 - jelto@cumin1002"
  • 12:40 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2194
  • 12:40 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2194.codfw.wmnet with OS bookworm
  • 12:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71946 and previous config saved to /var/cache/conftool/dbconfig/20250109-123806-root.json
  • 12:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2192
  • 12:38 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2192
  • 12:37 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2192
  • 12:37 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2192.codfw.wmnet 221.48.192.10.in-addr.arpa 1.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:37 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:37 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2192.codfw.wmnet 221.48.192.10.in-addr.arpa 1.2.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:37 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:37 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2192 - jelto@cumin1002"
  • 12:37 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2192 - jelto@cumin1002"
  • 12:37 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2193
  • 12:37 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2193.codfw.wmnet with OS bookworm
  • 12:34 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:34 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2192
  • 12:34 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2192.codfw.wmnet with OS bookworm
  • 12:31 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2192.codfw.wmnet wikikube-worker2193.codfw.wmnet wikikube-worker2194.codfw.wmnet on all recursors
  • 12:31 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2192.codfw.wmnet wikikube-worker2193.codfw.wmnet wikikube-worker2194.codfw.wmnet on all recursors
  • 12:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2058 to wikikube-worker2194
  • 12:30 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2194
  • 12:30 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2194
  • 12:30 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:30 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2058 to wikikube-worker2194 - jelto@cumin1002"
  • 12:29 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2058 to wikikube-worker2194 - jelto@cumin1002"
  • 12:26 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:25 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2058 to wikikube-worker2194
  • 12:25 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2056 to wikikube-worker2193
  • 12:25 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Piccardi out of all services on: 2310 hosts
  • 12:24 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2193
  • 12:24 root@cumin2002: START - Cookbook sre.idm.logout Logging Piccardi out of all services on: 2310 hosts
  • 12:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71945 and previous config saved to /var/cache/conftool/dbconfig/20250109-122301-root.json
  • 12:19 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2193
  • 12:19 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:19 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2056 to wikikube-worker2193 - jelto@cumin1002"
  • 12:18 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2056 to wikikube-worker2193 - jelto@cumin1002"
  • 12:15 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:14 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2056 to wikikube-worker2193
  • 12:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2053 to wikikube-worker2192
  • 12:13 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2192
  • 12:13 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2192
  • 12:12 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:12 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2053 to wikikube-worker2192 - jelto@cumin1002"
  • 12:12 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2053 to wikikube-worker2192 - jelto@cumin1002"
  • 12:08 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71944 and previous config saved to /var/cache/conftool/dbconfig/20250109-120755-root.json
  • 12:07 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:06 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:05 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:05 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:05 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2053 to wikikube-worker2192
  • 12:04 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:03 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:58 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
  • 11:57 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
  • 11:54 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 11:53 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 11:53 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 11:53 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 11:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 9%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71943 and previous config saved to /var/cache/conftool/dbconfig/20250109-115250-root.json
  • 11:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 8%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71942 and previous config saved to /var/cache/conftool/dbconfig/20250109-113744-root.json
  • 11:29 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 11:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 100%: Repooling after moving sanitarium', diff saved to https://phabricator.wikimedia.org/P71940 and previous config saved to /var/cache/conftool/dbconfig/20250109-112543-root.json
  • 11:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 100%: Repooling after moving sanitarium', diff saved to https://phabricator.wikimedia.org/P71939 and previous config saved to /var/cache/conftool/dbconfig/20250109-112534-root.json
  • 11:24 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 11:24 elukey@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 11:23 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 11:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 7%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71938 and previous config saved to /var/cache/conftool/dbconfig/20250109-112239-root.json
  • 11:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 75%: Repooling after moving sanitarium', diff saved to https://phabricator.wikimedia.org/P71937 and previous config saved to /var/cache/conftool/dbconfig/20250109-111038-root.json
  • 11:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 75%: Repooling after moving sanitarium', diff saved to https://phabricator.wikimedia.org/P71936 and previous config saved to /var/cache/conftool/dbconfig/20250109-111029-root.json
  • 11:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 6%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71935 and previous config saved to /var/cache/conftool/dbconfig/20250109-110734-root.json
  • 10:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repool pc4', diff saved to https://phabricator.wikimedia.org/P71934 and previous config saved to /var/cache/conftool/dbconfig/20250109-105708-ladsgroup.json
  • 10:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 50%: Repooling after moving sanitarium', diff saved to https://phabricator.wikimedia.org/P71933 and previous config saved to /var/cache/conftool/dbconfig/20250109-105533-root.json
  • 10:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 50%: Repooling after moving sanitarium', diff saved to https://phabricator.wikimedia.org/P71932 and previous config saved to /var/cache/conftool/dbconfig/20250109-105523-root.json
  • 10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71931 and previous config saved to /var/cache/conftool/dbconfig/20250109-105228-root.json
  • 10:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for pc2015.codfw.wmnet
  • 10:45 root@cumin2002: END (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for dbprov2004.codfw.wmnet: Renew puppet certificate - root@cumin2002
  • 10:45 root@cumin2002: START - Cookbook sre.puppet.renew-cert for dbprov2004.codfw.wmnet: Renew puppet certificate - root@cumin2002
  • 10:43 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1016.eqiad.wmnet with reason: Reboot
  • 10:43 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on pc1016.eqiad.wmnet with reason: Reboot
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P71930 and previous config saved to /var/cache/conftool/dbconfig/20250109-104043-root.json
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 25%: Repooling after moving sanitarium', diff saved to https://phabricator.wikimedia.org/P71929 and previous config saved to /var/cache/conftool/dbconfig/20250109-104027-root.json
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 25%: Repooling after moving sanitarium', diff saved to https://phabricator.wikimedia.org/P71928 and previous config saved to /var/cache/conftool/dbconfig/20250109-104017-root.json
  • 10:40 ladsgroup@cumin1002: START - Cookbook sre.mysql.upgrade for pc2015.codfw.wmnet
  • 10:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for pc1016.eqiad.wmnet
  • 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71927 and previous config saved to /var/cache/conftool/dbconfig/20250109-103723-root.json
  • 10:34 ladsgroup@cumin1002: START - Cookbook sre.mysql.upgrade for pc1016.eqiad.wmnet
  • 10:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling pc4', diff saved to https://phabricator.wikimedia.org/P71926 and previous config saved to /var/cache/conftool/dbconfig/20250109-102700-ladsgroup.json
  • 10:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P71925 and previous config saved to /var/cache/conftool/dbconfig/20250109-102538-root.json
  • 10:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 10%: Repooling after moving sanitarium', diff saved to https://phabricator.wikimedia.org/P71924 and previous config saved to /var/cache/conftool/dbconfig/20250109-102522-root.json
  • 10:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 10%: Repooling after moving sanitarium', diff saved to https://phabricator.wikimedia.org/P71923 and previous config saved to /var/cache/conftool/dbconfig/20250109-102512-root.json
  • 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71922 and previous config saved to /var/cache/conftool/dbconfig/20250109-102218-root.json
  • 10:21 marostegui: Move db2187:3312 under db2226 s2 codfw dbmaint T373579
  • 10:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2126,2187,2226].codfw.wmnet with reason: maintenance
  • 10:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2126,2187,2226].codfw.wmnet with reason: maintenance
  • 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2126 db2226 T373579', diff saved to https://phabricator.wikimedia.org/P71921 and previous config saved to /var/cache/conftool/dbconfig/20250109-102010-marostegui.json
  • 10:12 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes[2053,2056,2058].codfw.wmnet
  • 10:11 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes[2053,2056,2058].codfw.wmnet
  • 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P71920 and previous config saved to /var/cache/conftool/dbconfig/20250109-101033-root.json
  • 10:09 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2001.codfw.wmnet
  • 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71919 and previous config saved to /var/cache/conftool/dbconfig/20250109-100712-root.json
  • 10:06 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbprov2004.codfw.wmnet with reason: os upgrade
  • 10:05 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbprov2004.codfw.wmnet with reason: os upgrade
  • 09:58 moritzm: installing glibc bugfix updates for Bookworm
  • 09:55 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-serve2001.codfw.wmnet
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P71918 and previous config saved to /var/cache/conftool/dbconfig/20250109-095527-root.json
  • 09:55 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2001.codfw.wmnet
  • 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2231 (re)pooling @ 1%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71917 and previous config saved to /var/cache/conftool/dbconfig/20250109-095207-root.json
  • 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71916 and previous config saved to /var/cache/conftool/dbconfig/20250109-094355-root.json
  • 09:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P71915 and previous config saved to /var/cache/conftool/dbconfig/20250109-094022-root.json
  • 09:28 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71914 and previous config saved to /var/cache/conftool/dbconfig/20250109-092850-root.json
  • 09:23 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp4044.ulsfo.wmnet,cp4052.ulsfo.wmnet} and A:cp
  • 09:21 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-serve2001.codfw.wmnet
  • 09:20 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2013.codfw.wmnet
  • 09:19 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2013.codfw.wmnet
  • 09:19 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2012.codfw.wmnet
  • 09:19 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2012.codfw.wmnet
  • 09:19 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2011.codfw.wmnet
  • 09:19 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2011.codfw.wmnet
  • 09:19 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2013.codfw.wmnet with OS bookworm
  • 09:18 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp4044.ulsfo.wmnet,cp4052.ulsfo.wmnet} and A:cp
  • 09:17 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp4044.ulsfo.wmnet,cp4052.ulsfo.wmnet} and A:cp
  • 09:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2012.codfw.wmnet with OS bookworm
  • 09:13 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71913 and previous config saved to /var/cache/conftool/dbconfig/20250109-091345-root.json
  • 09:13 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp4044.ulsfo.wmnet,cp4052.ulsfo.wmnet} and A:cp
  • 09:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2011.codfw.wmnet with OS bookworm
  • 09:02 vgutierrez: update to haproxy 2.8.13 on component thirdparty/haproxy28 bullseye-wikimedia (apt.wm.o) - T383111
  • 08:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2013.codfw.wmnet with reason: host reimage
  • 08:58 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71912 and previous config saved to /var/cache/conftool/dbconfig/20250109-085840-root.json
  • 08:57 root@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2131.codfw.wmnet onto db2231.codfw.wmnet
  • 08:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2012.codfw.wmnet with reason: host reimage
  • 08:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2011.codfw.wmnet with reason: host reimage
  • 08:49 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2013.codfw.wmnet with reason: host reimage
  • 08:49 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2012.codfw.wmnet with reason: host reimage
  • 08:49 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2011.codfw.wmnet with reason: host reimage
  • 08:43 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71911 and previous config saved to /var/cache/conftool/dbconfig/20250109-084335-root.json
  • 08:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es1043.eqiad.wmnet with reason: cloning
  • 08:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es1043.eqiad.wmnet with reason: cloning
  • 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2012
  • 08:32 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2012
  • 08:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2013
  • 08:32 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2013
  • 08:31 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2012.codfw.wmnet with OS bookworm
  • 08:31 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2013.codfw.wmnet with OS bookworm
  • 08:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2011
  • 08:31 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2011
  • 08:31 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2011.codfw.wmnet with OS bookworm
  • 08:28 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 9%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71910 and previous config saved to /var/cache/conftool/dbconfig/20250109-082829-root.json
  • 08:26 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2011-2013].codfw.wmnet
  • 08:24 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2011-2013].codfw.wmnet
  • 08:22 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2014.codfw.wmnet
  • 08:22 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2014.codfw.wmnet
  • 08:22 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2017.codfw.wmnet
  • 08:22 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2017.codfw.wmnet
  • 08:21 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2017.codfw.wmnet with OS bookworm
  • 08:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2014.codfw.wmnet with OS bookworm
  • 08:13 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 8%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71909 and previous config saved to /var/cache/conftool/dbconfig/20250109-081324-root.json
  • 08:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2017.codfw.wmnet with reason: host reimage
  • 07:58 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 7%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71908 and previous config saved to /var/cache/conftool/dbconfig/20250109-075820-root.json
  • 07:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2014.codfw.wmnet with reason: host reimage
  • 07:52 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2017.codfw.wmnet with reason: host reimage
  • 07:52 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2014.codfw.wmnet with reason: host reimage
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 6%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71907 and previous config saved to /var/cache/conftool/dbconfig/20250109-074314-root.json
  • 07:42 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P71906 and previous config saved to /var/cache/conftool/dbconfig/20250109-074214-root.json
  • 07:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2017
  • 07:34 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2017
  • 07:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2014
  • 07:34 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2014
  • 07:34 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2017.codfw.wmnet with OS bookworm
  • 07:34 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2014.codfw.wmnet with OS bookworm
  • 07:32 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2014,2017].codfw.wmnet
  • 07:29 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2014,2017].codfw.wmnet
  • 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71905 and previous config saved to /var/cache/conftool/dbconfig/20250109-072809-root.json
  • 07:27 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P71904 and previous config saved to /var/cache/conftool/dbconfig/20250109-072709-root.json
  • 07:13 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71903 and previous config saved to /var/cache/conftool/dbconfig/20250109-071305-root.json
  • 07:12 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P71902 and previous config saved to /var/cache/conftool/dbconfig/20250109-071203-root.json
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71901 and previous config saved to /var/cache/conftool/dbconfig/20250109-065759-root.json
  • 06:56 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P71900 and previous config saved to /var/cache/conftool/dbconfig/20250109-065658-root.json
  • 06:53 kart_: Updated cxserver to 2025-01-07-045930-production (T377966, T377813, T381379)
  • 06:52 root@cumin1002: START - Cookbook sre.mysql.clone of db2131.codfw.wmnet onto db2231.codfw.wmnet
  • 06:51 marostegui@cumin1002: dbctl commit (dc=all): 'Add db2231 to dbctl depooled T373579', diff saved to https://phabricator.wikimedia.org/P71899 and previous config saved to /var/cache/conftool/dbconfig/20250109-065114-marostegui.json
  • 06:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2131.codfw.wmnet with reason: cloning db2231
  • 06:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2131.codfw.wmnet with reason: cloning db2231
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2131 T373579', diff saved to https://phabricator.wikimedia.org/P71898 and previous config saved to /var/cache/conftool/dbconfig/20250109-064556-marostegui.json
  • 06:45 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 06:44 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71897 and previous config saved to /var/cache/conftool/dbconfig/20250109-064254-root.json
  • 06:41 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P71896 and previous config saved to /var/cache/conftool/dbconfig/20250109-064153-root.json
  • 06:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es1022.eqiad.wmnet with reason: cloning es1042
  • 06:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es1022.eqiad.wmnet with reason: cloning es1042
  • 06:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1022 T382569', diff saved to https://phabricator.wikimedia.org/P71895 and previous config saved to /var/cache/conftool/dbconfig/20250109-064117-marostegui.json
  • 06:40 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 06:40 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 06:31 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 06:30 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 06:27 marostegui@cumin1002: dbctl commit (dc=all): 'es1042 (re)pooling @ 1%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71894 and previous config saved to /var/cache/conftool/dbconfig/20250109-062749-root.json
  • 06:26 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P71893 and previous config saved to /var/cache/conftool/dbconfig/20250109-062647-root.json
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'Add es1042 depooled T382569', diff saved to https://phabricator.wikimedia.org/P71892 and previous config saved to /var/cache/conftool/dbconfig/20250109-061724-marostegui.json
  • 06:11 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P71891 and previous config saved to /var/cache/conftool/dbconfig/20250109-061142-root.json
  • 00:43 ladsgroup@deploy2002: Finished scap sync-world: Backport for filerepo: Fix schema compatibility constant usage (T383269) (duration: 13m 34s)
  • 00:36 ladsgroup@deploy2002: ladsgroup, cdanis: Continuing with sync
  • 00:36 ladsgroup@deploy2002: ladsgroup, cdanis: Backport for filerepo: Fix schema compatibility constant usage (T383269) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 00:30 ladsgroup@deploy2002: Started scap sync-world: Backport for filerepo: Fix schema compatibility constant usage (T383269)

2025-01-08

  • 23:03 eileen: civicrm upgraded from 1c93f7a1 to 24f3b57f
  • 22:27 eileen: civicrm upgraded from 82b02ce5 to 1c93f7a1
  • 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: security release 20250108
  • 21:16 aokoth@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Update
  • 21:14 dzahn@cumin2002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: security release 20250108
  • 21:08 aokoth@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Update
  • 20:05 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.44.0-wmf.11 refs T382362
  • 19:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wikikube-worker1001.eqiad.wmnet
  • 19:59 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:59 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - kamila@cumin1002"
  • 19:55 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - kamila@cumin1002"
  • 19:54 dduvall: rolling back wmf.11 to group0 due to `Table 'commonswiki.file' doesn't exist` errors
  • 19:53 kamila_: homer 'cr*eqiad*' commit 'T365571'
  • 19:52 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 19:46 kamila@cumin1002: START - Cookbook sre.hosts.decommission for hosts wikikube-worker1001.eqiad.wmnet
  • 19:32 sfaci@deploy2002: Finished deploy [airflow-dags/analytics@b2b5707]: (no justification provided) (duration: 03m 06s)
  • 19:29 sfaci@deploy2002: Started deploy [airflow-dags/analytics@b2b5707]: (no justification provided)
  • 19:29 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.11 refs T382362
  • 19:17 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1001.eqiad.wmnet
  • 19:17 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1001.eqiad.wmnet
  • 19:08 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1089.eqiad.wmnet with OS bookworm
  • 19:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1091.eqiad.wmnet with OS bookworm
  • 18:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1090.eqiad.wmnet with OS bookworm
  • 18:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1092.eqiad.wmnet with OS bookworm
  • 18:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1088.eqiad.wmnet with OS bookworm
  • 18:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1089.eqiad.wmnet with reason: host reimage
  • 18:43 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1091.eqiad.wmnet with reason: host reimage
  • 18:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1090.eqiad.wmnet with reason: host reimage
  • 18:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1092.eqiad.wmnet with reason: host reimage
  • 18:33 swfrench@deploy2002: Finished scap sync-world: Deployment to switch migration release files to 8.1 - T377040 (duration: 13m 57s)
  • 18:32 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1088.eqiad.wmnet with reason: host reimage
  • 18:29 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1092.eqiad.wmnet with reason: host reimage
  • 18:29 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1091.eqiad.wmnet with reason: host reimage
  • 18:29 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1090.eqiad.wmnet with reason: host reimage
  • 18:28 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1089.eqiad.wmnet with reason: host reimage
  • 18:27 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1088.eqiad.wmnet with reason: host reimage
  • 18:19 swfrench@deploy2002: Started scap sync-world: Deployment to switch migration release files to 8.1 - T377040
  • 18:17 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1011.eqiad.wmnet with OS bookworm
  • 18:17 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 18:16 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 18:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1092
  • 18:13 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1092
  • 18:13 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1092.eqiad.wmnet with OS bookworm
  • 18:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1091
  • 18:13 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1091
  • 18:13 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1091.eqiad.wmnet with OS bookworm
  • 18:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1090
  • 18:12 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1090
  • 18:12 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1090.eqiad.wmnet with OS bookworm
  • 18:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1089
  • 18:12 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1089
  • 18:12 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1089.eqiad.wmnet with OS bookworm
  • 18:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1088
  • 18:11 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1088
  • 18:11 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1088.eqiad.wmnet with OS bookworm
  • 18:09 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1088.eqiad.wmnet wikikube-worker1089.eqiad.wmnet wikikube-worker1090.eqiad.wmnet wikikube-worker1091.eqiad.wmnet wikikube-worker1092.eqiad.wmnet on all recursors
  • 18:09 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1088.eqiad.wmnet wikikube-worker1089.eqiad.wmnet wikikube-worker1090.eqiad.wmnet wikikube-worker1091.eqiad.wmnet wikikube-worker1092.eqiad.wmnet on all recursors
  • 18:09 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1455 to wikikube-worker1092
  • 18:08 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1092
  • 18:07 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1092
  • 18:07 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:07 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1455 to wikikube-worker1092 - kamila@cumin1002"
  • 18:07 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1455 to wikikube-worker1092 - kamila@cumin1002"
  • 18:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1454 to wikikube-worker1091
  • 18:04 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1091
  • 18:03 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 18:02 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1091
  • 18:02 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:02 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1454 to wikikube-worker1091 - kamila@cumin1002"
  • 18:02 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1454 to wikikube-worker1091 - kamila@cumin1002"
  • 18:01 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1455 to wikikube-worker1092
  • 17:59 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 17:58 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1454 to wikikube-worker1091
  • 17:58 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1088.eqiad.wmnet wikikube-worker1089.eqiad.wmnet wikikube-worker1090.eqiad.wmnet wikikub on all recursors
  • 17:58 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1088.eqiad.wmnet wikikube-worker1089.eqiad.wmnet wikikube-worker1090.eqiad.wmnet wikikub on all recursors
  • 17:57 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1011.eqiad.wmnet with reason: host reimage
  • 17:56 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1452 to wikikube-worker1089
  • 17:56 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1089
  • 17:55 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1089
  • 17:55 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:54 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1011.eqiad.wmnet with reason: host reimage
  • 17:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1453 to wikikube-worker1090
  • 17:52 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1090
  • 17:52 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 17:51 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1090
  • 17:51 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:51 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1453 to wikikube-worker1090 - kamila@cumin1002"
  • 17:51 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1453 to wikikube-worker1090 - kamila@cumin1002"
  • 17:49 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw1451 to wikikube-worker1088
  • 17:48 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1088
  • 17:47 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 17:47 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1088
  • 17:47 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:47 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1451 to wikikube-worker1088 - kamila@cumin1002"
  • 17:46 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1453 to wikikube-worker1090
  • 17:46 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw1451 to wikikube-worker1088 - kamila@cumin1002"
  • 17:44 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=99) from mw1453 to wikikube-worker1090
  • 17:43 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1453 to wikikube-worker1090
  • 17:43 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1452 to wikikube-worker1089
  • 17:43 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 17:42 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw1451 to wikikube-worker1088
  • 17:41 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[1451-1455].eqiad.wmnet
  • 17:36 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[1451-1455].eqiad.wmnet
  • 17:22 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2001.codfw.wmnet with OS bookworm
  • 17:05 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2001.codfw.wmnet with reason: host reimage
  • 17:04 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2019.codfw.wmnet
  • 17:04 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2019.codfw.wmnet
  • 17:02 jhathaway@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2001.codfw.wmnet with reason: host reimage
  • 17:02 jelto: sudo homer 'cr*codfw*' commit 'T377877'
  • 17:01 jelto: sudo homer 'lsw1-c3-codfw*' commit 'T377877'
  • 17:00 jelto@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker2019.codfw.wmnet
  • 17:00 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcontrol1011.eqiad.wmnet with OS bookworm
  • 16:59 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2019.codfw.wmnet
  • 16:59 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2018.codfw.wmnet
  • 16:59 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2018.codfw.wmnet
  • 16:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2019.codfw.wmnet with OS bookworm
  • 16:50 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1081.eqiad.wmnet with OS bookworm
  • 16:50 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:50 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2018.codfw.wmnet with OS bookworm
  • 16:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2019.codfw.wmnet with reason: host reimage
  • 16:31 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1081.eqiad.wmnet with reason: host reimage
  • 16:29 jhathaway@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 16:27 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2001.codfw.wmnet with OS bookworm
  • 16:26 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2019.codfw.wmnet with reason: host reimage
  • 16:25 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1081.eqiad.wmnet with reason: host reimage
  • 16:24 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2018.codfw.wmnet with reason: host reimage
  • 16:20 jhathaway@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 16:20 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2018.codfw.wmnet with reason: host reimage
  • 16:08 jhathaway@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2001.codfw.wmnet with OS bookworm
  • 16:08 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1081.eqiad.wmnet with OS bookworm
  • 16:07 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2019
  • 16:07 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2019
  • 16:07 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2019
  • 16:07 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2019.codfw.wmnet 117.32.192.10.in-addr.arpa 7.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:07 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2019.codfw.wmnet 117.32.192.10.in-addr.arpa 7.1.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:07 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:07 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2019 - jelto@cumin1002"
  • 16:07 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2019 - jelto@cumin1002"
  • 16:05 jhathaway@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 16:04 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 16:03 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2019
  • 16:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2018
  • 16:03 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2018
  • 16:03 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2019.codfw.wmnet with OS bookworm
  • 16:03 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2018.codfw.wmnet with OS bookworm
  • 16:02 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2018-2019].codfw.wmnet
  • 16:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71886 and previous config saved to /var/cache/conftool/dbconfig/20250108-160158-root.json
  • 16:00 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2018-2019].codfw.wmnet
  • 15:57 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 15:57 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 15:57 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 15:54 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 15:54 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 15:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 15:51 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 15:51 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 15:51 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 15:49 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 15:48 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 15:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71885 and previous config saved to /var/cache/conftool/dbconfig/20250108-154653-root.json
  • 15:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 15:36 moritzm: installing jinja2 security updates
  • 15:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71884 and previous config saved to /var/cache/conftool/dbconfig/20250108-153147-root.json
  • 15:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 15:27 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 15:27 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 15:22 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 15:22 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 15:22 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 15:21 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:21 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:20 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:18 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:17 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71883 and previous config saved to /var/cache/conftool/dbconfig/20250108-151642-root.json
  • 15:12 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:11 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:11 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 15:10 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:10 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:08 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:07 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:07 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 15:07 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 15:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 15:06 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 15:06 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 15:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71882 and previous config saved to /var/cache/conftool/dbconfig/20250108-150136-root.json
  • 15:00 joelyrookewmde: Finished populateSitesTable for tigwiki (T381382)
  • 14:53 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2022.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:48 jelto@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 14:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repool pc5 (T373037)', diff saved to https://phabricator.wikimedia.org/P71881 and previous config saved to /var/cache/conftool/dbconfig/20250108-144805-ladsgroup.json
  • 14:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71880 and previous config saved to /var/cache/conftool/dbconfig/20250108-144631-root.json
  • 14:44 marostegui@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 100%: Repooling after cloning', diff saved to https://phabricator.wikimedia.org/P71879 and previous config saved to /var/cache/conftool/dbconfig/20250108-144429-root.json
  • 14:42 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for pc1014.eqiad.wmnet
  • 14:40 elukey: elukey@puppetserver1001:~$ sudo puppetserver ca clean --certname kubernetes1061.eqiad.wmnet
  • 14:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 14:38 joelyrookewmde: joelyrookewmde@mwmaint2002:~$ foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https
  • 14:37 ladsgroup@cumin1002: START - Cookbook sre.mysql.upgrade for pc1014.eqiad.wmnet
  • 14:33 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 14:33 elukey@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-worker2022.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71878 and previous config saved to /var/cache/conftool/dbconfig/20250108-143126-root.json
  • 14:29 marostegui@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 75%: Repooling after cloning', diff saved to https://phabricator.wikimedia.org/P71877 and previous config saved to /var/cache/conftool/dbconfig/20250108-142923-root.json
  • 14:23 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 14:21 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 14:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71876 and previous config saved to /var/cache/conftool/dbconfig/20250108-141620-root.json
  • 14:14 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2017.codfw.wmnet,pc[1014,1017].eqiad.wmnet with reason: Reboot
  • 14:14 marostegui@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 50%: Repooling after cloning', diff saved to https://phabricator.wikimedia.org/P71875 and previous config saved to /var/cache/conftool/dbconfig/20250108-141418-root.json
  • 14:14 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on pc2017.codfw.wmnet,pc[1014,1017].eqiad.wmnet with reason: Reboot
  • 14:12 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 14:12 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 14:12 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2020.codfw.wmnet
  • 14:11 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2020.codfw.wmnet
  • 14:11 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2021.codfw.wmnet
  • 14:11 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2021.codfw.wmnet
  • 14:09 jelto: sudo homer 'cr*codfw*' commit 'T377877'
  • 14:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 14:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71874 and previous config saved to /var/cache/conftool/dbconfig/20250108-140115-root.json
  • 13:59 marostegui@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 25%: Repooling after cloning', diff saved to https://phabricator.wikimedia.org/P71873 and previous config saved to /var/cache/conftool/dbconfig/20250108-135913-root.json
  • 13:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2228 (re)pooling @ 1%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71872 and previous config saved to /var/cache/conftool/dbconfig/20250108-134610-root.json
  • 13:45 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71871 and previous config saved to /var/cache/conftool/dbconfig/20250108-134544-root.json
  • 13:44 marostegui@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 10%: Repooling after cloning', diff saved to https://phabricator.wikimedia.org/P71870 and previous config saved to /var/cache/conftool/dbconfig/20250108-134408-root.json
  • 13:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2021.codfw.wmnet with OS bookworm
  • 13:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2020.codfw.wmnet with OS bookworm
  • 13:37 elukey: elukey@puppetserver1001:~$ sudo puppetserver ca clean --certname kubernetes1021.eqiad.wmnet
  • 13:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71869 and previous config saved to /var/cache/conftool/dbconfig/20250108-133038-root.json
  • 13:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling pc5 in codfw for test (T373037)', diff saved to https://phabricator.wikimedia.org/P71868 and previous config saved to /var/cache/conftool/dbconfig/20250108-132708-ladsgroup.json
  • 13:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling pc5 in eqiad for test (T373037)', diff saved to https://phabricator.wikimedia.org/P71867 and previous config saved to /var/cache/conftool/dbconfig/20250108-132506-ladsgroup.json
  • 13:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2021.codfw.wmnet with reason: host reimage
  • 13:22 ladsgroup@deploy2002: Finished scap sync-world: Backport for Fully depool ParserCache section if load of the primary is zero (T373037 T383137) (duration: 19m 19s)
  • 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 100%: Repooling after cloning', diff saved to https://phabricator.wikimedia.org/P71866 and previous config saved to /var/cache/conftool/dbconfig/20250108-131953-root.json
  • 13:19 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2021.codfw.wmnet with reason: host reimage
  • 13:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2020.codfw.wmnet with reason: host reimage
  • 13:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71865 and previous config saved to /var/cache/conftool/dbconfig/20250108-131533-root.json
  • 13:14 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2020.codfw.wmnet with reason: host reimage
  • 13:13 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 13:12 ladsgroup@deploy2002: ladsgroup: Backport for Fully depool ParserCache section if load of the primary is zero (T373037 T383137) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 75%: Repooling after cloning', diff saved to https://phabricator.wikimedia.org/P71864 and previous config saved to /var/cache/conftool/dbconfig/20250108-130448-root.json
  • 13:03 ladsgroup@deploy2002: Started scap sync-world: Backport for Fully depool ParserCache section if load of the primary is zero (T373037 T383137)
  • 13:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2021
  • 13:01 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2021
  • 13:01 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2021
  • 13:01 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2021.codfw.wmnet 210.32.192.10.in-addr.arpa 0.1.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:01 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2021.codfw.wmnet 210.32.192.10.in-addr.arpa 0.1.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:01 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:01 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2021 - jelto@cumin1002"
  • 13:01 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2021 - jelto@cumin1002"
  • 13:00 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71863 and previous config saved to /var/cache/conftool/dbconfig/20250108-130028-root.json
  • 12:57 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:57 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2021
  • 12:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2020
  • 12:56 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2020
  • 12:56 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2020
  • 12:56 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2020.codfw.wmnet 208.32.192.10.in-addr.arpa 8.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:56 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2020.codfw.wmnet 208.32.192.10.in-addr.arpa 8.0.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:56 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:56 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2020 - jelto@cumin1002"
  • 12:56 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2020 - jelto@cumin1002"
  • 12:53 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1086.eqiad.wmnet
  • 12:53 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1086.eqiad.wmnet
  • 12:52 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 12:52 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2020
  • 12:52 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2020.codfw.wmnet with OS bookworm
  • 12:52 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2021.codfw.wmnet with OS bookworm
  • 12:51 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2020-2021].codfw.wmnet
  • 12:50 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2020-2021].codfw.wmnet
  • 12:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 12:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 50%: Repooling after cloning', diff saved to https://phabricator.wikimedia.org/P71862 and previous config saved to /var/cache/conftool/dbconfig/20250108-124943-root.json
  • 12:49 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2023.codfw.wmnet
  • 12:49 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2023.codfw.wmnet
  • 12:48 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 12:48 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 12:47 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 12:47 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 12:46 root@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2128.codfw.wmnet onto db2228.codfw.wmnet
  • 12:45 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 12:45 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71861 and previous config saved to /var/cache/conftool/dbconfig/20250108-124522-root.json
  • 12:41 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:40 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 12:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 12:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 25%: Repooling after cloning', diff saved to https://phabricator.wikimedia.org/P71860 and previous config saved to /var/cache/conftool/dbconfig/20250108-123437-root.json
  • 12:30 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 12:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71859 and previous config saved to /var/cache/conftool/dbconfig/20250108-123017-root.json
  • 12:21 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1086.eqiad.wmnet with OS bookworm
  • 12:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 12:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 12:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 10%: Repooling after cloning', diff saved to https://phabricator.wikimedia.org/P71858 and previous config saved to /var/cache/conftool/dbconfig/20250108-121931-root.json
  • 12:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71857 and previous config saved to /var/cache/conftool/dbconfig/20250108-121512-root.json
  • 12:09 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1084-1085,1087].eqiad.wmnet
  • 12:09 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1084-1085,1087].eqiad.wmnet
  • 12:08 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 12:03 root@cumin1002: START - Cookbook sre.mysql.clone of db2128.codfw.wmnet onto db2228.codfw.wmnet
  • 12:02 root@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2128.codfw.wmnet onto db2228.codfw.wmnet
  • 12:02 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1086.eqiad.wmnet with reason: host reimage
  • 12:00 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71856 and previous config saved to /var/cache/conftool/dbconfig/20250108-120006-root.json
  • 11:59 root@cumin1002: START - Cookbook sre.mysql.clone of db2128.codfw.wmnet onto db2228.codfw.wmnet
  • 11:59 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1086.eqiad.wmnet with reason: host reimage
  • 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Add db2228 to dbctl depooled T373579', diff saved to https://phabricator.wikimedia.org/P71855 and previous config saved to /var/cache/conftool/dbconfig/20250108-115908-marostegui.json
  • 11:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db[2128,2186].codfw.wmnet with reason: cloning
  • 11:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db[2128,2186].codfw.wmnet with reason: cloning
  • 11:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2128 T373579', diff saved to https://phabricator.wikimedia.org/P71854 and previous config saved to /var/cache/conftool/dbconfig/20250108-115206-marostegui.json
  • 11:45 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71853 and previous config saved to /var/cache/conftool/dbconfig/20250108-114501-root.json
  • 11:41 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1086
  • 11:41 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1086
  • 11:40 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1086.eqiad.wmnet with OS bookworm
  • 11:40 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1087.eqiad.wmnet with OS bookworm
  • 11:40 jayme@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1086.eqiad.wmnet with OS bookworm
  • 11:35 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1085.eqiad.wmnet with OS bookworm
  • 11:34 moritzm: installing php7.4 security updates
  • 11:31 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1084.eqiad.wmnet with OS bookworm
  • 11:29 marostegui@cumin1002: dbctl commit (dc=all): 'db2226 (re)pooling @ 1%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71852 and previous config saved to /var/cache/conftool/dbconfig/20250108-112956-root.json
  • 11:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on ganeti2027.codfw.wmnet with reason: reimage pending, blocked by T383207
  • 11:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on ganeti2027.codfw.wmnet with reason: reimage pending, blocked by T383207
  • 11:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ganeti2027.codfw.wmnet with reason: reimage pending, blocked by T383207
  • 11:26 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on ganeti2027.codfw.wmnet with reason: reimage pending, blocked by T383207
  • 11:25 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 11:25 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 11:25 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 11:20 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1087.eqiad.wmnet with reason: host reimage
  • 11:18 jelto: sudo homer 'lsw1-c6-codfw*' commit 'T377877'
  • 11:17 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1085.eqiad.wmnet with reason: host reimage
  • 11:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 11:13 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 11:12 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1084.eqiad.wmnet with reason: host reimage
  • 11:10 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1085.eqiad.wmnet with reason: host reimage
  • 11:09 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1087.eqiad.wmnet with reason: host reimage
  • 11:08 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1084.eqiad.wmnet with reason: host reimage
  • 11:08 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1002.eqiad.wmnet with reason: Migrating to kubernetes
  • 11:07 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1002.eqiad.wmnet with reason: Migrating to kubernetes
  • 11:06 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 11:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2023.codfw.wmnet with OS bookworm
  • 11:00 moritzm: installing graphviz bugfix updates from bookworm point release
  • 10:54 moritzm: installing numpy bugfix updates from bookworm point release
  • 10:52 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1085
  • 10:52 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1085
  • 10:52 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1085.eqiad.wmnet with OS bookworm
  • 10:52 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1087
  • 10:52 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1087
  • 10:52 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1087.eqiad.wmnet with OS bookworm
  • 10:51 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1086
  • 10:51 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1086
  • 10:51 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1086.eqiad.wmnet with OS bookworm
  • 10:51 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1084
  • 10:51 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1084
  • 10:51 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1084.eqiad.wmnet with OS bookworm
  • 10:50 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1084.eqiad.wmnet wikikube-worker1085.eqiad.wmnet wikikube-worker1086.eqiad.wmnet wikikube-worker1087.eqiad.wmnet on all recursors
  • 10:50 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker1084.eqiad.wmnet wikikube-worker1085.eqiad.wmnet wikikube-worker1086.eqiad.wmnet wikikube-worker1087.eqiad.wmnet on all recursors
  • 10:48 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1060 to wikikube-worker1085
  • 10:47 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1085
  • 10:47 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 10:47 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 10:47 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 10:46 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 10:46 moritzm: installing libnvme bugfix updates from bookworm point release
  • 10:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2023.codfw.wmnet with reason: host reimage
  • 10:44 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1085
  • 10:44 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:43 moritzm: uploaded php7.4 1:7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u2+icu67u4 (backport of latest PHP security fixes to our PHP build) T378173
  • 10:42 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 10:42 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1059 to wikikube-worker1084
  • 10:42 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1061 to wikikube-worker1086
  • 10:42 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2023.codfw.wmnet with reason: host reimage
  • 10:41 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1084
  • 10:41 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1086
  • 10:41 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1086
  • 10:41 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:38 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 10:38 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1084
  • 10:38 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:37 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1062 to wikikube-worker1087
  • 10:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 10:37 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 10:37 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 10:37 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1087
  • 10:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ganeti2027.codfw.wmnet with reason: reimage to bookworm
  • 10:37 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1087
  • 10:37 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:37 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1062 to wikikube-worker1087 - jayme@cumin1002"
  • 10:36 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on ganeti2027.codfw.wmnet with reason: reimage to bookworm
  • 10:36 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1062 to wikikube-worker1087 - jayme@cumin1002"
  • 10:36 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 10:36 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 10:34 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1267-1269].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 10:34 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1269.eqiad.wmnet with OS bookworm
  • 10:31 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 10:31 jayme@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 10:30 jayme@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1062 to wikikube-worker1087
  • 10:29 jayme@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1061 to wikikube-worker1086
  • 10:29 jayme@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1060 to wikikube-worker1085
  • 10:29 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 10:28 jayme@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1059 to wikikube-worker1084
  • 10:24 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2022
  • 10:24 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 10:24 marostegui@cumin1002: dbctl commit (dc=all): 'es1041 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71850 and previous config saved to /var/cache/conftool/dbconfig/20250108-102416-root.json
  • 10:24 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2023
  • 10:24 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2023
  • 10:23 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2023
  • 10:23 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2023.codfw.wmnet 213.32.192.10.in-addr.arpa 3.1.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:23 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2023.codfw.wmnet 213.32.192.10.in-addr.arpa 3.1.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:23 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:23 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2023 - jelto@cumin1002"
  • 10:23 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2023 - jelto@cumin1002"
  • 10:21 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 10:20 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 10:19 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2023
  • 10:19 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 10:19 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.move-vlan (exit_code=99) for host wikikube-worker2022
  • 10:19 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2022
  • 10:19 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2023.codfw.wmnet with OS bookworm
  • 10:19 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2022.codfw.wmnet with OS bookworm
  • 10:15 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2022-2023].codfw.wmnet
  • 10:15 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1269.eqiad.wmnet with reason: host reimage
  • 10:14 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2022-2023].codfw.wmnet
  • 10:12 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1269.eqiad.wmnet with reason: host reimage
  • 10:09 marostegui@cumin1002: dbctl commit (dc=all): 'es1041 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71849 and previous config saved to /var/cache/conftool/dbconfig/20250108-100910-root.json
  • 10:04 moritzm: imported osmborder 0.1.0+wmf12u1 to apt.wikimedia.org/bookworm T381565
  • 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
  • 10:02 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2024.codfw.wmnet
  • 10:02 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2024.codfw.wmnet
  • 10:02 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2025.codfw.wmnet
  • 10:02 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2025.codfw.wmnet
  • 10:00 jelto: sudo homer 'cr*codfw*' commit 'T377877'
  • 10:00 jelto: sudo homer 'lsw1-c6-codfw*' commit 'T377877'
  • 09:59 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2025.codfw.wmnet with OS bookworm
  • 09:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2024.codfw.wmnet with OS bookworm
  • 09:54 marostegui@cumin1002: dbctl commit (dc=all): 'es1041 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71848 and previous config saved to /var/cache/conftool/dbconfig/20250108-095405-root.json
  • 09:52 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1269.eqiad.wmnet with OS bookworm
  • 09:50 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1268.eqiad.wmnet with OS bookworm
  • 09:42 root@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2126.codfw.wmnet onto db2226.codfw.wmnet
  • 09:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2025.codfw.wmnet with reason: host reimage
  • 09:39 marostegui@cumin1002: dbctl commit (dc=all): 'es1041 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71847 and previous config saved to /var/cache/conftool/dbconfig/20250108-093900-root.json
  • 09:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2024.codfw.wmnet with reason: host reimage
  • 09:31 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1268.eqiad.wmnet with reason: host reimage
  • 09:30 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2024.codfw.wmnet with reason: host reimage
  • 09:27 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2025.codfw.wmnet with reason: host reimage
  • 09:25 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1268.eqiad.wmnet with reason: host reimage
  • 09:23 marostegui@cumin1002: dbctl commit (dc=all): 'es1041 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71846 and previous config saved to /var/cache/conftool/dbconfig/20250108-092354-root.json
  • 09:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2024
  • 09:13 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2024
  • 09:13 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2024.codfw.wmnet with OS bookworm
  • 09:12 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2024.codfw.wmnet with OS bookworm
  • 09:10 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2025
  • 09:10 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2025
  • 09:10 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2025.codfw.wmnet with OS bookworm
  • 09:08 marostegui@cumin1002: dbctl commit (dc=all): 'es1041 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71845 and previous config saved to /var/cache/conftool/dbconfig/20250108-090849-root.json
  • 09:08 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2025.codfw.wmnet with OS bookworm
  • 09:05 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1268.eqiad.wmnet with OS bookworm
  • 09:04 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1267.eqiad.wmnet with OS bookworm
  • 08:53 marostegui@cumin1002: dbctl commit (dc=all): 'es1041 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71844 and previous config saved to /var/cache/conftool/dbconfig/20250108-085344-root.json
  • 08:44 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1267.eqiad.wmnet with reason: host reimage
  • 08:42 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1267.eqiad.wmnet with reason: host reimage
  • 08:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es2024.codfw.wmnet
  • 08:41 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:41 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2024.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 08:41 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2024.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'es1041 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71843 and previous config saved to /var/cache/conftool/dbconfig/20250108-083838-root.json
  • 08:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
  • 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
  • 08:37 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 08:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
  • 08:36 elukey: kill hanging processes on stat1011 to allow puppet to properly clean up absented users
  • 08:32 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts es2024.codfw.wmnet
  • 08:28 marostegui@cumin1002: dbctl commit (dc=all): 'Remove es2024 from dbctl T383028', diff saved to https://phabricator.wikimedia.org/P71842 and previous config saved to /var/cache/conftool/dbconfig/20250108-082807-marostegui.json
  • 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Aitolkyn out of all services on: 2309 hosts
  • 08:26 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Aitolkyn out of all services on: 2309 hosts
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'es1041 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71841 and previous config saved to /var/cache/conftool/dbconfig/20250108-082333-root.json
  • 08:23 elukey: destroy puppet cert for cloudelastic1011.eqiad.wmnet on puppetmaster1001 (cruft from old/wrong reimages)
  • 08:21 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1267.eqiad.wmnet with OS bookworm
  • 08:19 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1267-1269].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'es1041 (re)pooling @ 1%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P71840 and previous config saved to /var/cache/conftool/dbconfig/20250108-080828-root.json
  • 07:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2024
  • 07:51 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2024
  • 07:51 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2024
  • 07:51 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2024.codfw.wmnet 214.32.192.10.in-addr.arpa 4.1.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 07:51 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2024.codfw.wmnet 214.32.192.10.in-addr.arpa 4.1.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 07:51 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:51 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2024 - jelto@cumin1002"
  • 07:51 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2024 - jelto@cumin1002"
  • 07:50 moritzm: truncate /var/log/debug on seaborgium to free up some disk space
  • 07:49 root@cumin1002: START - Cookbook sre.mysql.clone of db2126.codfw.wmnet onto db2226.codfw.wmnet
  • 07:48 marostegui@cumin1002: dbctl commit (dc=all): 'Add db2226 depooled', diff saved to https://phabricator.wikimedia.org/P71839 and previous config saved to /var/cache/conftool/dbconfig/20250108-074856-marostegui.json
  • 07:47 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 07:47 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2024
  • 07:47 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2025
  • 07:47 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2025
  • 07:47 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2025.codfw.wmnet with OS bookworm
  • 07:47 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2024.codfw.wmnet with OS bookworm
  • 07:45 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2024-2025].codfw.wmnet
  • 07:44 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2024-2025].codfw.wmnet
  • 07:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: cloning
  • 07:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: cloning
  • 07:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: cloning
  • 07:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: cloning
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2126 T373579', diff saved to https://phabricator.wikimedia.org/P71838 and previous config saved to /var/cache/conftool/dbconfig/20250108-073603-marostegui.json
  • 07:27 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 100%: Repooling after recloning', diff saved to https://phabricator.wikimedia.org/P71837 and previous config saved to /var/cache/conftool/dbconfig/20250108-072733-root.json
  • 07:12 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 75%: Repooling after recloning', diff saved to https://phabricator.wikimedia.org/P71836 and previous config saved to /var/cache/conftool/dbconfig/20250108-071228-root.json
  • 06:57 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 50%: Repooling after recloning', diff saved to https://phabricator.wikimedia.org/P71834 and previous config saved to /var/cache/conftool/dbconfig/20250108-065723-root.json
  • 06:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbproxy1021.eqiad.wmnet
  • 06:53 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 06:53 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbproxy1021.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 06:53 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbproxy1021.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 06:50 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 06:44 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbproxy1021.eqiad.wmnet
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 25%: Repooling after recloning', diff saved to https://phabricator.wikimedia.org/P71833 and previous config saved to /var/cache/conftool/dbconfig/20250108-064217-root.json
  • 06:27 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 10%: Repooling after recloning', diff saved to https://phabricator.wikimedia.org/P71832 and previous config saved to /var/cache/conftool/dbconfig/20250108-062712-root.json
  • 06:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es1021.eqiad.wmnet with reason: cloning es1042
  • 06:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es1021.eqiad.wmnet with reason: cloning es1042
  • 06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Add es1041 to dbctl depooled T382569', diff saved to https://phabricator.wikimedia.org/P71831 and previous config saved to /var/cache/conftool/dbconfig/20250108-062447-marostegui.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1021 T382569', diff saved to https://phabricator.wikimedia.org/P71830 and previous config saved to /var/cache/conftool/dbconfig/20250108-061928-marostegui.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Switchover es4 eqiad master dbmaint T382569', diff saved to https://phabricator.wikimedia.org/P71829 and previous config saved to /var/cache/conftool/dbconfig/20250108-061914-marostegui.json
  • 06:12 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 5%: Repooling after recloning', diff saved to https://phabricator.wikimedia.org/P71828 and previous config saved to /var/cache/conftool/dbconfig/20250108-061207-root.json

2025-01-07

  • 23:58 eileen: civicrm upgraded from d8ec1e91 to 82b02ce5
  • 23:49 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] enwiki: Deploy Add Link to 5% of users (T382382) (duration: 13m 34s)
  • 23:42 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 23:42 urbanecm@deploy2002: urbanecm: Backport for [Growth] enwiki: Deploy Add Link to 5% of users (T382382) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:36 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2001.codfw.wmnet with OS bookworm
  • 23:36 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] enwiki: Deploy Add Link to 5% of users (T382382)
  • 23:35 cdanis@deploy2002: Finished scap sync-world: Backport for tracing: Disable tracing in CLI mode (T340552) (duration: 14m 23s)
  • 23:27 cdanis@deploy2002: cdanis: Continuing with sync
  • 23:27 cdanis@deploy2002: cdanis: Backport for tracing: Disable tracing in CLI mode (T340552) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:20 cdanis@deploy2002: Started scap sync-world: Backport for tracing: Disable tracing in CLI mode (T340552)
  • 23:19 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2001.codfw.wmnet with reason: host reimage
  • 23:16 jhathaway@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2001.codfw.wmnet with reason: host reimage
  • 23:05 jhathaway@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 22:47 jhathaway@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2001.codfw.wmnet with OS bookworm
  • 22:35 jhathaway@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 22:34 jhathaway@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2001.codfw.wmnet with OS bookworm
  • 22:22 jhathaway@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 22:22 jhathaway@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2001.codfw.wmnet with OS bookworm
  • 22:15 jhathaway@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 22:09 jhathaway@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2001.codfw.wmnet with OS bookworm
  • 22:02 jhathaway@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 20:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 20:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 20:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 20:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 20:46 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on doc1003.eqiad.wmnet with reason: maintenance
  • 20:45 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on doc1003.eqiad.wmnet with reason: maintenance
  • 20:19 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 20:18 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 20:16 cdanis@deploy2002: Finished scap sync-world: Backport for group0: enable OpenTelemetry exports (T340552) (duration: 16m 06s)
  • 20:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 20:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 20:08 cdanis@deploy2002: cdanis: Continuing with sync
  • 20:07 cdanis@deploy2002: cdanis: Backport for group0: enable OpenTelemetry exports (T340552) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:00 cdanis@deploy2002: Started scap sync-world: Backport for group0: enable OpenTelemetry exports (T340552)
  • 20:00 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 20:00 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 19:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 19:55 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 19:52 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 19:52 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply
  • 19:36 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.44.0-wmf.11 refs T382362
  • 18:41 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:41 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:27 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: sync
  • 18:27 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: sync
  • 18:17 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:17 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 18:12 mvernon@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8 days, 0:00:00 on ms-be2075.codfw.wmnet with reason: host is awaiting attention from Dell
  • 18:11 mvernon@cumin1002: START - Cookbook sre.hosts.downtime for 8 days, 0:00:00 on ms-be2075.codfw.wmnet with reason: host is awaiting attention from Dell
  • 17:50 ottomata: Disable varnish handling of /beacon/event to decommission eventlogging backend [puppet] - T238230 T353817
  • 17:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:49 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 17:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:43 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 17:26 dzahn@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muhammad Jaziraly out of all services on: 4 hosts
  • 17:26 dzahn@cumin2002: START - Cookbook sre.idm.logout Logging Muhammad Jaziraly out of all services on: 4 hosts
  • 17:24 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 17:24 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 17:20 dzahn@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muhammad Jaziraly out of all services on: 2313 hosts
  • 17:18 dzahn@cumin2002: START - Cookbook sre.idm.logout Logging Muhammad Jaziraly out of all services on: 2313 hosts
  • 16:53 ejegg: fundraising civicrm upgraded from 64cfe3a0 to d8ec1e91
  • 16:44 jayme: puppet ca destroy citoid.discovery.wmnet - T381474
  • 16:44 jayme: puppet ca destroy mathoid.discovery.wmnet - T381474
  • 16:44 jayme: puppet ca destroy termbox.discovery.wmnet - T381474
  • 16:43 mutante: krb1001 - sudo manage_principals.py delete muja@WIKIMEDIA (T383056)
  • 16:32 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 16:32 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 16:31 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:31 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 16:31 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:30 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:30 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 16:30 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 16:30 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:30 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 16:30 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 16:29 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 16:29 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 16:28 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 16:28 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:27 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 16:27 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:27 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:27 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 16:27 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 16:26 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:26 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 16:26 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 16:26 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 16:25 lucaswerkmeister-wmde@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 16:24 lucaswerkmeister-wmde@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 16:24 lucaswerkmeister-wmde@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:24 lucaswerkmeister-wmde@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 16:24 lucaswerkmeister-wmde@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:23 lucaswerkmeister-wmde@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:23 lucaswerkmeister-wmde@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 16:23 lucaswerkmeister-wmde@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 16:22 lucaswerkmeister-wmde@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:22 lucaswerkmeister-wmde@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 16:22 lucaswerkmeister-wmde@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 16:22 lucaswerkmeister-wmde@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 16:06 elukey: reloaded postgres config on puppetdb1003 to pick up new wal size settings
  • 15:59 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2026.codfw.wmnet
  • 15:59 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2026.codfw.wmnet
  • 15:58 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2027.codfw.wmnet
  • 15:58 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2027.codfw.wmnet
  • 15:58 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2026.codfw.wmnet with OS bookworm
  • 15:58 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 15:57 lucaswerkmeister-wmde@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 15:57 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 15:57 lucaswerkmeister-wmde@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 15:55 lucaswerkmeister-wmde@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 15:55 lucaswerkmeister-wmde@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 15:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2027.codfw.wmnet with OS bookworm
  • 15:51 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1264-1266].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 15:51 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1266.eqiad.wmnet with OS bookworm
  • 15:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2026.codfw.wmnet with reason: host reimage
  • 15:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2027.codfw.wmnet with reason: host reimage
  • 15:31 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1266.eqiad.wmnet with reason: host reimage
  • 15:31 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1253-1255].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 15:31 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1255.eqiad.wmnet with OS bookworm
  • 15:29 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2027.codfw.wmnet with reason: host reimage
  • 15:29 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2026.codfw.wmnet with reason: host reimage
  • 15:26 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1266.eqiad.wmnet with reason: host reimage
  • 15:24 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: sync
  • 15:24 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: sync
  • 15:13 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:13 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Enable AutoModerator on zhwiki (T367306), cswikivoyage: Change the wordmark v2 (T382779) (duration: 30m 38s)
  • 15:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2026
  • 15:11 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2026
  • 15:11 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2027
  • 15:11 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2027
  • 15:11 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1255.eqiad.wmnet with reason: host reimage
  • 15:11 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2026.codfw.wmnet with OS bookworm
  • 15:11 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2027.codfw.wmnet with OS bookworm
  • 15:09 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2026-2027].codfw.wmnet
  • 15:08 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2026-2027].codfw.wmnet
  • 15:08 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1255.eqiad.wmnet with reason: host reimage
  • 15:07 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2028.codfw.wmnet
  • 15:07 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2028.codfw.wmnet
  • 15:07 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2029.codfw.wmnet
  • 15:07 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2029.codfw.wmnet
  • 15:05 lucaswerkmeister-wmde@deploy2002: zhaofjx, lucaswerkmeister-wmde: Continuing with sync
  • 14:58 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2029.codfw.wmnet with OS bookworm
  • 14:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2028.codfw.wmnet with OS bookworm
  • 14:51 moritzm: installing intel-microcode security updates
  • 14:49 lucaswerkmeister-wmde@deploy2002: zhaofjx, lucaswerkmeister-wmde: Backport for Enable AutoModerator on zhwiki (T367306), cswikivoyage: Change the wordmark v2 (T382779) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:42 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Enable AutoModerator on zhwiki (T367306), cswikivoyage: Change the wordmark v2 (T382779)
  • 14:37 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2029.codfw.wmnet with reason: host reimage
  • 14:36 ladsgroup@deploy2002: Finished scap sync-world: Backport for Reactivate Parsoid+Kartographer on hewiki and commonswiki (T373454 T373460) (duration: 24m 56s)
  • 14:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2028.codfw.wmnet with reason: host reimage
  • 14:34 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 14:33 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 14:31 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2029.codfw.wmnet with reason: host reimage
  • 14:31 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2028.codfw.wmnet with reason: host reimage
  • 14:26 ladsgroup@deploy2002: ladsgroup, ihurbain: Continuing with sync
  • 14:20 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1266.eqiad.wmnet with OS bookworm
  • 14:20 ladsgroup@deploy2002: ladsgroup, ihurbain: Backport for Reactivate Parsoid+Kartographer on hewiki and commonswiki (T373454 T373460) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:18 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1265.eqiad.wmnet with OS bookworm
  • 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1001.eqiad.wmnet
  • 14:16 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1255.eqiad.wmnet with OS bookworm
  • 14:14 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1254.eqiad.wmnet with OS bookworm
  • 14:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2029
  • 14:13 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2029
  • 14:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2028
  • 14:13 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2028
  • 14:13 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2029.codfw.wmnet with OS bookworm
  • 14:13 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2028.codfw.wmnet with OS bookworm
  • 14:12 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2028-2029].codfw.wmnet
  • 14:11 ladsgroup@deploy2002: Started scap sync-world: Backport for Reactivate Parsoid+Kartographer on hewiki and commonswiki (T373454 T373460)
  • 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1001.eqiad.wmnet
  • 14:10 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2028-2029].codfw.wmnet
  • 14:08 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2031.codfw.wmnet
  • 14:08 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2031.codfw.wmnet
  • 14:08 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2030.codfw.wmnet
  • 14:08 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2030.codfw.wmnet
  • 14:02 elukey: re-enable puppet fleetwide after maintenance
  • 13:59 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on wikikube-worker1265.eqiad.wmnet with reason: host reimage
  • 13:59 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1265.eqiad.wmnet with reason: host reimage
  • 13:57 elukey: start postgres and puppetdb on puppetdb2003
  • 13:54 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on wikikube-worker1254.eqiad.wmnet with reason: host reimage
  • 13:53 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1254.eqiad.wmnet with reason: host reimage
  • 13:52 elukey@cumin1002: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
  • 13:41 elukey@cumin1002: START - Cookbook sre.postgresql.postgres-init
  • 13:39 elukey: stop puppetdb and postgres on puppetdb2003 - T383114
  • 13:39 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1265.eqiad.wmnet with OS bookworm
  • 13:35 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on puppetdb2003.codfw.wmnet with reason: Resync postgres
  • 13:35 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on puppetdb2003.codfw.wmnet with reason: Resync postgres
  • 13:35 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1264.eqiad.wmnet with OS bookworm
  • 13:33 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1254.eqiad.wmnet with OS bookworm
  • 13:32 elukey: disable puppet fleetwide to allow maintenance on puppetdb2003
  • 13:31 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1253.eqiad.wmnet with OS bookworm
  • 13:30 moritzm: install gst-plugins-base1.0 security updates
  • 13:30 Amir1: running mwscript purgeParserCache.php --wiki=aawiki --tag pc5 --age=2592000 --msleep 200 in *eqiad* (T382948)
  • 13:15 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1264.eqiad.wmnet with reason: host reimage
  • 13:12 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1253.eqiad.wmnet with reason: host reimage
  • 13:09 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1264.eqiad.wmnet with reason: host reimage
  • 13:08 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1253.eqiad.wmnet with reason: host reimage
  • 12:49 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1264.eqiad.wmnet with OS bookworm
  • 12:48 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1253.eqiad.wmnet with OS bookworm
  • 12:47 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1264-1266].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 12:47 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2030.codfw.wmnet with OS bookworm
  • 12:46 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1253-1255].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 12:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2031.codfw.wmnet with OS bookworm
  • 12:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2030.codfw.wmnet with reason: host reimage
  • 12:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2031.codfw.wmnet with reason: host reimage
  • 12:20 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2030.codfw.wmnet with reason: host reimage
  • 12:19 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2031.codfw.wmnet with reason: host reimage
  • 12:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbproxy1020.eqiad.wmnet
  • 12:03 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:03 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbproxy1020.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 12:02 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbproxy1020.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 12:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2030
  • 12:02 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2030
  • 12:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2031
  • 12:02 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2031
  • 12:01 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2030.codfw.wmnet with OS bookworm
  • 12:01 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2031.codfw.wmnet with OS bookworm
  • 12:00 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2030-2031].codfw.wmnet
  • 11:59 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2030-2031].codfw.wmnet
  • 11:58 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 11:58 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2032.codfw.wmnet
  • 11:58 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2032.codfw.wmnet
  • 11:57 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2033.codfw.wmnet
  • 11:57 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2033.codfw.wmnet
  • 11:56 jelto: sudo homer 'cr*codfw*' commit 'T377877'
  • 11:55 jelto: sudo homer 'lsw1-c6-codfw*' commit 'T377877'
  • 11:52 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbproxy1020.eqiad.wmnet
  • 11:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2032.codfw.wmnet with OS bookworm
  • 11:46 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 11:46 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 11:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2033.codfw.wmnet with OS bookworm
  • 11:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2032.codfw.wmnet with reason: host reimage
  • 11:26 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2032.codfw.wmnet with reason: host reimage
  • 11:25 moritzm: refreshed Ganeti internal cert for ulsfo (after adding a manual temp cert to unblock itself) T382873
  • 11:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2033.codfw.wmnet with reason: host reimage
  • 11:18 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2033.codfw.wmnet with reason: host reimage
  • 11:13 hashar@deploy2002: Finished deploy [integration/docroot@5d32766]: Remove Flow from being listed on doc.wikimedia.org frontpage - T379671 (duration: 00m 11s)
  • 11:13 hashar@deploy2002: Started deploy [integration/docroot@5d32766]: Remove Flow from being listed on doc.wikimedia.org frontpage - T379671
  • 11:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2032
  • 11:08 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2032
  • 11:08 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2032
  • 11:08 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2032.codfw.wmnet 215.32.192.10.in-addr.arpa 5.1.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:08 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2032.codfw.wmnet 215.32.192.10.in-addr.arpa 5.1.2.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:08 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:07 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2032 - jelto@cumin1002"
  • 11:07 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2032 - jelto@cumin1002"
  • 11:03 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:02 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2032
  • 11:01 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2032.codfw.wmnet with OS bookworm
  • 11:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2033
  • 11:01 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2033
  • 11:01 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2033.codfw.wmnet with OS bookworm
  • 10:56 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2032-2033].codfw.wmnet
  • 10:53 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2032-2033].codfw.wmnet
  • 10:52 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2034.codfw.wmnet
  • 10:52 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2034.codfw.wmnet
  • 10:52 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2035.codfw.wmnet
  • 10:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2035.codfw.wmnet
  • 10:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ganeti[4005-4008].ulsfo.wmnet with reason: renew certs
  • 10:50 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2035.codfw.wmnet with OS bookworm
  • 10:50 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on ganeti[4005-4008].ulsfo.wmnet with reason: renew certs
  • 10:50 cmooney@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 269180
  • 10:49 cmooney@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 269180
  • 10:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2034.codfw.wmnet with OS bookworm
  • 10:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2035.codfw.wmnet with reason: host reimage
  • 10:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2034.codfw.wmnet with reason: host reimage
  • 10:25 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2035.codfw.wmnet with reason: host reimage
  • 10:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2034.codfw.wmnet with reason: host reimage
  • 10:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2035
  • 10:05 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2035
  • 10:05 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2034
  • 10:05 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2034
  • 10:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2034.codfw.wmnet with OS bookworm
  • 10:05 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2035.codfw.wmnet with OS bookworm
  • 10:05 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
  • 10:03 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2034-2035].codfw.wmnet
  • 10:02 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2034-2035].codfw.wmnet
  • 10:01 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2036.codfw.wmnet
  • 10:01 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2036.codfw.wmnet
  • 10:01 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2037.codfw.wmnet
  • 10:01 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2037.codfw.wmnet
  • 10:01 cmooney@cumin1002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
  • 09:51 jelto: sudo homer 'cr*codfw*' commit 'T377877'
  • 09:50 jelto: sudo homer 'lsw1-c1-codfw*' commit 'T377877'
  • 09:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2037.codfw.wmnet with OS bookworm
  • 09:40 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2036.codfw.wmnet with OS bookworm
  • 09:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2037.codfw.wmnet with reason: host reimage
  • 09:22 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2037.codfw.wmnet with reason: host reimage
  • 09:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2036.codfw.wmnet with reason: host reimage
  • 09:17 moritzm: installing systemd bugfix updates from Bookworm point release
  • 09:16 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2036.codfw.wmnet with reason: host reimage
  • 09:08 moritzm: installing Java 21.0.5 security updates
  • 09:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2037
  • 09:02 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2037
  • 09:02 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2037
  • 09:02 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2037.codfw.wmnet 146.32.192.10.in-addr.arpa 6.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:02 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2037.codfw.wmnet 146.32.192.10.in-addr.arpa 6.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:02 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:02 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2037 - jelto@cumin1002"
  • 09:02 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2037 - jelto@cumin1002"
  • 08:59 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 08:58 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2037
  • 08:57 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2037.codfw.wmnet with OS bookworm
  • 08:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2036
  • 08:57 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2036
  • 08:56 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2036.codfw.wmnet with OS bookworm
  • 08:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2036-2037].codfw.wmnet
  • 08:53 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2036-2037].codfw.wmnet
  • 08:48 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2038.codfw.wmnet
  • 08:48 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2038.codfw.wmnet
  • 08:48 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2039.codfw.wmnet
  • 08:48 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2039.codfw.wmnet
  • 08:47 jelto: sudo homer 'lsw1-c5-codfw*' commit 'T377877'
  • 08:44 jelto: sudo homer 'cr*codfw*' commit 'T377877'
  • 08:44 jelto: sudo homer 'lsw1-c1-codfw*' commit 'T377877'
  • 08:43 jelto: sudo homer 'lsw1-c7-codfw*' commit 'T377877'
  • 08:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es1020.eqiad.wmnet with reason: cloning es1041
  • 08:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es1020.eqiad.wmnet with reason: cloning es1041
  • 08:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1020 T382569', diff saved to https://phabricator.wikimedia.org/P71822 and previous config saved to /var/cache/conftool/dbconfig/20250107-083930-marostegui.json
  • 08:38 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2038.codfw.wmnet with OS bookworm
  • 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'Switchover es5 eqiad master dbmaint T382569', diff saved to https://phabricator.wikimedia.org/P71821 and previous config saved to /var/cache/conftool/dbconfig/20250107-083811-marostegui.json
  • 08:26 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2039.codfw.wmnet with OS bookworm
  • 08:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es2025.codfw.wmnet
  • 08:20 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:20 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2025.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 08:19 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2025.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 08:17 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2038.codfw.wmnet with reason: host reimage
  • 08:14 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 08:13 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2038.codfw.wmnet with reason: host reimage
  • 08:09 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts es2025.codfw.wmnet
  • 08:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2039.codfw.wmnet with reason: host reimage
  • 08:03 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2039.codfw.wmnet with reason: host reimage
  • 07:53 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2038
  • 07:53 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2038
  • 07:53 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2038
  • 07:53 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2038.codfw.wmnet 147.32.192.10.in-addr.arpa 7.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 07:53 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2038.codfw.wmnet 147.32.192.10.in-addr.arpa 7.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 07:53 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:53 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2038 - jelto@cumin1002"
  • 07:53 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2038 - jelto@cumin1002"
  • 07:49 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 07:49 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2038
  • 07:49 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2038.codfw.wmnet with OS bookworm
  • 07:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2039
  • 07:43 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2039
  • 07:43 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2039
  • 07:43 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2039.codfw.wmnet 150.32.192.10.in-addr.arpa 0.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 07:43 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2039.codfw.wmnet 150.32.192.10.in-addr.arpa 0.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 07:43 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:42 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2039 - jelto@cumin1002"
  • 07:41 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2039 - jelto@cumin1002"
  • 07:37 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 07:37 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2039
  • 07:37 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2039.codfw.wmnet with OS bookworm
  • 07:33 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2038-2039].codfw.wmnet
  • 07:32 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2038-2039].codfw.wmnet
  • 06:50 marostegui@cumin1002: dbctl commit (dc=all): 'Remove es2025 from dbctl T381848', diff saved to https://phabricator.wikimedia.org/P71820 and previous config saved to /var/cache/conftool/dbconfig/20250107-064958-marostegui.json
  • 05:40 marostegui: Switchover m5 eqiad proxy dbmaint eqiad T368874
  • 04:51 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.44.0-wmf.11 refs T382362 (duration: 48m 35s)
  • 04:02 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.44.0-wmf.11 refs T382362
  • 00:24 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1257-1263].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 00:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1263.eqiad.wmnet with OS bookworm
  • 00:05 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1263.eqiad.wmnet with reason: host reimage
  • 00:01 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1263.eqiad.wmnet with reason: host reimage

2025-01-06

  • 23:41 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1263.eqiad.wmnet with OS bookworm
  • 23:39 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1262.eqiad.wmnet with OS bookworm
  • 23:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1262.eqiad.wmnet with reason: host reimage
  • 23:16 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1262.eqiad.wmnet with reason: host reimage
  • 22:56 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1262.eqiad.wmnet with OS bookworm
  • 22:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1261.eqiad.wmnet with OS bookworm
  • 22:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1261.eqiad.wmnet with reason: host reimage
  • 22:33 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1261.eqiad.wmnet with reason: host reimage
  • 22:13 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1261.eqiad.wmnet with OS bookworm
  • 22:10 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1260.eqiad.wmnet with OS bookworm
  • 21:49 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1260.eqiad.wmnet with reason: host reimage
  • 21:45 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1260.eqiad.wmnet with reason: host reimage
  • 21:39 cjming: end of UTC late backport window
  • 21:31 cjming@deploy2002: Finished scap sync-world: Backport for Move logic for type infering to server (T382042) (duration: 19m 31s)
  • 21:25 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1260.eqiad.wmnet with OS bookworm
  • 21:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1259.eqiad.wmnet with OS bookworm
  • 21:22 cjming@deploy2002: cjming, jdlrobson: Continuing with sync
  • 21:16 cjming@deploy2002: cjming, jdlrobson: Backport for Move logic for type infering to server (T382042) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:12 cjming@deploy2002: Started scap sync-world: Backport for Move logic for type infering to server (T382042)
  • 21:06 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1259.eqiad.wmnet with reason: host reimage
  • 21:02 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1259.eqiad.wmnet with reason: host reimage
  • 20:49 eileen: tools upgraded from c7b53ecd to 6ddfb22f
  • 20:41 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1259.eqiad.wmnet with OS bookworm
  • 20:39 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1258.eqiad.wmnet with OS bookworm
  • 20:24 brett: running authdns-update for CR 1108142
  • 20:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1258.eqiad.wmnet with reason: host reimage
  • 20:15 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1258.eqiad.wmnet with reason: host reimage
  • 19:55 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1258.eqiad.wmnet with OS bookworm
  • 19:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1257.eqiad.wmnet with OS bookworm
  • 19:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1257.eqiad.wmnet with reason: host reimage
  • 19:32 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1257.eqiad.wmnet with reason: host reimage
  • 19:11 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1257.eqiad.wmnet with OS bookworm
  • 19:09 kamila@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1257-1263].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 18:47 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1244.eqiad.wmnet
  • 18:47 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1244.eqiad.wmnet
  • 18:32 ChrisDobbins901_: cdobbins@dns1004 running authdns-update for CR 1097521
  • 18:31 ChrisDobbins901_: cdobbins@cumin1002 running authdns-update for CR 1097521
  • 17:59 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1244.eqiad.wmnet with OS bookworm
  • 17:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T370903)', diff saved to https://phabricator.wikimedia.org/P71818 and previous config saved to /var/cache/conftool/dbconfig/20250106-174024-ladsgroup.json
  • 17:39 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1244.eqiad.wmnet with reason: host reimage
  • 17:37 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1244.eqiad.wmnet with reason: host reimage
  • 17:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P71817 and previous config saved to /var/cache/conftool/dbconfig/20250106-172517-ladsgroup.json
  • 17:16 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1244.eqiad.wmnet with OS bookworm
  • 17:15 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1250-1252].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 17:15 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1252.eqiad.wmnet with OS bookworm
  • 17:15 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P{wikikube-worker[1240-1244].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 17:15 jayme@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1244.eqiad.wmnet with OS bookworm
  • 17:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P71816 and previous config saved to /var/cache/conftool/dbconfig/20250106-171010-ladsgroup.json
  • 17:00 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 02m 43s)
  • 16:58 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 12m 29s)
  • 16:56 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1252.eqiad.wmnet with reason: host reimage
  • 16:55 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 16:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T370903)', diff saved to https://phabricator.wikimedia.org/P71815 and previous config saved to /var/cache/conftool/dbconfig/20250106-165503-ladsgroup.json
  • 16:54 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 16:54 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 16:54 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 16:54 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 16:53 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 16:52 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1252.eqiad.wmnet with reason: host reimage
  • 16:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1193 (T370903)', diff saved to https://phabricator.wikimedia.org/P71813 and previous config saved to /var/cache/conftool/dbconfig/20250106-164215-ladsgroup.json
  • 16:42 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 16:42 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 16:39 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:39 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 16:38 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:37 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 15:51 moritzm: uploaded openjdk-21 21.0.5+11-1~deb12u1 to apt.wikimedia.org component/jdk21
  • 15:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2143.codfw.wmnet with reason: onsite maintenance
  • 15:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2143.codfw.wmnet with reason: onsite maintenance
  • 15:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2041.codfw.wmnet
  • 15:31 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2041.codfw.wmnet
  • 15:31 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2040.codfw.wmnet
  • 15:31 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2040.codfw.wmnet
  • 15:29 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2040.codfw.wmnet with OS bookworm
  • 15:27 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1252.eqiad.wmnet with OS bookworm
  • 15:25 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1251.eqiad.wmnet with OS bookworm
  • 15:23 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1244.eqiad.wmnet with OS bookworm
  • 15:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2041.codfw.wmnet with OS bookworm
  • 15:13 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 15:12 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:11 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Remove EntitySchema DataType feature flag - is always enabled (T333667) (duration: 13m 25s)
  • 15:09 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2040.codfw.wmnet with reason: host reimage
  • 15:06 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1251.eqiad.wmnet with reason: host reimage
  • 15:04 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, arthurtaylor: Continuing with sync
  • 15:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2041.codfw.wmnet with reason: host reimage
  • 15:02 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, arthurtaylor: Backport for Remove EntitySchema DataType feature flag - is always enabled (T333667) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:00 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1251.eqiad.wmnet with reason: host reimage
  • 15:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T371742)', diff saved to https://phabricator.wikimedia.org/P71811 and previous config saved to /var/cache/conftool/dbconfig/20250106-150040-ladsgroup.json
  • 14:58 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2040.codfw.wmnet with reason: host reimage
  • 14:58 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2041.codfw.wmnet with reason: host reimage
  • 14:58 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Remove EntitySchema DataType feature flag - is always enabled (T333667)
  • 14:56 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Add bfw, gju-arab, gju-deva, hoc and kgg to wmgExtraLanguageNames (T381934) (duration: 17m 11s)
  • 14:49 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, jhsoby: Continuing with sync
  • 14:46 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, jhsoby: Backport for Add bfw, gju-arab, gju-deva, hoc and kgg to wmgExtraLanguageNames (T381934) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P71810 and previous config saved to /var/cache/conftool/dbconfig/20250106-144534-ladsgroup.json
  • 14:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2040
  • 14:40 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2040
  • 14:40 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2041
  • 14:40 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2041
  • 14:40 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2040.codfw.wmnet with OS bookworm
  • 14:40 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2041.codfw.wmnet with OS bookworm
  • 14:40 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1251.eqiad.wmnet with OS bookworm
  • 14:39 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Add bfw, gju-arab, gju-deva, hoc and kgg to wmgExtraLanguageNames (T381934)
  • 14:39 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2040-2041].codfw.wmnet
  • 14:38 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1250.eqiad.wmnet with OS bookworm
  • 14:38 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for ExtensionDistributor: Mark 1.43 as stable; remove 1.41 as EOL (T372331 T376550) (duration: 14m 14s)
  • 14:35 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2040-2041].codfw.wmnet
  • 14:34 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2042.codfw.wmnet
  • 14:34 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2042.codfw.wmnet
  • 14:34 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2043.codfw.wmnet
  • 14:34 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2043.codfw.wmnet
  • 14:33 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2042.codfw.wmnet with OS bookworm
  • 14:33 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1243.eqiad.wmnet with OS bookworm
  • 14:30 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2043.codfw.wmnet with OS bookworm
  • 14:30 lucaswerkmeister-wmde@deploy2002: macfan4000, lucaswerkmeister-wmde: Continuing with sync
  • 14:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P71809 and previous config saved to /var/cache/conftool/dbconfig/20250106-143027-ladsgroup.json
  • 14:28 moritzm: installing libvirt bugfix updates
  • 14:27 lucaswerkmeister-wmde@deploy2002: macfan4000, lucaswerkmeister-wmde: Backport for ExtensionDistributor: Mark 1.43 as stable; remove 1.41 as EOL (T372331 T376550) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:23 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for ExtensionDistributor: Mark 1.43 as stable; remove 1.41 as EOL (T372331 T376550)
  • 14:22 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Add mergehistory to import and transwiki on en.wikibooks (T382785), Add suppressredirect and delete-redirect to en.wikinews reviewers (T382887) (duration: 15m 02s)
  • 14:18 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1250.eqiad.wmnet with reason: host reimage
  • 14:15 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, dreamrimmer: Continuing with sync
  • 14:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T371742)', diff saved to https://phabricator.wikimedia.org/P71808 and previous config saved to /var/cache/conftool/dbconfig/20250106-141520-ladsgroup.json
  • 14:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2042.codfw.wmnet with reason: host reimage
  • 14:12 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, dreamrimmer: Backport for Add mergehistory to import and transwiki on en.wikibooks (T382785), Add suppressredirect and delete-redirect to en.wikinews reviewers (T382887) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:12 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1250.eqiad.wmnet with reason: host reimage
  • 14:10 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2043.codfw.wmnet with reason: host reimage
  • 14:10 marostegui: Deploy schema change on x1 dbmaint codfw T383052
  • 14:08 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2042.codfw.wmnet with reason: host reimage
  • 14:07 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2043.codfw.wmnet with reason: host reimage
  • 14:07 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Add mergehistory to import and transwiki on en.wikibooks (T382785), Add suppressredirect and delete-redirect to en.wikinews reviewers (T382887)
  • 14:04 marostegui: Deploy schema change on x1 dbmaint eqiad T383052
  • 13:51 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1250.eqiad.wmnet with OS bookworm
  • 13:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2042
  • 13:49 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2042
  • 13:49 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2043
  • 13:49 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2043
  • 13:49 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2042.codfw.wmnet with OS bookworm
  • 13:49 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2043.codfw.wmnet with OS bookworm
  • 13:47 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1250-1252].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 13:47 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2042-2043].codfw.wmnet
  • 13:46 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2042-2043].codfw.wmnet
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2044.codfw.wmnet
  • 13:40 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2044.codfw.wmnet
  • 13:40 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2045.codfw.wmnet
  • 13:40 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2045.codfw.wmnet
  • 13:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2045.codfw.wmnet with OS bookworm
  • 13:34 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2044.codfw.wmnet with OS bookworm
  • 13:18 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2045.codfw.wmnet with reason: host reimage
  • 13:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2044.codfw.wmnet with reason: host reimage
  • 13:13 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1243.eqiad.wmnet with OS bookworm
  • 13:12 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2045.codfw.wmnet with reason: host reimage
  • 13:11 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2044.codfw.wmnet with reason: host reimage
  • 12:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2044
  • 12:55 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2044
  • 12:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2045
  • 12:54 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2045
  • 12:54 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2044.codfw.wmnet with OS bookworm
  • 12:54 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2045.codfw.wmnet with OS bookworm
  • 12:54 ladsgroup@deploy2002: Finished scap sync-world: Backport for ParserCache: Set connect and receive timeouts (T378076 T373037) (duration: 13m 39s)
  • 12:53 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2044-2045].codfw.wmnet
  • 12:52 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2044-2045].codfw.wmnet
  • 12:51 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2046.codfw.wmnet
  • 12:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2046.codfw.wmnet
  • 12:51 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2048.codfw.wmnet
  • 12:51 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2048.codfw.wmnet
  • 12:48 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 12:46 ladsgroup@deploy2002: ladsgroup: Backport for ParserCache: Set connect and receive timeouts (T378076 T373037) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:40 ladsgroup@deploy2002: Started scap sync-world: Backport for ParserCache: Set connect and receive timeouts (T378076 T373037)
  • 12:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1217.eqiad.wmnet with reason: upgrade kernel
  • 12:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on db1217.eqiad.wmnet with reason: upgrade kernel
  • 12:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1193 (T371742)', diff saved to https://phabricator.wikimedia.org/P71807 and previous config saved to /var/cache/conftool/dbconfig/20250106-123416-ladsgroup.json
  • 12:34 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 12:34 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 12:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T371742)', diff saved to https://phabricator.wikimedia.org/P71806 and previous config saved to /var/cache/conftool/dbconfig/20250106-123405-ladsgroup.json
  • 12:30 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1245-1249].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 12:30 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1249.eqiad.wmnet with OS bookworm
  • 12:27 Emperor: swift post wikipedia-commons-local-thumb.f8 --read-acl 'mw:thumbor,mw:media,.r:*' --write-acl 'mw:thumbor,mw:media' ms-fe2009 per T383034
  • 12:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P71805 and previous config saved to /var/cache/conftool/dbconfig/20250106-121858-ladsgroup.json
  • 12:14 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1243.eqiad.wmnet with OS bookworm
  • 12:11 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1249.eqiad.wmnet with reason: host reimage
  • 12:07 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1249.eqiad.wmnet with reason: host reimage
  • 12:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es2023.codfw.wmnet
  • 12:04 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:04 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2023.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 12:04 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2023.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 12:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P71804 and previous config saved to /var/cache/conftool/dbconfig/20250106-120351-ladsgroup.json
  • 12:01 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 11:56 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts es2023.codfw.wmnet
  • 11:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2046.codfw.wmnet with OS bookworm
  • 11:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2048.codfw.wmnet with OS bookworm
  • 11:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T371742)', diff saved to https://phabricator.wikimedia.org/P71803 and previous config saved to /var/cache/conftool/dbconfig/20250106-114844-ladsgroup.json
  • 11:47 moritzm: fix /etc/network/interfaces on doc2002 T382610
  • 11:47 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1249.eqiad.wmnet with OS bookworm
  • 11:45 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1248.eqiad.wmnet with OS bookworm
  • 11:35 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2046.codfw.wmnet with reason: host reimage
  • 11:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2048.codfw.wmnet with reason: host reimage
  • 11:28 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2046.codfw.wmnet with reason: host reimage
  • 11:26 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2048.codfw.wmnet with reason: host reimage
  • 11:24 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1248.eqiad.wmnet with reason: host reimage
  • 11:20 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1248.eqiad.wmnet with reason: host reimage
  • 11:16 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:16 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2048
  • 11:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2046
  • 11:08 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2046
  • 11:08 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2048
  • 11:08 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2046.codfw.wmnet with OS bookworm
  • 11:08 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2048.codfw.wmnet with OS bookworm
  • 11:07 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2046,2048].codfw.wmnet
  • 11:06 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2046,2048].codfw.wmnet
  • 11:00 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1248.eqiad.wmnet with OS bookworm
  • 10:58 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1247.eqiad.wmnet with OS bookworm
  • 10:52 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1243.eqiad.wmnet with OS bookworm
  • 10:51 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1242.eqiad.wmnet with OS bookworm
  • 10:49 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2050.codfw.wmnet
  • 10:49 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2050.codfw.wmnet
  • 10:49 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2049.codfw.wmnet
  • 10:49 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2049.codfw.wmnet
  • 10:39 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1247.eqiad.wmnet with reason: host reimage
  • 10:39 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2049.codfw.wmnet with OS bookworm
  • 10:36 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2050.codfw.wmnet with OS bookworm
  • 10:35 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1247.eqiad.wmnet with reason: host reimage
  • 10:32 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1242.eqiad.wmnet with reason: host reimage
  • 10:28 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1242.eqiad.wmnet with reason: host reimage
  • 10:19 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2049.codfw.wmnet with reason: host reimage
  • 10:16 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2050.codfw.wmnet with reason: host reimage
  • 10:15 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1247.eqiad.wmnet with OS bookworm
  • 10:15 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2049.codfw.wmnet with reason: host reimage
  • 10:13 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1246.eqiad.wmnet with OS bookworm
  • 10:13 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 10:13 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 10:13 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 10:13 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2050.codfw.wmnet with reason: host reimage
  • 10:12 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 10:12 cgoubert@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 10:11 cgoubert@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 10:11 cgoubert@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:10 cgoubert@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 10:10 cgoubert@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:09 cgoubert@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 10:09 cgoubert@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:09 cgoubert@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 10:08 cgoubert@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:08 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1242.eqiad.wmnet with OS bookworm
  • 10:08 cgoubert@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 10:07 cgoubert@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:07 cgoubert@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 10:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T371742)', diff saved to https://phabricator.wikimedia.org/P71801 and previous config saved to /var/cache/conftool/dbconfig/20250106-100706-ladsgroup.json
  • 10:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 10:06 claime: Deploying admin_ng external services changes on all kubernetes clusters
  • 10:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 10:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 10:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 10:06 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1241.eqiad.wmnet with OS bookworm
  • 09:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2050
  • 09:55 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2050
  • 09:55 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2049
  • 09:55 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2049
  • 09:55 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2050.codfw.wmnet with OS bookworm
  • 09:55 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2049.codfw.wmnet with OS bookworm
  • 09:54 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2049-2050].codfw.wmnet
  • 09:54 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1246.eqiad.wmnet with reason: host reimage
  • 09:53 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2049-2050].codfw.wmnet
  • 09:52 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2052.codfw.wmnet
  • 09:52 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2052.codfw.wmnet
  • 09:52 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2051.codfw.wmnet
  • 09:52 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2051.codfw.wmnet
  • 09:51 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2052.codfw.wmnet with OS bookworm
  • 09:50 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1246.eqiad.wmnet with reason: host reimage
  • 09:47 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1241.eqiad.wmnet with reason: host reimage
  • 09:46 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2051.codfw.wmnet with OS bookworm
  • 09:43 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1241.eqiad.wmnet with reason: host reimage
  • 09:41 dcausse: repooling wdqs1012
  • 09:31 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2052.codfw.wmnet with reason: host reimage
  • 09:29 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1246.eqiad.wmnet with OS bookworm
  • 09:27 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2051.codfw.wmnet with reason: host reimage
  • 09:26 dcausse: depooling wdqs1012 (high lag, forgot to keep it depooled after restarting blazegraph)
  • 09:26 marostegui: Reboot db2160 for kernel upgrade T376905
  • 09:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2160.codfw.wmnet with reason: upgrade kernel
  • 09:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on db2160.codfw.wmnet with reason: upgrade kernel
  • 09:25 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1245.eqiad.wmnet with OS bookworm
  • 09:24 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2052.codfw.wmnet with reason: host reimage
  • 09:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2051.codfw.wmnet with reason: host reimage
  • 09:23 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1241.eqiad.wmnet with OS bookworm
  • 09:21 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1240.eqiad.wmnet with OS bookworm
  • 09:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2052
  • 09:06 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2052
  • 09:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2051
  • 09:06 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2051
  • 09:06 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2052.codfw.wmnet with OS bookworm
  • 09:06 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2051.codfw.wmnet with OS bookworm
  • 09:06 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1245.eqiad.wmnet with reason: host reimage
  • 09:05 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2051-2052].codfw.wmnet
  • 09:04 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2051-2052].codfw.wmnet
  • 09:03 awight: UTC morning deployment finished
  • 09:02 awight@deploy2002: Finished scap sync-world: Backport for bjnwikiquote: add wordmark (T382777) (duration: 11m 59s)
  • 09:02 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1240.eqiad.wmnet with reason: host reimage
  • 09:00 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1245.eqiad.wmnet with reason: host reimage
  • 08:58 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1240.eqiad.wmnet with reason: host reimage
  • 08:57 awight@deploy2002: awight, anzx: Continuing with sync
  • 08:55 awight@deploy2002: awight, anzx: Backport for bjnwikiquote: add wordmark (T382777) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:50 awight@deploy2002: Started scap sync-world: Backport for bjnwikiquote: add wordmark (T382777)
  • 08:46 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2053.codfw.wmnet
  • 08:46 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2053.codfw.wmnet
  • 08:46 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2054.codfw.wmnet
  • 08:46 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2054.codfw.wmnet
  • 08:46 awight@deploy2002: Finished scap sync-world: Backport for Change license on ptwikinews, nlwikinews and rowikinews to cc-by-4.0 (T382649) (duration: 09m 28s)
  • 08:44 jelto@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[2053-2054].codfw.wmnet
  • 08:44 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2053-2054].codfw.wmnet
  • 08:43 jelto@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[2053-2054].codfw.wmnet
  • 08:43 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2053-2054].codfw.wmnet
  • 08:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2053.codfw.wmnet with OS bookworm
  • 08:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2054.codfw.wmnet with OS bookworm
  • 08:41 awight@deploy2002: dreamrimmer, awight: Continuing with sync
  • 08:41 awight@deploy2002: dreamrimmer, awight: Backport for Change license on ptwikinews, nlwikinews and rowikinews to cc-by-4.0 (T382649) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:39 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1245.eqiad.wmnet with OS bookworm
  • 08:38 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1240.eqiad.wmnet with OS bookworm
  • 08:37 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1245-1249].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 08:37 jayme@cumin1002: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1240-1244].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
  • 08:36 awight@deploy2002: Started scap sync-world: Backport for Change license on ptwikinews, nlwikinews and rowikinews to cc-by-4.0 (T382649)
  • 08:29 awight@deploy2002: awight, dreamrimmer: Backport for Change license on ptwikinews, nlwikinews and rowikinews to cc-by-4.0 (T382649) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:25 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2053.codfw.wmnet with reason: host reimage
  • 08:24 awight@deploy2002: Started scap sync-world: Backport for Change license on ptwikinews, nlwikinews and rowikinews to cc-by-4.0 (T382649)
  • 08:22 awight@deploy2002: Finished scap sync-world: Backport for [arwiki] Add templateeditor user group (T382784) (duration: 14m 48s)
  • 08:22 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2054.codfw.wmnet with reason: host reimage
  • 08:18 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2053.codfw.wmnet with reason: host reimage
  • 08:18 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2054.codfw.wmnet with reason: host reimage
  • 08:18 dcausse: restarting blazegraph on wdqs1012 (stuck with high thread count)
  • 08:17 awight@deploy2002: awight, hubaishan: Continuing with sync
  • 08:15 dcausse: restarting blazegraph on wdqs1014 (BlazegraphFreeAllocatorsDecreasingRapidly)
  • 08:13 awight@deploy2002: awight, hubaishan: Backport for [arwiki] Add templateeditor user group (T382784) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:13 moritzm: installing fastnetmon security updates
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2024 and es2025 T381848', diff saved to https://phabricator.wikimedia.org/P71798 and previous config saved to /var/cache/conftool/dbconfig/20250106-080845-marostegui.json
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'Remove es2023 from dbctl and promote es2046 to es5 master T381848 T383026', diff saved to https://phabricator.wikimedia.org/P71797 and previous config saved to /var/cache/conftool/dbconfig/20250106-080755-marostegui.json
  • 08:07 awight@deploy2002: Started scap sync-world: Backport for [arwiki] Add templateeditor user group (T382784)
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2024 T381848', diff saved to https://phabricator.wikimedia.org/P71796 and previous config saved to /var/cache/conftool/dbconfig/20250106-080609-marostegui.json
  • 08:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2023 T383026', diff saved to https://phabricator.wikimedia.org/P71795 and previous config saved to /var/cache/conftool/dbconfig/20250106-080405-marostegui.json
  • 08:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2054
  • 08:00 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2054
  • 08:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2053
  • 08:00 jelto@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2053
  • 08:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2054.codfw.wmnet with OS bookworm
  • 08:00 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2053.codfw.wmnet with OS bookworm
  • 07:59 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2053-2054].codfw.wmnet
  • 07:55 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2053-2054].codfw.wmnet
  • 07:29 moritzm: installing systemd bugfix updates from Bookworm point release
  • 07:13 marostegui: dbmaint Switchover m3 (phabricator) eqiad master dbproxy1020 -> dbproxy1028 T368874
  • 06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es2022.codfw.wmnet
  • 06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2022.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 06:54 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera ge