Server Admin Log

From Wikitech

2024-05-19

  • 11:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P62657 and previous config saved to /var/cache/conftool/dbconfig/20240519-112730-ladsgroup.json
  • 11:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P62656 and previous config saved to /var/cache/conftool/dbconfig/20240519-111222-ladsgroup.json
  • 10:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P62655 and previous config saved to /var/cache/conftool/dbconfig/20240519-105714-ladsgroup.json
  • 10:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P62654 and previous config saved to /var/cache/conftool/dbconfig/20240519-104206-ladsgroup.json
  • 10:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T352010)', diff saved to https://phabricator.wikimedia.org/P62653 and previous config saved to /var/cache/conftool/dbconfig/20240519-102315-ladsgroup.json
  • 10:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 10:23 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 10:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 10:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 10:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T352010)', diff saved to https://phabricator.wikimedia.org/P62652 and previous config saved to /var/cache/conftool/dbconfig/20240519-102247-ladsgroup.json
  • 10:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P62651 and previous config saved to /var/cache/conftool/dbconfig/20240519-100739-ladsgroup.json
  • 09:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P62650 and previous config saved to /var/cache/conftool/dbconfig/20240519-095231-ladsgroup.json
  • 09:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T352010)', diff saved to https://phabricator.wikimedia.org/P62649 and previous config saved to /var/cache/conftool/dbconfig/20240519-093723-ladsgroup.json
  • 07:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P62648 and previous config saved to /var/cache/conftool/dbconfig/20240519-074556-ladsgroup.json
  • 07:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 07:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 07:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214 (T352010)', diff saved to https://phabricator.wikimedia.org/P62647 and previous config saved to /var/cache/conftool/dbconfig/20240519-074532-ladsgroup.json
  • 07:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214', diff saved to https://phabricator.wikimedia.org/P62646 and previous config saved to /var/cache/conftool/dbconfig/20240519-073025-ladsgroup.json
  • 07:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214', diff saved to https://phabricator.wikimedia.org/P62645 and previous config saved to /var/cache/conftool/dbconfig/20240519-071517-ladsgroup.json
  • 07:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214 (T352010)', diff saved to https://phabricator.wikimedia.org/P62644 and previous config saved to /var/cache/conftool/dbconfig/20240519-070008-ladsgroup.json
  • 05:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 05:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 05:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2214 (T352010)', diff saved to https://phabricator.wikimedia.org/P62643 and previous config saved to /var/cache/conftool/dbconfig/20240519-051029-ladsgroup.json
  • 05:10 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 05:10 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 01:43 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 01:43 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 01:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T352010)', diff saved to https://phabricator.wikimedia.org/P62642 and previous config saved to /var/cache/conftool/dbconfig/20240519-014335-ladsgroup.json
  • 01:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P62641 and previous config saved to /var/cache/conftool/dbconfig/20240519-012827-ladsgroup.json
  • 01:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P62640 and previous config saved to /var/cache/conftool/dbconfig/20240519-011320-ladsgroup.json
  • 00:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T352010)', diff saved to https://phabricator.wikimedia.org/P62639 and previous config saved to /var/cache/conftool/dbconfig/20240519-005811-ladsgroup.json

2024-05-18

  • 23:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T352010)', diff saved to https://phabricator.wikimedia.org/P62638 and previous config saved to /var/cache/conftool/dbconfig/20240518-230800-ladsgroup.json
  • 23:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 23:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 23:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62637 and previous config saved to /var/cache/conftool/dbconfig/20240518-230736-ladsgroup.json
  • 22:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P62636 and previous config saved to /var/cache/conftool/dbconfig/20240518-225228-ladsgroup.json
  • 22:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P62635 and previous config saved to /var/cache/conftool/dbconfig/20240518-223720-ladsgroup.json
  • 22:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T352010)', diff saved to https://phabricator.wikimedia.org/P62634 and previous config saved to /var/cache/conftool/dbconfig/20240518-222748-ladsgroup.json
  • 22:27 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 22:27 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 22:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T352010)', diff saved to https://phabricator.wikimedia.org/P62633 and previous config saved to /var/cache/conftool/dbconfig/20240518-222725-ladsgroup.json
  • 22:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62632 and previous config saved to /var/cache/conftool/dbconfig/20240518-222212-ladsgroup.json
  • 22:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P62631 and previous config saved to /var/cache/conftool/dbconfig/20240518-221216-ladsgroup.json
  • 21:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P62630 and previous config saved to /var/cache/conftool/dbconfig/20240518-215708-ladsgroup.json
  • 21:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T352010)', diff saved to https://phabricator.wikimedia.org/P62629 and previous config saved to /var/cache/conftool/dbconfig/20240518-214200-ladsgroup.json
  • 20:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62628 and previous config saved to /var/cache/conftool/dbconfig/20240518-200322-ladsgroup.json
  • 20:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 20:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 20:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T352010)', diff saved to https://phabricator.wikimedia.org/P62627 and previous config saved to /var/cache/conftool/dbconfig/20240518-200258-ladsgroup.json
  • 19:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P62626 and previous config saved to /var/cache/conftool/dbconfig/20240518-194750-ladsgroup.json
  • 19:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P62625 and previous config saved to /var/cache/conftool/dbconfig/20240518-193240-ladsgroup.json
  • 19:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T352010)', diff saved to https://phabricator.wikimedia.org/P62624 and previous config saved to /var/cache/conftool/dbconfig/20240518-191732-ladsgroup.json
  • 18:59 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw
  • 18:58 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw
  • 18:56 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2090.codfw.wmnet with OS bullseye
  • 18:36 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
  • 18:33 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
  • 18:16 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2090.codfw.wmnet with OS bullseye
  • 16:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 16:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 16:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T364299)', diff saved to https://phabricator.wikimedia.org/P62623 and previous config saved to /var/cache/conftool/dbconfig/20240518-162907-marostegui.json
  • 16:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P62622 and previous config saved to /var/cache/conftool/dbconfig/20240518-161400-marostegui.json
  • 15:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P62621 and previous config saved to /var/cache/conftool/dbconfig/20240518-155852-marostegui.json
  • 15:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2169 (T352010)', diff saved to https://phabricator.wikimedia.org/P62620 and previous config saved to /var/cache/conftool/dbconfig/20240518-155136-ladsgroup.json
  • 15:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 15:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 15:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62619 and previous config saved to /var/cache/conftool/dbconfig/20240518-155112-ladsgroup.json
  • 15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T364299)', diff saved to https://phabricator.wikimedia.org/P62618 and previous config saved to /var/cache/conftool/dbconfig/20240518-154343-marostegui.json
  • 15:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P62617 and previous config saved to /var/cache/conftool/dbconfig/20240518-153604-ladsgroup.json
  • 15:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P62616 and previous config saved to /var/cache/conftool/dbconfig/20240518-152056-ladsgroup.json
  • 15:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62615 and previous config saved to /var/cache/conftool/dbconfig/20240518-150548-ladsgroup.json
  • 11:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62614 and previous config saved to /var/cache/conftool/dbconfig/20240518-112824-ladsgroup.json
  • 11:28 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:28 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:28 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 11:27 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 11:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T352010)', diff saved to https://phabricator.wikimedia.org/P62613 and previous config saved to /var/cache/conftool/dbconfig/20240518-112745-ladsgroup.json
  • 11:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P62612 and previous config saved to /var/cache/conftool/dbconfig/20240518-111237-ladsgroup.json
  • 10:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P62611 and previous config saved to /var/cache/conftool/dbconfig/20240518-105729-ladsgroup.json
  • 10:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T352010)', diff saved to https://phabricator.wikimedia.org/P62610 and previous config saved to /var/cache/conftool/dbconfig/20240518-104222-ladsgroup.json
  • 07:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T352010)', diff saved to https://phabricator.wikimedia.org/P62609 and previous config saved to /var/cache/conftool/dbconfig/20240518-071726-ladsgroup.json
  • 07:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 07:17 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 07:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T352010)', diff saved to https://phabricator.wikimedia.org/P62608 and previous config saved to /var/cache/conftool/dbconfig/20240518-071703-ladsgroup.json
  • 07:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P62607 and previous config saved to /var/cache/conftool/dbconfig/20240518-070155-ladsgroup.json
  • 06:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P62606 and previous config saved to /var/cache/conftool/dbconfig/20240518-064646-ladsgroup.json
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T364299)', diff saved to https://phabricator.wikimedia.org/P62605 and previous config saved to /var/cache/conftool/dbconfig/20240518-063529-marostegui.json
  • 06:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 06:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T364299)', diff saved to https://phabricator.wikimedia.org/P62604 and previous config saved to /var/cache/conftool/dbconfig/20240518-063505-marostegui.json
  • 06:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T352010)', diff saved to https://phabricator.wikimedia.org/P62603 and previous config saved to /var/cache/conftool/dbconfig/20240518-063138-ladsgroup.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P62602 and previous config saved to /var/cache/conftool/dbconfig/20240518-061958-marostegui.json
  • 06:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P62601 and previous config saved to /var/cache/conftool/dbconfig/20240518-060450-marostegui.json
  • 05:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T352010)', diff saved to https://phabricator.wikimedia.org/P62600 and previous config saved to /var/cache/conftool/dbconfig/20240518-055125-ladsgroup.json
  • 05:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 05:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 05:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T352010)', diff saved to https://phabricator.wikimedia.org/P62599 and previous config saved to /var/cache/conftool/dbconfig/20240518-055100-ladsgroup.json
  • 05:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T364299)', diff saved to https://phabricator.wikimedia.org/P62598 and previous config saved to /var/cache/conftool/dbconfig/20240518-054942-marostegui.json
  • 05:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P62597 and previous config saved to /var/cache/conftool/dbconfig/20240518-053550-ladsgroup.json
  • 05:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P62596 and previous config saved to /var/cache/conftool/dbconfig/20240518-052043-ladsgroup.json
  • 05:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T352010)', diff saved to https://phabricator.wikimedia.org/P62595 and previous config saved to /var/cache/conftool/dbconfig/20240518-050535-ladsgroup.json
  • 03:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T352010)', diff saved to https://phabricator.wikimedia.org/P62594 and previous config saved to /var/cache/conftool/dbconfig/20240518-030359-ladsgroup.json
  • 03:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 03:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 02:39 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2090.codfw.wmnet with OS bullseye
  • 01:18 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2090.codfw.wmnet with OS bullseye
  • 00:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 00:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 00:35 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic2090* for ban elastic2090 before reimage - ryankemper@cumin2002 - T353878
  • 00:35 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic2090* for ban elastic2090 before reimage - ryankemper@cumin2002 - T353878
  • 00:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 00:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 00:02 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"

2024-05-17

  • 23:46 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
  • 23:43 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
  • 23:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 23:41 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 23:08 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 23:06 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1008.eqiad.wmnet with OS bullseye
  • 23:05 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 22:43 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 22:21 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 22:20 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1008.eqiad.wmnet with OS bullseye
  • 22:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 22:19 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 21:57 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 21:57 akosiaris@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 21:47 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 21:10 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 21:02 ryankemper@puppetmaster1001: conftool action : set/weight=10:pooled=yes; selector: name=elastic2090\.codfw\.wmnet
  • 20:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 20:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 19:43 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 19:42 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:38 dzahn@cumin1002: conftool action : set/pooled=no; selector: name=ml-serve2002.codfw.wmnet
  • 19:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 18:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T364299)', diff saved to https://phabricator.wikimedia.org/P62592 and previous config saved to /var/cache/conftool/dbconfig/20240517-184554-marostegui.json
  • 18:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 18:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 18:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62591 and previous config saved to /var/cache/conftool/dbconfig/20240517-184530-marostegui.json
  • 18:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P62590 and previous config saved to /var/cache/conftool/dbconfig/20240517-183022-marostegui.json
  • 18:22 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P62589 and previous config saved to /var/cache/conftool/dbconfig/20240517-181515-marostegui.json
  • 18:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62588 and previous config saved to /var/cache/conftool/dbconfig/20240517-180006-marostegui.json
  • 17:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T352010)', diff saved to https://phabricator.wikimedia.org/P62587 and previous config saved to /var/cache/conftool/dbconfig/20240517-173608-ladsgroup.json
  • 17:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 17:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 17:18 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 17:18 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 17:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:35 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:22 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:22 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2002 to codfw - jhancock@cumin2002"
  • 16:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2002 to codfw - jhancock@cumin2002"
  • 16:17 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 16:15 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 15:22 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main'.
  • 15:21 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main'.
  • 15:20 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 14:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 14:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 14:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T352010)', diff saved to https://phabricator.wikimedia.org/P62585 and previous config saved to /var/cache/conftool/dbconfig/20240517-140648-ladsgroup.json
  • 13:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P62584 and previous config saved to /var/cache/conftool/dbconfig/20240517-135138-ladsgroup.json
  • 13:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P62583 and previous config saved to /var/cache/conftool/dbconfig/20240517-133630-ladsgroup.json
  • 13:26 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:25 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:24 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:23 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:22 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T352010)', diff saved to https://phabricator.wikimedia.org/P62582 and previous config saved to /var/cache/conftool/dbconfig/20240517-132122-ladsgroup.json
  • 12:56 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kubestagetcd[1004-1006].eqiad.wmnet
  • 12:56 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:56 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagetcd[1004-1006].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:55 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagetcd[1004-1006].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ldap-replica1006.wikimedia.org
  • 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica1006.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:46 kevinbazira@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ldap-replica1005.wikimedia.org
  • 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica1005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica1005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:24 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on kubestagetcd[1004-1006].eqiad.wmnet with reason: decom
  • 12:24 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on kubestagetcd[1004-1006].eqiad.wmnet with reason: decom
  • 12:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 12:12 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kubestagemaster[1001-1002].eqiad.wmnet
  • 12:12 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:12 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagemaster[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:11 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagemaster[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:11 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ldap-replica1005.wikimedia.org
  • 12:11 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 12:09 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:08 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 12:07 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config: apply
  • 12:07 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ldap-replica2008.wikimedia.org
  • 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica2008.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:05 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica2008.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:02 jayme@cumin1002: START - Cookbook sre.hosts.decommission for hosts kubestagemaster[1001-1002].eqiad.wmnet
  • 11:56 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 11:53 jayme@cumin1002: conftool action : set/pooled=inactive; selector: name=kubestagemaster100[12].eqiad.wmnet
  • 11:51 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ldap-replica2008.wikimedia.org
  • 11:51 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on kubestagemaster[1001-1002].eqiad.wmnet with reason: decom
  • 11:51 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on kubestagemaster[1001-1002].eqiad.wmnet with reason: decom
  • 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ldap-replica2007.wikimedia.org
  • 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica2007.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 11:47 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica2007.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 11:44 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 11:39 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ldap-replica2007.wikimedia.org
  • 11:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T352010)', diff saved to https://phabricator.wikimedia.org/P62579 and previous config saved to /var/cache/conftool/dbconfig/20240517-113142-ladsgroup.json
  • 11:31 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 11:31 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 11:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T352010)', diff saved to https://phabricator.wikimedia.org/P62578 and previous config saved to /var/cache/conftool/dbconfig/20240517-113119-ladsgroup.json
  • 11:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P62577 and previous config saved to /var/cache/conftool/dbconfig/20240517-111611-ladsgroup.json
  • 11:08 jayme@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=kubestagemaster100[3-5].eqiad.wmnet
  • 11:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P62576 and previous config saved to /var/cache/conftool/dbconfig/20240517-110101-ladsgroup.json
  • 10:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T352010)', diff saved to https://phabricator.wikimedia.org/P62575 and previous config saved to /var/cache/conftool/dbconfig/20240517-104553-ladsgroup.json
  • 09:44 isaranto@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 09:39 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 09:25 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1016.eqiad.wmnet
  • 09:17 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host snapshot1016.eqiad.wmnet
  • 09:06 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1015.eqiad.wmnet
  • 09:01 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host snapshot1015.eqiad.wmnet
  • 08:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T352010)', diff saved to https://phabricator.wikimedia.org/P62574 and previous config saved to /var/cache/conftool/dbconfig/20240517-082636-ladsgroup.json
  • 08:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 08:26 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 08:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T352010)', diff saved to https://phabricator.wikimedia.org/P62573 and previous config saved to /var/cache/conftool/dbconfig/20240517-082613-ladsgroup.json
  • 08:17 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
  • 08:17 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/zotero: apply
  • 08:16 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 08:16 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 08:15 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 08:14 jayme@deploy1002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 08:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P62572 and previous config saved to /var/cache/conftool/dbconfig/20240517-081105-ladsgroup.json
  • 07:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P62571 and previous config saved to /var/cache/conftool/dbconfig/20240517-075558-ladsgroup.json
  • 07:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T352010)', diff saved to https://phabricator.wikimedia.org/P62570 and previous config saved to /var/cache/conftool/dbconfig/20240517-074050-ladsgroup.json
  • 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62568 and previous config saved to /var/cache/conftool/dbconfig/20240517-065920-marostegui.json
  • 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T364299)', diff saved to https://phabricator.wikimedia.org/P62567 and previous config saved to /var/cache/conftool/dbconfig/20240517-065857-marostegui.json
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P62566 and previous config saved to /var/cache/conftool/dbconfig/20240517-064350-marostegui.json
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P62565 and previous config saved to /var/cache/conftool/dbconfig/20240517-062842-marostegui.json
  • 06:18 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 55 hosts
  • 06:17 ryankemper@cumin2002: START - Cookbook sre.hosts.remove-downtime for 55 hosts
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T364299)', diff saved to https://phabricator.wikimedia.org/P62564 and previous config saved to /var/cache/conftool/dbconfig/20240517-061334-marostegui.json
  • 06:10 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: T363975 eqiad cluster restart - ryankemper@cumin2002 - T363975
  • 05:52 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: T363975 eqiad cluster restart - ryankemper@cumin2002 - T363975
  • 05:52 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 55 hosts with reason: T363975
  • 05:50 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on 55 hosts with reason: T363975
  • 05:17 marostegui: Restart wikibugs
  • 05:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T352010)', diff saved to https://phabricator.wikimedia.org/P62563 and previous config saved to /var/cache/conftool/dbconfig/20240517-051721-ladsgroup.json
  • 05:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 05:17 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 05:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62562 and previous config saved to /var/cache/conftool/dbconfig/20240517-051658-ladsgroup.json
  • 05:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 05:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 05:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P62561 and previous config saved to /var/cache/conftool/dbconfig/20240517-050150-ladsgroup.json
  • 04:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P62560 and previous config saved to /var/cache/conftool/dbconfig/20240517-044642-ladsgroup.json
  • 04:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62559 and previous config saved to /var/cache/conftool/dbconfig/20240517-043134-ladsgroup.json
  • 02:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62558 and previous config saved to /var/cache/conftool/dbconfig/20240517-021211-ladsgroup.json
  • 02:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 02:11 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 02:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62557 and previous config saved to /var/cache/conftool/dbconfig/20240517-021148-ladsgroup.json
  • 01:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62556 and previous config saved to /var/cache/conftool/dbconfig/20240517-015640-ladsgroup.json
  • 01:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62555 and previous config saved to /var/cache/conftool/dbconfig/20240517-014132-ladsgroup.json
  • 01:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62554 and previous config saved to /var/cache/conftool/dbconfig/20240517-012622-ladsgroup.json

2024-05-16

  • 23:43 cwhite: restart apache on gerrit1003
  • 23:17 zabe@deploy1002: Synchronized private/PrivateSettings.php: Add secret for encrypting user password hashes - T150647 (duration: 16m 42s)
  • 23:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62553 and previous config saved to /var/cache/conftool/dbconfig/20240516-230951-ladsgroup.json
  • 23:09 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 23:09 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 23:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62552 and previous config saved to /var/cache/conftool/dbconfig/20240516-230939-ladsgroup.json
  • 23:05 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@312e2be]: Correct new range partition sensor granularity (duration: 00m 21s)
  • 23:04 ebernhardson@deploy1002: Started deploy [airflow-dags/search@312e2be]: Correct new range partition sensor granularity
  • 22:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P62551 and previous config saved to /var/cache/conftool/dbconfig/20240516-225430-ladsgroup.json
  • 22:47 jsn@deploy1002: Finished scap: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036) (duration: 21m 57s)
  • 22:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P62550 and previous config saved to /var/cache/conftool/dbconfig/20240516-223922-ladsgroup.json
  • 22:27 jsn@deploy1002: jsn and cscott: Continuing with sync
  • 22:27 jsn@deploy1002: jsn and cscott: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:25 jsn@deploy1002: Started scap: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036)
  • 22:24 jsn@deploy1002: Sync cancelled.
  • 22:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62549 and previous config saved to /var/cache/conftool/dbconfig/20240516-222414-ladsgroup.json
  • 22:02 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@cb359e4]: add dags to collect daily webrequest and satisfaction search metrics (duration: 00m 25s)
  • 22:02 ebernhardson@deploy1002: Started deploy [airflow-dags/search@cb359e4]: add dags to collect daily webrequest and satisfaction search metrics
  • 21:52 jsn@deploy1002: cscott and jsn: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:49 jsn@deploy1002: Started scap: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036)
  • 21:31 jsn@deploy1002: Finished scap: Backport for Update VE core submodule to master (27296e0e3) (T230323 T365052) (duration: 25m 10s)
  • 21:11 jsn@deploy1002: jsn and esanders: Continuing with sync
  • 21:09 mutante: LDAP - added uid rickijay to group nda (T365138)
  • 21:08 jsn@deploy1002: jsn and esanders: Backport for Update VE core submodule to master (27296e0e3) (T230323 T365052) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:06 jsn@deploy1002: Started scap: Backport for Update VE core submodule to master (27296e0e3) (T230323 T365052)
  • 21:05 mutante: LDAP - added uid dmuthuri to group wmf T364320
  • 20:43 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 20:43 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 20:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T352010)', diff saved to https://phabricator.wikimedia.org/P62548 and previous config saved to /var/cache/conftool/dbconfig/20240516-204342-ladsgroup.json
  • 20:33 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1013.eqiad.wmnet
  • 20:33 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for aqs1013.eqiad.wmnet
  • 20:33 mutante: contint2002 - as usual have to manually "a2dismod mpm_event" on a machine using apache that has just been installed to fix the race condition with apache modules
  • 20:33 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2002.wikimedia.org with OS bullseye
  • 20:31 jdrewniak@deploy1002: Finished scap: Backport for Fix exclude list for dark mode (T365084) (duration: 22m 36s)
  • 20:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P62547 and previous config saved to /var/cache/conftool/dbconfig/20240516-202834-ladsgroup.json
  • 20:14 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
  • 20:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P62546 and previous config saved to /var/cache/conftool/dbconfig/20240516-201326-ladsgroup.json
  • 20:12 jdrewniak@deploy1002: jdrewniak and mabualruz: Continuing with sync
  • 20:11 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
  • 20:11 jdrewniak@deploy1002: jdrewniak and mabualruz: Backport for Fix exclude list for dark mode (T365084) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:08 ryankemper: [Hadoop] Restarted `hadoop-hdfs-datanode` on `an-worker1172`
  • 20:08 jdrewniak@deploy1002: Started scap: Backport for Fix exclude list for dark mode (T365084)
  • 20:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62545 and previous config saved to /var/cache/conftool/dbconfig/20240516-200618-ladsgroup.json
  • 20:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 20:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 20:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T352010)', diff saved to https://phabricator.wikimedia.org/P62544 and previous config saved to /var/cache/conftool/dbconfig/20240516-200552-ladsgroup.json
  • 20:03 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hadoop.roll-restart-workers (exit_code=99) restart workers for Hadoop analytics cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 19:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T352010)', diff saved to https://phabricator.wikimedia.org/P62543 and previous config saved to /var/cache/conftool/dbconfig/20240516-195817-ladsgroup.json
  • 19:55 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 19:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P62542 and previous config saved to /var/cache/conftool/dbconfig/20240516-195044-ladsgroup.json
  • 19:46 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T364299)', diff saved to https://phabricator.wikimedia.org/P62541 and previous config saved to /var/cache/conftool/dbconfig/20240516-194613-marostegui.json
  • 19:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T364299)', diff saved to https://phabricator.wikimedia.org/P62540 and previous config saved to /var/cache/conftool/dbconfig/20240516-194548-marostegui.json
  • 19:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P62539 and previous config saved to /var/cache/conftool/dbconfig/20240516-193535-ladsgroup.json
  • 19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P62538 and previous config saved to /var/cache/conftool/dbconfig/20240516-193040-marostegui.json
  • 19:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T352010)', diff saved to https://phabricator.wikimedia.org/P62537 and previous config saved to /var/cache/conftool/dbconfig/20240516-192027-ladsgroup.json
  • 19:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P62536 and previous config saved to /var/cache/conftool/dbconfig/20240516-191532-marostegui.json
  • 19:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T364299)', diff saved to https://phabricator.wikimedia.org/P62535 and previous config saved to /var/cache/conftool/dbconfig/20240516-190024-marostegui.json
  • 18:58 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS buster
  • 18:46 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host contint2002.wikimedia.org with OS bullseye
  • 18:32 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 18:17 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 18:15 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host contint2002.wikimedia.org
  • 18:13 cmooney@cumin1002: START - Cookbook sre.hosts.dhcp for host contint2002.wikimedia.org
  • 18:04 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint2002.wikimedia.org with OS buster
  • 17:53 brennen@deploy1002: Finished deploy [phabricator/deployment@7d858df]: test scap deployment with keyholder key misconfigured for T313624 (duration: 00m 38s)
  • 17:52 brennen@deploy1002: Started deploy [phabricator/deployment@7d858df]: test scap deployment with keyholder key misconfigured for T313624
  • 17:45 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 17:34 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 17:34 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 17:34 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 17:33 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 17:33 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 17:33 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 17:02 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 17:02 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 17:02 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 17:01 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 17:01 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 17:00 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 17:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T352010)', diff saved to https://phabricator.wikimedia.org/P62529 and previous config saved to /var/cache/conftool/dbconfig/20240516-170035-ladsgroup.json
  • 17:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 16:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 16:58 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS buster
  • 16:57 ryankemper@cumin2002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop analytics cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 16:57 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host contint2002.wikimedia.org with OS bullseye
  • 16:57 ryankemper@cumin2002: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 16:41 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 16:41 ryankemper@cumin2002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 16:40 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host contint2002.wikimedia.org with OS bullseye
  • 16:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62528 and previous config saved to /var/cache/conftool/dbconfig/20240516-163915-arnaudb.json
  • 16:39 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:38 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:37 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:37 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:37 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:37 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:32 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:31 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:31 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:30 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62526 and previous config saved to /var/cache/conftool/dbconfig/20240516-162408-arnaudb.json
  • 16:12 topranks: announcing wikidough anycast ranges to Internet (transit) in magru T362421
  • 16:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62525 and previous config saved to /var/cache/conftool/dbconfig/20240516-160902-arnaudb.json
  • 15:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62523 and previous config saved to /var/cache/conftool/dbconfig/20240516-155356-arnaudb.json
  • 15:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62522 and previous config saved to /var/cache/conftool/dbconfig/20240516-155034-arnaudb.json
  • 15:45 dhinus: systemctl restart mariadb@s4.service on clouddb1015 (using too much RAM) T365164
  • 15:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62521 and previous config saved to /var/cache/conftool/dbconfig/20240516-153850-arnaudb.json
  • 15:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62520 and previous config saved to /var/cache/conftool/dbconfig/20240516-153527-arnaudb.json
  • 15:25 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 15:24 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host contint2002.wikimedia.org with OS bullseye
  • 15:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62519 and previous config saved to /var/cache/conftool/dbconfig/20240516-152343-arnaudb.json
  • 15:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62518 and previous config saved to /var/cache/conftool/dbconfig/20240516-152021-arnaudb.json
  • 15:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62517 and previous config saved to /var/cache/conftool/dbconfig/20240516-150837-arnaudb.json
  • 15:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62516 and previous config saved to /var/cache/conftool/dbconfig/20240516-150515-arnaudb.json
  • 15:03 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 14:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62515 and previous config saved to /var/cache/conftool/dbconfig/20240516-145330-arnaudb.json
  • 14:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62514 and previous config saved to /var/cache/conftool/dbconfig/20240516-145009-arnaudb.json
  • 14:49 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P62513 and previous config saved to /var/cache/conftool/dbconfig/20240516-144945-root.json
  • 14:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2174.codfw.wmnet with OS bookworm
  • 14:43 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on contint2002.wikimedia.org with reason: T334517
  • 14:43 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on contint2002.wikimedia.org with reason: T334517
  • 14:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62512 and previous config saved to /var/cache/conftool/dbconfig/20240516-143503-arnaudb.json
  • 14:34 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P62511 and previous config saved to /var/cache/conftool/dbconfig/20240516-143439-root.json
  • 14:28 ladsgroup@deploy1002: Finished scap: Backport for Stop writing to the old columns of pagelinks in s6 (T352010) (duration: 15m 42s)
  • 14:28 hnowlan: migrated 5% of commons traffic to k8s
  • 14:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
  • 14:25 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
  • 14:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62510 and previous config saved to /var/cache/conftool/dbconfig/20240516-141957-arnaudb.json
  • 14:19 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P62509 and previous config saved to /var/cache/conftool/dbconfig/20240516-141932-root.json
  • 14:15 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 14:15 ladsgroup@deploy1002: ladsgroup: Backport for Stop writing to the old columns of pagelinks in s6 (T352010) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:13 ladsgroup@deploy1002: Started scap: Backport for Stop writing to the old columns of pagelinks in s6 (T352010)
  • 14:09 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["76318767"]' 2>&1 | tee -a ~/T315510-enwiki-5; date
  • 14:08 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2174.codfw.wmnet with OS bookworm
  • 14:07 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: reimage
  • 14:07 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: reimage
  • 14:06 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2174', diff saved to https://phabricator.wikimedia.org/P62508 and previous config saved to /var/cache/conftool/dbconfig/20240516-140620-arnaudb.json
  • 14:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62507 and previous config saved to /var/cache/conftool/dbconfig/20240516-140451-arnaudb.json
  • 14:04 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P62506 and previous config saved to /var/cache/conftool/dbconfig/20240516-140426-root.json
  • 14:04 jsn@deploy1002: Finished scap: Backport for Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001), Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001) (duration: 16m 11s)
  • 14:03 Emperor: depool, restart swift-proxy, repool ms-fe1010 as ~12% connection failures reported by envoy since late 14th May T360913
  • 13:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2176.codfw.wmnet with OS bookworm
  • 13:51 jsn@deploy1002: jsn and lucaswerkmeister-wmde: Continuing with sync
  • 13:50 jsn@deploy1002: jsn and lucaswerkmeister-wmde: Backport for Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001), Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:49 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P62505 and previous config saved to /var/cache/conftool/dbconfig/20240516-134918-root.json
  • 13:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1024.eqiad.wmnet with OS bookworm
  • 13:47 jsn@deploy1002: Started scap: Backport for Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001), Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001)
  • 13:37 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
  • 13:34 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
  • 13:32 jsn@deploy1002: Finished scap: Backport for Enable async jobqueue-powered URL uploads on commons (T295007) (duration: 18m 18s)
  • 13:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1024.eqiad.wmnet with reason: host reimage
  • 13:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1024.eqiad.wmnet with reason: host reimage
  • 13:19 jsn@deploy1002: jsn and hnowlan: Continuing with sync
  • 13:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62503 and previous config saved to /var/cache/conftool/dbconfig/20240516-131800-ladsgroup.json
  • 13:17 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2176.codfw.wmnet with OS bookworm
  • 13:16 jsn@deploy1002: jsn and hnowlan: Backport for Enable async jobqueue-powered URL uploads on commons (T295007) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:15 arnaudb@cumin1002: END (ERROR) - Cookbook sre.mysql.upgrade (exit_code=97) for db2176.codfw.wmnet
  • 13:15 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2176.codfw.wmnet
  • 13:14 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2176', diff saved to https://phabricator.wikimedia.org/P62502 and previous config saved to /var/cache/conftool/dbconfig/20240516-131429-arnaudb.json
  • 13:14 jsn@deploy1002: Started scap: Backport for Enable async jobqueue-powered URL uploads on commons (T295007)
  • 13:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1024.eqiad.wmnet with OS bookworm
  • 13:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1024 T364289', diff saved to https://phabricator.wikimedia.org/P62501 and previous config saved to /var/cache/conftool/dbconfig/20240516-131111-root.json
  • 13:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62500 and previous config saved to /var/cache/conftool/dbconfig/20240516-130252-ladsgroup.json
  • 12:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62499 and previous config saved to /var/cache/conftool/dbconfig/20240516-124743-ladsgroup.json
  • 10:48 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 10:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P62497 and previous config saved to /var/cache/conftool/dbconfig/20240516-104601-ladsgroup.json
  • 10:43 claime: New redirects for T25216 T204830 T31186 operational
  • 10:37 fnegri@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 10:32 claime: cumin 'A:all-mw' -b30 "run-puppet-agent -q" - T25216 T204830 T31186
  • 10:31 claime: cumin 'A:all-mw' "enable-puppet 'New redirects T25216 T204830 T31186 - cgoubert'"
  • 10:31 marostegui@cumin1002: dbctl commit (dc=all): 'Test pc4 master switch', diff saved to https://phabricator.wikimedia.org/P62496 and previous config saved to /var/cache/conftool/dbconfig/20240516-103148-marostegui.json
  • 10:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P62495 and previous config saved to /var/cache/conftool/dbconfig/20240516-103055-ladsgroup.json
  • 10:30 marostegui@cumin1002: dbctl commit (dc=all): 'Test pc4 master switch', diff saved to https://phabricator.wikimedia.org/P62494 and previous config saved to /var/cache/conftool/dbconfig/20240516-103039-marostegui.json
  • 10:30 cgoubert@deploy1002: Finished scap: Deploy new redirects to mw-on-k8s - T25216 T204830 T31186 (duration: 08m 06s)
  • 10:22 cgoubert@deploy1002: Started scap: Deploy new redirects to mw-on-k8s - T25216 T204830 T31186
  • 10:21 claime: New redirects ok on mwdebug - T25216 T204830 T31186
  • 10:19 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 10:19 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 10:18 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2015 and pc1015 to pc4 as depooled spares T362786', diff saved to https://phabricator.wikimedia.org/P62493 and previous config saved to /var/cache/conftool/dbconfig/20240516-101829-marostegui.json
  • 10:15 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P62492 and previous config saved to /var/cache/conftool/dbconfig/20240516-101553-root.json
  • 10:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P62491 and previous config saved to /var/cache/conftool/dbconfig/20240516-101548-ladsgroup.json
  • 10:15 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2016 and pc1016 to pc4 T362786', diff saved to https://phabricator.wikimedia.org/P62490 and previous config saved to /var/cache/conftool/dbconfig/20240516-101543-marostegui.json
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2014 and pc1014 to pc4 T362786', diff saved to https://phabricator.wikimedia.org/P62489 and previous config saved to /var/cache/conftool/dbconfig/20240516-101122-marostegui.json
  • 10:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 100%: post fix repool', diff saved to https://phabricator.wikimedia.org/P62488 and previous config saved to /var/cache/conftool/dbconfig/20240516-101018-arnaudb.json
  • 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2013 and pc1013 to pc2 T362786', diff saved to https://phabricator.wikimedia.org/P62487 and previous config saved to /var/cache/conftool/dbconfig/20240516-101009-marostegui.json
  • 10:09 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2012 and pc1012 to pc2 T362786', diff saved to https://phabricator.wikimedia.org/P62486 and previous config saved to /var/cache/conftool/dbconfig/20240516-100858-marostegui.json
  • 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2011 to pc1 T362786', diff saved to https://phabricator.wikimedia.org/P62485 and previous config saved to /var/cache/conftool/dbconfig/20240516-100744-marostegui.json
  • 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc1011 to pc1 T362786', diff saved to https://phabricator.wikimedia.org/P62484 and previous config saved to /var/cache/conftool/dbconfig/20240516-100418-marostegui.json
  • 10:02 claime: cumin 'A:all-mw' "disable-puppet 'New redirects T25216 T204830 T31186 - cgoubert'"
  • 10:00 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P62483 and previous config saved to /var/cache/conftool/dbconfig/20240516-100040-root.json
  • 09:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P62482 and previous config saved to /var/cache/conftool/dbconfig/20240516-095927-ladsgroup.json
  • 09:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T352010)', diff saved to https://phabricator.wikimedia.org/P62481 and previous config saved to /var/cache/conftool/dbconfig/20240516-095817-ladsgroup.json
  • 09:58 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 09:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 09:56 ladsgroup@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 09:56 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 09:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 75%: post fix repool', diff saved to https://phabricator.wikimedia.org/P62480 and previous config saved to /var/cache/conftool/dbconfig/20240516-095459-arnaudb.json
  • 09:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62479 and previous config saved to /var/cache/conftool/dbconfig/20240516-094717-ladsgroup.json
  • 09:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 09:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 09:45 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P62478 and previous config saved to /var/cache/conftool/dbconfig/20240516-094534-root.json
  • 09:44 godog: clean up MediaWiki.rest_api_latency and MediaWiki.rest_api_errors - T365111
  • 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 50%: post fix repool', diff saved to https://phabricator.wikimedia.org/P62476 and previous config saved to /var/cache/conftool/dbconfig/20240516-093803-arnaudb.json
  • 09:30 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P62475 and previous config saved to /var/cache/conftool/dbconfig/20240516-093028-root.json
  • 09:28 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:28 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 25%: post fix repool', diff saved to https://phabricator.wikimedia.org/P62474 and previous config saved to /var/cache/conftool/dbconfig/20240516-092257-arnaudb.json
  • 09:18 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 09:18 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 09:18 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 09:17 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 09:17 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 09:17 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 09:16 arnaudb@cumin1002: dbctl commit (dc=all): 'vslow/dump T364814 fix', diff saved to https://phabricator.wikimedia.org/P62473 and previous config saved to /var/cache/conftool/dbconfig/20240516-091613-arnaudb.json
  • 09:15 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P62472 and previous config saved to /var/cache/conftool/dbconfig/20240516-091522-root.json
  • 09:15 arnaudb@cumin1002: dbctl commit (dc=all): 'vslow/dump T364814 fix', diff saved to https://phabricator.wikimedia.org/P62471 and previous config saved to /var/cache/conftool/dbconfig/20240516-091515-arnaudb.json
  • 09:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2204 to vslow/dump T364814', diff saved to https://phabricator.wikimedia.org/P62470 and previous config saved to /var/cache/conftool/dbconfig/20240516-091400-arnaudb.json
  • 09:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Group test readd', diff saved to https://phabricator.wikimedia.org/P62469 and previous config saved to /var/cache/conftool/dbconfig/20240516-090753-arnaudb.json
  • 09:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Group test removal', diff saved to https://phabricator.wikimedia.org/P62468 and previous config saved to /var/cache/conftool/dbconfig/20240516-090732-arnaudb.json
  • 09:03 Dreamy_Jazz: Stopping MediaModeration scanning script on `medium.dblist`
  • 09:03 Dreamy_Jazz: Stopping MediaModeration scanning script on `enwiki`
  • 08:59 Dreamy_Jazz: Scanning `enwiki` with MediaModeration script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 08:58 Dreamy_Jazz: Starting MediaModeration scanning script on `medium.dblist` - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 08:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2204 with weight 500 T364814', diff saved to https://phabricator.wikimedia.org/P62466 and previous config saved to /var/cache/conftool/dbconfig/20240516-085123-arnaudb.json
  • 08:44 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2207 to s2 primary T364814', diff saved to https://phabricator.wikimedia.org/P62465 and previous config saved to /var/cache/conftool/dbconfig/20240516-084420-root.json
  • 08:41 arnaudb: Starting s2 codfw failover from db2204 to db2207 - T364814
  • 08:33 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:23 hashar@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.5 refs T361399
  • 08:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 depool', diff saved to https://phabricator.wikimedia.org/P62463 and previous config saved to /var/cache/conftool/dbconfig/20240516-081207-arnaudb.json
  • 08:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T360332)', diff saved to https://phabricator.wikimedia.org/P62462 and previous config saved to /var/cache/conftool/dbconfig/20240516-081136-arnaudb.json
  • 08:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2165 (T364299)', diff saved to https://phabricator.wikimedia.org/P62461 and previous config saved to /var/cache/conftool/dbconfig/20240516-081107-marostegui.json
  • 08:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 08:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T364299)', diff saved to https://phabricator.wikimedia.org/P62460 and previous config saved to /var/cache/conftool/dbconfig/20240516-081044-marostegui.json
  • 08:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1021.eqiad.wmnet with reason: host reimage
  • 08:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1021.eqiad.wmnet with reason: host reimage
  • 07:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62458 and previous config saved to /var/cache/conftool/dbconfig/20240516-075628-arnaudb.json
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P62457 and previous config saved to /var/cache/conftool/dbconfig/20240516-075537-marostegui.json
  • 07:51 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1021.eqiad.wmnet with OS bookworm
  • 07:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Remove db2207 from API/vslow/dump T364814', diff saved to https://phabricator.wikimedia.org/P62456 and previous config saved to /var/cache/conftool/dbconfig/20240516-075024-arnaudb.json
  • 07:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2207 with weight 0 T364814', diff saved to https://phabricator.wikimedia.org/P62455 and previous config saved to /var/cache/conftool/dbconfig/20240516-074927-arnaudb.json
  • 07:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s2 T364814
  • 07:48 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 26 hosts with reason: Primary switchover s2 T364814
  • 07:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1021 T364289', diff saved to https://phabricator.wikimedia.org/P62454 and previous config saved to /var/cache/conftool/dbconfig/20240516-074837-root.json
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Increase es1024 weight', diff saved to https://phabricator.wikimedia.org/P62453 and previous config saved to /var/cache/conftool/dbconfig/20240516-074625-marostegui.json
  • 07:44 mabualruz@deploy1002: Finished scap: Backport for Correct behaviour of ConfigHelper, add tests (T365084) (duration: 17m 31s)
  • 07:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62452 and previous config saved to /var/cache/conftool/dbconfig/20240516-074121-arnaudb.json
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P62451 and previous config saved to /var/cache/conftool/dbconfig/20240516-074030-marostegui.json
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'Increase es1024 weight', diff saved to https://phabricator.wikimedia.org/P62450 and previous config saved to /var/cache/conftool/dbconfig/20240516-073750-marostegui.json
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1025 to es5 primary master T365094', diff saved to https://phabricator.wikimedia.org/P62449 and previous config saved to /var/cache/conftool/dbconfig/20240516-073719-marostegui.json
  • 07:30 mabualruz@deploy1002: mabualruz: Continuing with sync
  • 07:30 mabualruz@deploy1002: mabualruz: Backport for Correct behaviour of ConfigHelper, add tests (T365084) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:26 mabualruz@deploy1002: Started scap: Backport for Correct behaviour of ConfigHelper, add tests (T365084)
  • 07:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T360332)', diff saved to https://phabricator.wikimedia.org/P62448 and previous config saved to /var/cache/conftool/dbconfig/20240516-072614-arnaudb.json
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T364299)', diff saved to https://phabricator.wikimedia.org/P62447 and previous config saved to /var/cache/conftool/dbconfig/20240516-072521-marostegui.json
  • 07:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T360332)', diff saved to https://phabricator.wikimedia.org/P62446 and previous config saved to /var/cache/conftool/dbconfig/20240516-072355-arnaudb.json
  • 07:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 07:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P62445 and previous config saved to /var/cache/conftool/dbconfig/20240516-065823-root.json
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P62444 and previous config saved to /var/cache/conftool/dbconfig/20240516-064317-root.json
  • 06:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Making es4 standalone T364447
  • 06:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Making es4 standalone T364447
  • 06:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Making es5 standalone T364447
  • 06:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Making es5 standalone T364447
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P62443 and previous config saved to /var/cache/conftool/dbconfig/20240516-062812-root.json
  • 06:18 marostegui: Make es5 standalone and disconnect replication T364447
  • 06:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Making es5 standalone T364447
  • 06:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Making es5 standalone T364447
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P62442 and previous config saved to /var/cache/conftool/dbconfig/20240516-061306-root.json
  • 06:05 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1020 to es4 primary master T364816', diff saved to https://phabricator.wikimedia.org/P62441 and previous config saved to /var/cache/conftool/dbconfig/20240516-060532-marostegui.json
  • 05:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P62440 and previous config saved to /var/cache/conftool/dbconfig/20240516-055759-root.json
  • 05:43 marostegui: Make es4 standalone and disconnect replication T364447
  • 05:37 marostegui@cumin1002: dbctl commit (dc=all): 'Increase es1021 weight', diff saved to https://phabricator.wikimedia.org/P62439 and previous config saved to /var/cache/conftool/dbconfig/20240516-053746-marostegui.json
  • 05:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Making es4 standalone T364447
  • 05:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Making es4 standalone T364447
  • 05:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 05:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 05:23 marostegui: Deploy schema change dbmaint db1173 eqiad s6 T355609
  • 05:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1173 T364523', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240516-051853-root.json
  • 05:18 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1231 to s6 primary and set section read-write T364523', diff saved to https://phabricator.wikimedia.org/P62437 and previous config saved to /var/cache/conftool/dbconfig/20240516-051808-marostegui.json
  • 05:17 marostegui: Starting s6 eqiad failover from db1173 to db1231 - T364523
  • 04:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s6 T364523
  • 04:58 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1231 with weight 0 T364523', diff saved to https://phabricator.wikimedia.org/P62435 and previous config saved to /var/cache/conftool/dbconfig/20240516-045831-marostegui.json
  • 04:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s6 T364523
  • 04:04 eileen: civicrm upgraded from 26e7422a to 4f6f2dc3
  • 02:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T352010)', diff saved to https://phabricator.wikimedia.org/P62434 and previous config saved to /var/cache/conftool/dbconfig/20240516-020200-ladsgroup.json
  • 02:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 02:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 02:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62433 and previous config saved to /var/cache/conftool/dbconfig/20240516-020137-ladsgroup.json
  • 01:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P62432 and previous config saved to /var/cache/conftool/dbconfig/20240516-014630-ladsgroup.json
  • 01:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P62431 and previous config saved to /var/cache/conftool/dbconfig/20240516-013122-ladsgroup.json
  • 01:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62430 and previous config saved to /var/cache/conftool/dbconfig/20240516-011613-ladsgroup.json
  • 01:12 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 00:28 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye

2024-05-15

  • 22:41 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@12e0cb9]: bump discolytics to 0.19.0 (duration: 00m 27s)
  • 22:40 ebernhardson@deploy1002: Started deploy [airflow-dags/search@12e0cb9]: bump discolytics to 0.19.0
  • 21:55 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:55 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: delete ssw1-d1-codfw mgmt ip - cmooney@cumin1002"
  • 21:54 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: delete ssw1-d1-codfw mgmt ip - cmooney@cumin1002"
  • 21:44 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@718b2dd]: specify analytics-hadoop in hdfs urls (duration: 00m 25s)
  • 21:44 ebernhardson@deploy1002: Started deploy [airflow-dags/search@718b2dd]: specify analytics-hadoop in hdfs urls
  • 21:27 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 21:22 eileen: civicrm upgraded from ddc96594 to 26e7422a
  • 21:17 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:17 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add ssw1-d1-codfw mgmt ip - cmooney@cumin1002"
  • 21:16 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add ssw1-d1-codfw mgmt ip - cmooney@cumin1002"
  • 21:16 TheresNoTime: UTC late backport window complete
  • 21:16 samtar@deploy1002: Finished scap: Backport for AbuseFilterHooks: Provide feature flags for AF custom actions (T20110) (duration: 16m 31s)
  • 21:14 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 21:03 samtar@deploy1002: samtar and kharlan: Continuing with sync
  • 21:02 samtar@deploy1002: samtar and kharlan: Backport for AbuseFilterHooks: Provide feature flags for AF custom actions (T20110) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:59 samtar@deploy1002: Started scap: Backport for AbuseFilterHooks: Provide feature flags for AF custom actions (T20110)
  • 20:48 samtar@deploy1002: Finished scap: Backport for Enable night mode as a desktop beta feature (T363814), [enwiki] Throttle exemption for Editathon (T364708) (duration: 17m 35s)
  • 20:35 samtar@deploy1002: samtar and superpes and jdlrobson: Continuing with sync
  • 20:33 samtar@deploy1002: samtar and superpes and jdlrobson: Backport for Enable night mode as a desktop beta feature (T363814), [enwiki] Throttle exemption for Editathon (T364708) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T364299)', diff saved to https://phabricator.wikimedia.org/P62427 and previous config saved to /var/cache/conftool/dbconfig/20240515-203116-marostegui.json
  • 20:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 20:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 20:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 20:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 20:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T364299)', diff saved to https://phabricator.wikimedia.org/P62426 and previous config saved to /var/cache/conftool/dbconfig/20240515-203037-marostegui.json
  • 20:30 samtar@deploy1002: Started scap: Backport for Enable night mode as a desktop beta feature (T363814), [enwiki] Throttle exemption for Editathon (T364708)
  • 20:28 samtar@deploy1002: Finished scap: Backport for [ParserCache] Preserve information from the JsonException when logging failures (T365036) (duration: 16m 41s)
  • 20:16 samtar@deploy1002: cscott and samtar: Continuing with sync
  • 20:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P62425 and previous config saved to /var/cache/conftool/dbconfig/20240515-201529-marostegui.json
  • 20:15 samtar@deploy1002: cscott and samtar: Backport for [ParserCache] Preserve information from the JsonException when logging failures (T365036) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:12 samtar@deploy1002: Started scap: Backport for [ParserCache] Preserve information from the JsonException when logging failures (T365036)
  • 20:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P62424 and previous config saved to /var/cache/conftool/dbconfig/20240515-200022-marostegui.json
  • 19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T364299)', diff saved to https://phabricator.wikimedia.org/P62423 and previous config saved to /var/cache/conftool/dbconfig/20240515-194514-marostegui.json
  • 19:06 cstone: payments-wiki upgraded from 3380990f to 98189883
  • 18:44 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 18:13 ryankemper@cumin2002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons.
  • 18:03 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 17:46 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 17:40 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 17:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62420 and previous config saved to /var/cache/conftool/dbconfig/20240515-173259-ladsgroup.json
  • 17:32 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 17:32 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 17:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T352010)', diff saved to https://phabricator.wikimedia.org/P62419 and previous config saved to /var/cache/conftool/dbconfig/20240515-173236-ladsgroup.json
  • 17:28 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 17:22 ryankemper@cumin2002: START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid jvm daemons.
  • 17:21 ryankemper@cumin2002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons.
  • 17:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P62418 and previous config saved to /var/cache/conftool/dbconfig/20240515-171729-ladsgroup.json
  • 17:17 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 17:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P62417 and previous config saved to /var/cache/conftool/dbconfig/20240515-170221-ladsgroup.json
  • 16:50 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 16:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T352010)', diff saved to https://phabricator.wikimedia.org/P62416 and previous config saved to /var/cache/conftool/dbconfig/20240515-164713-ladsgroup.json
  • 16:40 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 16:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:34 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 16:33 ryankemper@cumin2002: START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons.
  • 16:31 mutante: gerrit2002 - mv /run/motd.dynamic.new /run/motd.dynamic
  • 16:24 mutante: gerrit1003 - MOTD wasn't updating anymore: a manual "run-parts /etc/update-motd.d" showed updated data while /run/motd.dynamic was outdated. Fixed by manually renaming /run/motd.dynamic.new to /run/motd.dynamic and logging in, because the update is triggered by PAM. Unclear why it stopped updating.
  • 16:06 hashar: Gerrit was briefly unreachable between 15:42 and 15:55 UTC | T365041
  • 15:58 vgutierrez: repool upload@ulsfo with IPIP encapsulation enabled - T357257
  • 15:56 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:56 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/opentelemetry-collector: apply
  • 15:55 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 15:51 cgoubert@deploy1002: Finished scap: mw-on-k8s: Bump maxUnavailable to 6% - T362323 (duration: 02m 01s)
  • 15:49 cgoubert@deploy1002: Started scap: mw-on-k8s: Bump maxUnavailable to 6% - T362323
  • 15:43 hnowlan@deploy1002: Finished deploy [restbase/deploy@92abb6a]: Deploying new wikis T360304 T360311 T363244 T363250 T363257 T363264 T363271 (duration: 16m 52s)
  • 15:37 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:37 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply
  • 15:36 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:36 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 15:35 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:35 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 15:32 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:32 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 15:31 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:31 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 15:28 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 15:28 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 15:26 hnowlan@deploy1002: Started deploy [restbase/deploy@92abb6a]: Deploying new wikis T360304 T360311 T363244 T363250 T363257 T363264 T363271
  • 15:25 vgutierrez: rolling restart of pybal on lvs4010 and lvs4009 - T357257
  • 15:06 jsn@deploy1002: Finished scap: Backport for [Follow-up] Override VE overlays in night-mode (T363861), Mark night mode as a valid beta feature (T363814), Mark night mode as a valid beta feature (T363814) (duration: 18m 26s)
  • 15:05 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 14:57 vgutierrez: re-enable puppet on A:lvs - T357257
  • 14:53 jsn@deploy1002: jsn and jdlrobson: Continuing with sync
  • 14:51 vgutierrez: disable puppet on A:lvs before merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1031827 - T357257
  • 14:51 jsn@deploy1002: jsn and jdlrobson: Backport for [Follow-up] Override VE overlays in night-mode (T363861), Mark night mode as a valid beta feature (T363814), Mark night mode as a valid beta feature (T363814) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:48 jsn@deploy1002: Started scap: Backport for [Follow-up] Override VE overlays in night-mode (T363861), Mark night mode as a valid beta feature (T363814), Mark night mode as a valid beta feature (T363814)
  • 14:44 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2010.codfw.wmnet with OS bullseye
  • 14:44 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:43 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:41 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2008.codfw.wmnet with OS bullseye
  • 14:41 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:40 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:39 claime: Repooling mw2286.codfw.wmnet - T364863
  • 14:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2007.codfw.wmnet with OS bullseye
  • 14:39 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:38 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2286.codfw.wmnet
  • 14:38 cgoubert@cumin1002: START - Cookbook sre.hosts.remove-downtime for mw2286.codfw.wmnet
  • 14:38 claime: Removing downtime on mw2286.codfw.wmnet - T364863
  • 14:37 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 14:32 vgutierrez: depool upload@ulsfo before enabling IPIP encapsulation - T357257
  • 14:26 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
  • 14:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
  • 14:24 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
  • 14:22 jsn@deploy1002: Finished scap: Backport for InitialiseSettings.php: Add wmgUseAutoModerator (T364034) (duration: 16m 44s)
  • 14:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
  • 14:20 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
  • 14:20 fab@deploy1002: Finished deploy [airflow-dags/research@ecf603d]: (no justification provided) (duration: 00m 32s)
  • 14:20 fab@deploy1002: Started deploy [airflow-dags/research@ecf603d]: (no justification provided)
  • 14:19 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
  • 14:17 vgutierrez: uploaded tcp-mss-clamper 0.5.1 to bullseye-wikimedia (apt.wm.o) - T357257
  • 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1214.eqiad.wmnet
  • 14:10 jsn@deploy1002: jsn: Continuing with sync
  • 14:10 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 14:10 vgutierrez: re-enable puppet on A:lvs - T357257
  • 14:09 jsn@deploy1002: jsn: Backport for InitialiseSettings.php: Add wmgUseAutoModerator (T364034) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:06 jsn@deploy1002: Started scap: Backport for InitialiseSettings.php: Add wmgUseAutoModerator (T364034)
  • 14:06 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1214.eqiad.wmnet
  • 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1211.eqiad.wmnet
  • 14:02 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye
  • 14:02 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 14:02 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS bullseye
  • 14:02 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS bullseye
  • 14:01 jsn@deploy1002: Finished scap: Backport for extension-list: Add AutoModerator (T364034) (duration: 51m 44s)
  • 14:01 vgutierrez: disable puppet on A:lvs before merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1031814 - T357257
  • 14:00 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 13:54 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1211.eqiad.wmnet
  • 13:54 moritzm: installing nghttp2 security updates
  • 13:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1209.eqiad.wmnet
  • 13:52 eevans@deploy1002: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
  • 13:51 eevans@deploy1002: helmfile [eqiad] START helmfile.d/services/echostore: apply
  • 13:49 eevans@deploy1002: helmfile [codfw] DONE helmfile.d/services/echostore: apply
  • 13:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2006.codfw.wmnet with OS bullseye
  • 13:49 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 13:48 eevans@deploy1002: helmfile [codfw] START helmfile.d/services/echostore: apply
  • 13:45 eevans@deploy1002: helmfile [staging] DONE helmfile.d/services/echostore: apply
  • 13:44 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 13:44 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 13:44 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 13:44 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 13:44 eevans@deploy1002: helmfile [staging] START helmfile.d/services/echostore: apply
  • 13:43 jsn@deploy1002: jsn: Continuing with sync
  • 13:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 13:42 jsn@deploy1002: jsn: Backport for extension-list: Add AutoModerator (T364034) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:41 moritzm: installing libpgjava security updates
  • 13:40 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=thanos-fe1001.eqiad.wmnet
  • 13:40 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1209.eqiad.wmnet
  • 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1203.eqiad.wmnet
  • 13:34 elukey: depool thanos-fe1001 and move envoy to PKI TLS cert
  • 13:34 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=thanos-fe1001.eqiad.wmnet
  • 13:32 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:32 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:27 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:27 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:26 arnaudb@cumin1002: END (ERROR) - Cookbook sre.mysql.reboot_sanitaria (exit_code=97) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:26 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:25 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.reboot_sanitaria (exit_code=99) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:25 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
  • 13:22 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
  • 13:18 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:18 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:17 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.reboot_sanitaria (exit_code=99) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:17 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:16 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:15 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:15 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:12 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:11 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:11 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:11 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:11 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:10 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:10 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:09 jsn@deploy1002: Started scap: Backport for extension-list: Add AutoModerator (T364034)
  • 13:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:09 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:07 vgutierrez: uploaded golang-github-florianl-go-tc 0.4.4-0.20240511074908-d584238bf6cb to apt.wm.o (bookworm-wikimedia)
  • 13:04 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye
  • 13:03 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:03 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS bullseye
  • 13:01 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye
  • 13:01 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.reboot_sanitaria (exit_code=0) Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 13:00 arnaudb@cumin1002: START - Cookbook sre.mysql.reboot_sanitaria Will restart a pool of Sanitarium MariaDB instances and/or hosts.
  • 12:58 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kubestagetcd[2001-2003].codfw.wmnet
  • 12:57 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:57 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagetcd[2001-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:57 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 12:56 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagetcd[2001-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:53 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:46 jayme@cumin1002: START - Cookbook sre.hosts.decommission for hosts kubestagetcd[2001-2003].codfw.wmnet
  • 12:23 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on kubestagetcd[2001-2003].codfw.wmnet with reason: decom
  • 12:23 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on kubestagetcd[2001-2003].codfw.wmnet with reason: decom
  • 12:19 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1203.eqiad.wmnet
  • 11:52 aborrero@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 11:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: openldap::rw
  • 11:34 mvolz@deploy1002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
  • 11:33 mvolz@deploy1002: helmfile [eqiad] START helmfile.d/services/zotero: apply
  • 11:33 mvolz@deploy1002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 11:32 mvolz@deploy1002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 11:31 mvolz@deploy1002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 11:31 mvolz@deploy1002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 11:29 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: openldap::rw
  • 11:28 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for backend: Fix Unknown column 'Array' in 'where clause' (T364974), backend: Fix Unknown column 'Array' in 'where clause' (T364974) (duration: 15m 36s)
  • 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1193.eqiad.wmnet
  • 11:16 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde: Continuing with sync
  • 11:15 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde: Backport for backend: Fix Unknown column 'Array' in 'where clause' (T364974), backend: Fix Unknown column 'Array' in 'where clause' (T364974) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:13 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for backend: Fix Unknown column 'Array' in 'where clause' (T364974), backend: Fix Unknown column 'Array' in 'where clause' (T364974)
  • 11:10 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 11:09 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1193.eqiad.wmnet
  • 11:05 aborrero@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 11:03 logmsgbot: lucaswerkmeister-wmde@deploy1002 Sync cancelled.
  • 10:54 gmodena@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 10:54 gmodena@deploy1002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 10:53 gmodena@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 10:53 gmodena@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 10:53 logmsgbot: lucaswerkmeister-wmde@deploy1002 zabe and lucaswerkmeister-wmde: Backport for Fix capitalization of Subquery (T364974), Fix capitalization of Subquery (T364974) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:52 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 10:50 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Fix capitalization of Subquery (T364974), Fix capitalization of Subquery (T364974)
  • 10:49 gmodena@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 10:49 gmodena@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 10:40 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:32 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:32 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:31 cmooney@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device cloudsw1-e4-eqiad
  • 10:29 cmooney@cumin1002: START - Cookbook sre.network.tls for network device cloudsw1-e4-eqiad
  • 10:28 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 10:28 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 10:20 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 10:15 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:15 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:09 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 10:09 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 10:06 btullis@deploy1002: Finished deploy [airflow-dags/analytics@ecf603d]: (no justification provided) (duration: 00m 30s)
  • 10:06 btullis@deploy1002: Started deploy [airflow-dags/analytics@ecf603d]: (no justification provided)
  • 10:06 btullis@deploy1002: Finished deploy [airflow-dags/analytics_test@ecf603d]: (no justification provided) (duration: 00m 11s)
  • 10:06 btullis@deploy1002: Started deploy [airflow-dags/analytics_test@ecf603d]: (no justification provided)
  • 10:02 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 10:02 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:59 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:59 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:57 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:57 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:54 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:54 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:53 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kubestagemaster[2001-2002].codfw.wmnet
  • 09:53 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:53 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagemaster[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 09:52 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagemaster[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 09:50 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 09:49 claime: Manually relaunching mediawiki_job_update_special_pages_s5.service
  • 09:47 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:47 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:43 btullis@deploy1002: Finished deploy [analytics/refinery@88ed505] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@88ed505e] (duration: 02m 53s)
  • 09:43 jayme@cumin1002: START - Cookbook sre.hosts.decommission for hosts kubestagemaster[2001-2002].codfw.wmnet
  • 09:40 btullis@deploy1002: Started deploy [analytics/refinery@88ed505] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@88ed505e]
  • 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host seaborgium.wikimedia.org
  • 09:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host seaborgium.wikimedia.org
  • 09:25 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 09:22 Dreamy_Jazz: Starting MediaModeration script on group2 wikis for a test
  • 09:20 arnaudb@cumin1002: END (ERROR) - Cookbook sre.mysql.copy (exit_code=97) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 09:19 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 09:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.copy (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 09:13 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 09:11 jayme@cumin1002: conftool action : set/pooled=inactive; selector: name=kubestagemaster200[12].codfw.wmnet
  • 09:10 arnaudb@cumin1002: END (ERROR) - Cookbook sre.mysql.copy (exit_code=97) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 09:09 btullis@deploy1002: Finished deploy [analytics/refinery@88ed505] (thin): Regular analytics weekly train THIN [analytics/refinery@88ed505e] (duration: 04m 17s)
  • 09:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T352010)', diff saved to https://phabricator.wikimedia.org/P62410 and previous config saved to /var/cache/conftool/dbconfig/20240515-090522-ladsgroup.json
  • 09:05 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 09:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 09:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T352010)', diff saved to https://phabricator.wikimedia.org/P62409 and previous config saved to /var/cache/conftool/dbconfig/20240515-090458-ladsgroup.json
  • 09:04 btullis@deploy1002: Started deploy [analytics/refinery@88ed505] (thin): Regular analytics weekly train THIN [analytics/refinery@88ed505e]
  • 09:03 moritzm: upgrade seaborgium to bullseye T364823
  • 09:02 btullis@deploy1002: Finished deploy [analytics/refinery@88ed505]: Regular analytics weekly train [analytics/refinery@88ed505e] (duration: 14m 41s)
  • 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on seaborgium.wikimedia.org with reason: OS update
  • 09:00 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on seaborgium.wikimedia.org with reason: OS update
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T364299)', diff saved to https://phabricator.wikimedia.org/P62408 and previous config saved to /var/cache/conftool/dbconfig/20240515-085247-marostegui.json
  • 08:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1192.eqiad.wmnet
  • 08:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T364299)', diff saved to https://phabricator.wikimedia.org/P62407 and previous config saved to /var/cache/conftool/dbconfig/20240515-085224-marostegui.json
  • 08:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P62406 and previous config saved to /var/cache/conftool/dbconfig/20240515-084950-ladsgroup.json
  • 08:48 btullis@deploy1002: Started deploy [analytics/refinery@88ed505]: Regular analytics weekly train [analytics/refinery@88ed505e]
  • 08:42 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 08:40 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 08:40 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 08:38 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 08:38 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P62405 and previous config saved to /var/cache/conftool/dbconfig/20240515-083717-marostegui.json
  • 08:35 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.43.0-wmf.5 refs T361399
  • 08:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P62404 and previous config saved to /var/cache/conftool/dbconfig/20240515-083443-ladsgroup.json
  • 08:31 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1192.eqiad.wmnet
  • 08:30 moritzm: installing openjdk-17/jetty9 security updates on Bookworm
  • 08:30 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:29 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 08:29 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1178.eqiad.wmnet
  • 08:26 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P62403 and previous config saved to /var/cache/conftool/dbconfig/20240515-082209-marostegui.json
  • 08:21 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
  • 08:20 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
  • 08:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T352010)', diff saved to https://phabricator.wikimedia.org/P62402 and previous config saved to /var/cache/conftool/dbconfig/20240515-081934-ladsgroup.json
  • 08:17 filippo@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 08:16 filippo@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 08:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1178.eqiad.wmnet
  • 08:15 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
  • 08:13 filippo@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 08:13 kartik@deploy1002: Finished scap: Backport for Section Translation: Fix nds-nl language code (duration: 17m 14s)
  • 08:07 filippo@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T364299)', diff saved to https://phabricator.wikimedia.org/P62401 and previous config saved to /var/cache/conftool/dbconfig/20240515-080700-marostegui.json
  • 08:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1177.eqiad.wmnet
  • 08:03 moritzm: installing nodejs security updates on buster
  • 08:01 kartik@deploy1002: kartik: Continuing with sync
  • 07:59 kartik@deploy1002: kartik: Backport for Section Translation: Fix nds-nl language code synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:56 kartik@deploy1002: Started scap: Backport for Section Translation: Fix nds-nl language code
  • 07:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1177.eqiad.wmnet
  • 07:49 kartik@deploy1002: Finished scap: Backport for Enable Content/Section translation in io, nds, nds-nl and, mwl (T354666) (duration: 18m 06s)
  • 07:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1172.eqiad.wmnet
  • 07:38 moritzm: installing curl security updates
  • 07:37 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 07:37 kartik@deploy1002: kartik: Continuing with sync
  • 07:36 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 07:34 kartik@deploy1002: kartik: Backport for Enable Content/Section translation in io, nds, nds-nl and, mwl (T354666) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:34 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1172.eqiad.wmnet
  • 07:31 kartik@deploy1002: Started scap: Backport for Enable Content/Section translation in io, nds, nds-nl and, mwl (T354666)
  • 07:30 kartik@deploy1002: Sync cancelled.
  • 07:21 kartik@deploy1002: kartik: Backport for Enable Content/Section translation in io, nds, nds-nl and, mwl (T354666) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:20 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-d1-codfw
  • 07:20 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-d1-codfw
  • 07:19 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-d1-codfw
  • 07:19 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-d1-codfw
  • 07:19 kartik@deploy1002: Started scap: Backport for Enable Content/Section translation in io, nds, nds-nl and, mwl (T354666)
  • 07:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.copy (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 07:04 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 07:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1181 (T352010)', diff saved to https://phabricator.wikimedia.org/P62399 and previous config saved to /var/cache/conftool/dbconfig/20240515-002923-ladsgroup.json
  • 00:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 00:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 00:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T352010)', diff saved to https://phabricator.wikimedia.org/P62398 and previous config saved to /var/cache/conftool/dbconfig/20240515-002900-ladsgroup.json
  • 00:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P62397 and previous config saved to /var/cache/conftool/dbconfig/20240515-001352-ladsgroup.json

2024-05-14

  • 23:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P62396 and previous config saved to /var/cache/conftool/dbconfig/20240514-235844-ladsgroup.json
  • 23:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T352010)', diff saved to https://phabricator.wikimedia.org/P62395 and previous config saved to /var/cache/conftool/dbconfig/20240514-234337-ladsgroup.json
  • 22:48 zabe: start running migrateGuSalt.php in screen session # T364435
  • 22:22 zabe: zabe@mwmaint1002:/tmp/upload$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user="Yann" . # T364877
  • 22:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T364299)', diff saved to https://phabricator.wikimedia.org/P62394 and previous config saved to /var/cache/conftool/dbconfig/20240514-220640-marostegui.json
  • 22:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 22:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 22:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T364299)', diff saved to https://phabricator.wikimedia.org/P62393 and previous config saved to /var/cache/conftool/dbconfig/20240514-220617-marostegui.json
  • 21:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P62392 and previous config saved to /var/cache/conftool/dbconfig/20240514-215109-marostegui.json
  • 21:39 eileen: civicrm upgraded from c7b0dfbb to 9268acf3
  • 21:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P62391 and previous config saved to /var/cache/conftool/dbconfig/20240514-213601-marostegui.json
  • 21:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T364299)', diff saved to https://phabricator.wikimedia.org/P62390 and previous config saved to /var/cache/conftool/dbconfig/20240514-212052-marostegui.json
  • 21:00 cjming@deploy1002: Finished scap: Backport for Override VE overlays in night-mode (T363861) (duration: 18m 44s)
  • 20:49 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:48 cjming@deploy1002: cjming and jdlrobson: Continuing with sync
  • 20:44 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:44 cjming@deploy1002: cjming and jdlrobson: Backport for Override VE overlays in night-mode (T363861) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:44 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:42 cjming@deploy1002: Started scap: Backport for Override VE overlays in night-mode (T363861)
  • 20:41 cjming@deploy1002: Finished scap: Backport for cirrus: Shift 25% of public wikis writes in eqiad to replacement updater (T363475) (duration: 15m 02s)
  • 20:29 cjming@deploy1002: cjming and ebernhardson: Continuing with sync
  • 20:29 cjming@deploy1002: cjming and ebernhardson: Backport for cirrus: Shift 25% of public wikis writes in eqiad to replacement updater (T363475) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:28 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:28 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:26 cjming@deploy1002: Started scap: Backport for cirrus: Shift 25% of public wikis writes in eqiad to replacement updater (T363475)
  • 20:24 cjming@deploy1002: Finished scap: Backport for Enable night mode on Vector on testwiki, disable on Special:Homepage (T357699 T363814) (duration: 18m 40s)
  • 20:14 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@ecf603d]: update discolytics to 0.18.0 (duration: 00m 27s)
  • 20:14 ebernhardson@deploy1002: Started deploy [airflow-dags/search@ecf603d]: update discolytics to 0.18.0
  • 20:11 cjming@deploy1002: jdlrobson and cjming: Continuing with sync
  • 20:08 cjming@deploy1002: jdlrobson and cjming: Backport for Enable night mode on Vector on testwiki, disable on Special:Homepage (T357699 T363814) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:08 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply
  • 20:07 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply
  • 20:06 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 20:06 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:06 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 20:05 cjming@deploy1002: Started scap: Backport for Enable night mode on Vector on testwiki, disable on Special:Homepage (T357699 T363814)
  • 20:04 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 20:04 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 20:01 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 19:53 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply
  • 19:53 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply
  • 19:47 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:47 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:47 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:46 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:45 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
  • 19:41 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
  • 19:39 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:38 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:38 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1010 - vriley@cumin1002"
  • 19:37 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1010 - vriley@cumin1002"
  • 19:32 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 19:30 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1008.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:26 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 19:25 vriley@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['kafka-main1006']
  • 19:23 vriley@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main1006']
  • 19:19 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:18 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:18 cdanis: T364907 💔cdanis@apt1002.wikimedia.org ~ 🕞🍵 sudo -i reprepro --keepunreferencedfiles includedeb bullseye-wikimedia ~/otelcol-contrib_0.100.0_linux_amd64.deb
  • 19:18 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1008.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:17 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 19:16 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:16 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1008 - vriley@cumin1002"
  • 19:16 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1008 - vriley@cumin1002"
  • 19:13 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 18:18 sukhe: restart pybal on backup LVSes
  • 18:17 sukhe: [CORRECTION] above pybal restart was NOT run
  • 18:15 amastilovic@deploy1002: Finished deploy [airflow-dags/analytics@6270c72]: (no justification provided) (duration: 00m 34s)
  • 18:14 amastilovic@deploy1002: Started deploy [airflow-dags/analytics@6270c72]: (no justification provided)
  • 18:10 sukhe: sudo cumin -b1 -s120 'A:lvs' 'systemctl restart pybal.service': clearing up alert for reverted pybal.conf CR 1031470
  • 17:47 ejegg: donorwiki upgraded from b005071a to fa7de70f
  • 17:33 ryankemper@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
  • 17:27 ryankemper@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
  • 17:25 ryankemper@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.
  • 17:19 ryankemper@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.
  • 17:18 ryankemper@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
  • 17:12 ryankemper@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
  • 17:11 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 17:11 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 17:09 ryankemper@cumin2002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid test cluster: Roll restart of Druid jvm daemons.
  • 17:02 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1007.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:00 ryankemper@cumin2002: START - Cookbook sre.druid.roll-restart-workers for Druid test cluster: Roll restart of Druid jvm daemons.
  • 16:51 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1006.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:50 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1007.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:49 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:49 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1007 - vriley@cumin1002"
  • 16:48 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1007 - vriley@cumin1002"
  • 16:46 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 16:44 pfischer@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:41 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw2286.codfw.wmnet with reason: T364863
  • 16:40 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw2286.codfw.wmnet with reason: T364863
  • 16:39 vriley@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1006.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:39 mutante: depooled mw2286.codfw.wmnet because of interface error / needed cable replacement T364863
  • 16:38 dzahn@cumin2002: conftool action : set/pooled=no; selector: name=mw2286.codfw.wmnet
  • 16:38 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:38 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1006 - vriley@cumin1002"
  • 16:37 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1006 - vriley@cumin1002"
  • 16:34 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 16:21 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Add notheme class to Echo (T363779), Convert function to arrow function to fix context (T364783) (duration: 22m 43s)
  • 16:14 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 16:14 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 16:14 pfischer@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:12 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 16:12 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 16:12 pfischer@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:11 pfischer@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:08 logmsgbot: lucaswerkmeister-wmde@deploy1002 jdlrobson and jforrester and lucaswerkmeister-wmde: Continuing with sync
  • 16:08 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: Release v0.6.5 update to add modified wmf homer plugin - cmooney@cumin1002 - T364480
  • 16:06 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: Release v0.6.5 update to add modified wmf homer plugin - cmooney@cumin1002 - T364480
  • 16:05 logmsgbot: lucaswerkmeister-wmde@deploy1002 jdlrobson and jforrester and lucaswerkmeister-wmde: Backport for Add notheme class to Echo (T363779), Convert function to arrow function to fix context (T364783) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:58 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Add notheme class to Echo (T363779), Convert function to arrow function to fix context (T364783)
  • 15:47 jayme@cumin1002: conftool action : set/weight=10; selector: name=kubestagemaster2005.codfw.wmnet
  • 15:47 jayme@cumin1002: conftool action : set/pooled=yes; selector: name=kubestagemaster2005.codfw.wmnet
  • 15:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2195.codfw.wmnet
  • 15:34 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2195.codfw.wmnet
  • 15:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2181.codfw.wmnet
  • 15:26 pfischer@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:26 pfischer@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:26 pfischer@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:26 pfischer@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:25 pfischer@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:25 pfischer@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:25 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
  • 15:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T352010)', diff saved to https://phabricator.wikimedia.org/P62387 and previous config saved to /var/cache/conftool/dbconfig/20240514-151838-ladsgroup.json
  • 15:18 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 15:18 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 15:16 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2181.codfw.wmnet
  • 15:16 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
  • 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2167.codfw.wmnet
  • 15:13 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:13 moritzm: installing expat security updates
  • 15:13 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:12 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:12 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:11 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:11 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:05 brennen@deploy1002: Finished deploy [phabricator/deployment@7d858df]: test deploy phab2002 for T364850 (duration: 00m 50s)
  • 15:05 brennen@deploy1002: Started deploy [phabricator/deployment@7d858df]: test deploy phab2002 for T364850
  • 15:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.copy (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:04 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 15:04 brennen@deploy1002: Finished deploy [phabricator/deployment@7d858df]: test deploy phab2002 for T364850 (duration: 00m 33s)
  • 15:04 brennen@deploy1002: Started deploy [phabricator/deployment@7d858df]: test deploy phab2002 for T364850
  • 15:04 aokoth@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge update
  • 15:03 aokoth@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge update
  • 15:03 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2167.codfw.wmnet
  • 15:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2166.codfw.wmnet
  • 14:49 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2166.codfw.wmnet
  • 14:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2165.codfw.wmnet
  • 14:38 moritzm: installing dav1d security updates
  • 14:35 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2165.codfw.wmnet
  • 14:33 vgutierrez: repool cp4049
  • 14:31 vgutierrez: depool cp4049
  • 14:28 vgutierrez: repool cp4049
  • 14:24 vgutierrez: depool cp4049
  • 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2163.codfw.wmnet
  • 14:14 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.5 refs T361399
  • 14:12 vgutierrez: repool upload@ulsfo IPIP encapsulation NOT enabled - T357257
  • 14:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2163.codfw.wmnet
  • 14:07 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/opentelemetry-collector: apply
  • 14:06 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/opentelemetry-collector: apply
  • 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2162.codfw.wmnet
  • 13:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2162.codfw.wmnet
  • 13:57 Lucas_WMDE: UTC afternoon backport+config window done
  • 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2161.codfw.wmnet
  • 13:57 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Deploy disabled limited width on main page (T357706), Phase 5: Vector-2022.js should no longer load legacy Vector code (T301212) (duration: 16m 32s)
  • 13:46 vgutierrez: re-enable puppet on A:cp-upload - T357257
  • 13:44 logmsgbot: lucaswerkmeister-wmde@deploy1002 jdlrobson and ksarabia and lucaswerkmeister-wmde: Continuing with sync
  • 13:44 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2161.codfw.wmnet
  • 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2154.codfw.wmnet
  • 13:43 logmsgbot: lucaswerkmeister-wmde@deploy1002 jdlrobson and ksarabia and lucaswerkmeister-wmde: Backport for Deploy disabled limited width on main page (T357706), Phase 5: Vector-2022.js should no longer load legacy Vector code (T301212) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:42 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply
  • 13:42 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply
  • 13:40 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Deploy disabled limited width on main page (T357706), Phase 5: Vector-2022.js should no longer load legacy Vector code (T301212)
  • 13:36 vgutierrez: re-enable puppet on A:cp-text - T357257
  • 13:32 vgutierrez: disable puppet on A:cp before merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1030051 - T357257
  • 13:28 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2154.codfw.wmnet
  • 13:25 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply
  • 13:25 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply
  • 13:24 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host netmon2002.wikimedia.org
  • 13:24 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/opentelemetry-collector: apply
  • 13:24 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/opentelemetry-collector: apply
  • 13:22 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Use ConditionalUserOptions for "echo-subscriptions-email-dt-subscription" (T357221), Use ConditionalUserOptions for "discussiontools-autotopicsub" (T357221) (duration: 17m 59s)
  • 13:18 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:18 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2152.codfw.wmnet
  • 13:10 klausman@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 13:10 logmsgbot: lucaswerkmeister-wmde@deploy1002 matmarex and lucaswerkmeister-wmde: Continuing with sync
  • 13:08 ayounsi@cumin1002: START - Cookbook sre.hosts.dhcp for host netmon2002.wikimedia.org
  • 13:07 logmsgbot: lucaswerkmeister-wmde@deploy1002 matmarex and lucaswerkmeister-wmde: Backport for Use ConditionalUserOptions for "echo-subscriptions-email-dt-subscription" (T357221), Use ConditionalUserOptions for "discussiontools-autotopicsub" (T357221) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:05 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2152.codfw.wmnet
  • 13:04 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Use ConditionalUserOptions for "echo-subscriptions-email-dt-subscription" (T357221), Use ConditionalUserOptions for "discussiontools-autotopicsub" (T357221)
  • 13:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 12:00:00 on db2114.codfw.wmnet,db1125.eqiad.wmnet with reason: Testing
  • 12:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 12:00:00 on db2114.codfw.wmnet,db1125.eqiad.wmnet with reason: Testing
  • 12:59 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.copy (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 12:58 arnaudb@cumin1002: START - Cookbook sre.mysql.copy Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 12:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host serpens.wikimedia.org
  • 12:24 ladsgroup@deploy1002: Finished scap: Backport for Enable section-wide circuit breaking (T360930) (duration: 21m 12s)
  • 12:13 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P62384 and previous config saved to /var/cache/conftool/dbconfig/20240514-121326-root.json
  • 12:11 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 12:06 ladsgroup@deploy1002: ladsgroup: Backport for Enable section-wide circuit breaking (T360930) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:03 ladsgroup@deploy1002: Started scap: Backport for Enable section-wide circuit breaking (T360930)
  • 11:58 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P62383 and previous config saved to /var/cache/conftool/dbconfig/20240514-115820-root.json
  • 11:47 ladsgroup@deploy1002: Finished scap: Backport for etcd: Ignore parsercache clusters in externalLoads (T362786) (duration: 17m 22s)
  • 11:43 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P62382 and previous config saved to /var/cache/conftool/dbconfig/20240514-114314-root.json
  • 11:35 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 11:33 ladsgroup@deploy1002: ladsgroup: Backport for etcd: Ignore parsercache clusters in externalLoads (T362786) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:30 ladsgroup@deploy1002: Started scap: Backport for etcd: Ignore parsercache clusters in externalLoads (T362786)
  • 11:28 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P62381 and previous config saved to /var/cache/conftool/dbconfig/20240514-112807-root.json
  • 11:18 ladsgroup@deploy1002: Finished scap: Backport for rdbms: Fix picking the database from the LB domain (T364827) (duration: 15m 47s)
  • 11:13 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P62379 and previous config saved to /var/cache/conftool/dbconfig/20240514-111302-root.json
  • 11:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T364299)', diff saved to https://phabricator.wikimedia.org/P62378 and previous config saved to /var/cache/conftool/dbconfig/20240514-110704-marostegui.json
  • 11:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 11:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 11:05 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 11:05 ladsgroup@deploy1002: ladsgroup: Backport for rdbms: Fix picking the database from the LB domain (T364827) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:02 ladsgroup@deploy1002: Started scap: Backport for rdbms: Fix picking the database from the LB domain (T364827)
  • 10:17 jayme@cumin1002: conftool action : set/weight=10; selector: name=kubestagemaster2004.codfw.wmnet
  • 10:17 jayme@cumin1002: conftool action : set/pooled=yes; selector: name=kubestagemaster2004.codfw.wmnet
  • 10:12 hashar@deploy1002: rebuilt and synchronized wikiversions files: Revert "group0 wikis to 1.43.0-wmf.5" - T361399
  • 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
  • 09:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on 6 hosts with reason: Checking RO status
  • 09:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:05:00 on 6 hosts with reason: Checking RO status
  • 09:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on 6 hosts with reason: Primary switchover es4 T364451
  • 09:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 0:05:00 on 6 hosts with reason: Primary switchover es4 T364451
  • 09:50 marostegui@deploy1002: Finished scap: Backport for db-production.php: Make es4 and es5 RO (T364447) (duration: 15m 28s)
  • 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
  • 09:50 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
  • 09:49 jayme@cumin1002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
  • 09:49 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1004.eqiad.wmnet to plain
  • 09:48 jayme@cumin1002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1004.eqiad.wmnet to plain
  • 09:48 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1003.eqiad.wmnet to plain
  • 09:47 jayme@cumin1002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1003.eqiad.wmnet to plain
  • 09:47 jayme@cumin1002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of kubestagemaster1003.eqiad.wmnet to plain
  • 09:47 jayme@cumin1002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1003.eqiad.wmnet to plain
  • 09:45 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host kubestagemaster1005.eqiad.wmnet
  • 09:45 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster1005.eqiad.wmnet with OS bullseye
  • 09:37 marostegui@deploy1002: marostegui: Continuing with sync
  • 09:37 marostegui@deploy1002: marostegui: Backport for db-production.php: Make es4 and es5 RO (T364447) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:35 marostegui@deploy1002: Started scap: Backport for db-production.php: Make es4 and es5 RO (T364447)
  • 09:31 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster1005.eqiad.wmnet with reason: host reimage
  • 09:27 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster1005.eqiad.wmnet with reason: host reimage
  • 09:24 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.5 refs T361399
  • 09:20 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host kubestagemaster1004.eqiad.wmnet
  • 09:20 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster1004.eqiad.wmnet with OS bullseye
  • 09:18 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host kubestagemaster1003.eqiad.wmnet
  • 09:18 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster1003.eqiad.wmnet with OS bullseye
  • 09:14 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster1005.eqiad.wmnet with OS bullseye
  • 09:06 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster1004.eqiad.wmnet with reason: host reimage
  • 09:04 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster1003.eqiad.wmnet with reason: host reimage
  • 09:02 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster1004.eqiad.wmnet with reason: host reimage
  • 09:02 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster1003.eqiad.wmnet with reason: host reimage
  • 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on serpens.wikimedia.org with reason: OS update
  • 08:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on serpens.wikimedia.org with reason: OS update
  • 08:57 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster1005.eqiad.wmnet - jayme@cumin1002"
  • 08:54 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster1005.eqiad.wmnet - jayme@cumin1002"
  • 08:54 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubestagemaster1005.eqiad.wmnet on all recursors
  • 08:54 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache kubestagemaster1005.eqiad.wmnet on all recursors
  • 08:54 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:54 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster1005.eqiad.wmnet - jayme@cumin1002"
  • 08:52 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster1005.eqiad.wmnet - jayme@cumin1002"
  • 08:52 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster1004.eqiad.wmnet with OS bullseye
  • 08:50 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster1004.eqiad.wmnet - jayme@cumin1002"
  • 08:49 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster1004.eqiad.wmnet - jayme@cumin1002"
  • 08:49 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster1003.eqiad.wmnet with OS bullseye
  • 08:49 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:49 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubestagemaster1004.eqiad.wmnet on all recursors
  • 08:49 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache kubestagemaster1004.eqiad.wmnet on all recursors
  • 08:49 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:49 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster1004.eqiad.wmnet - jayme@cumin1002"
  • 08:48 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster1004.eqiad.wmnet - jayme@cumin1002"
  • 08:48 jayme@cumin1002: START - Cookbook sre.ganeti.makevm for new host kubestagemaster1005.eqiad.wmnet
  • 08:47 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster1003.eqiad.wmnet - jayme@cumin1002"
  • 08:46 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster1003.eqiad.wmnet - jayme@cumin1002"
  • 08:45 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:45 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubestagemaster1003.eqiad.wmnet on all recursors
  • 08:45 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache kubestagemaster1003.eqiad.wmnet on all recursors
  • 08:45 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:45 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster1003.eqiad.wmnet - jayme@cumin1002"
  • 08:44 jayme@cumin1002: START - Cookbook sre.ganeti.makevm for new host kubestagemaster1004.eqiad.wmnet
  • 08:44 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster1003.eqiad.wmnet - jayme@cumin1002"
  • 08:43 dcausse@deploy1002: Finished scap: Backport for Fix the loss of ParserOutput pointer in ContentDOMTransformStages (T364597) (duration: 16m 17s)
  • 08:41 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:41 jayme@cumin1002: START - Cookbook sre.ganeti.makevm for new host kubestagemaster1003.eqiad.wmnet
  • 08:30 dcausse@deploy1002: dcausse and cscott: Continuing with sync
  • 08:29 dcausse@deploy1002: dcausse and cscott: Backport for Fix the loss of ParserOutput pointer in ContentDOMTransformStages (T364597) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:26 dcausse@deploy1002: Started scap: Backport for Fix the loss of ParserOutput pointer in ContentDOMTransformStages (T364597)
  • 08:22 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bdgreenlee out of all services on: 2208 hosts
  • 08:21 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Bdgreenlee out of all services on: 2208 hosts
  • 08:15 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kubestagemaster2005.codfw.wmnet with OS bullseye
  • 07:57 kartik@deploy1002: Finished scap: Backport for CX: Add mw.cx.UserPermissionChecker (T349959) (duration: 17m 52s)
  • 07:55 klausman@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 07:54 klausman@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 07:54 moritzm: installing PHP 7.3 security updates
  • 07:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 215887
  • 07:53 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 215887
  • 07:46 moritzm: installing libgd2 security updates
  • 07:44 kartik@deploy1002: kartik: Continuing with sync
  • 07:42 kartik@deploy1002: kartik: Backport for CX: Add mw.cx.UserPermissionChecker (T349959) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:39 kartik@deploy1002: Started scap: Backport for CX: Add mw.cx.UserPermissionChecker (T349959)
  • 07:27 kartik@deploy1002: Finished scap: Backport for Set $wgSignatureValidation to 'disallow' on Polish Wikipedia (T364769) (duration: 18m 28s)
  • 07:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2185.codfw.wmnet with OS bookworm
  • 07:15 kartik@deploy1002: kartik and msz2001: Continuing with sync
  • 07:12 kartik@deploy1002: kartik and msz2001: Backport for Set $wgSignatureValidation to 'disallow' on Polish Wikipedia (T364769) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:09 kartik@deploy1002: Started scap: Backport for Set $wgSignatureValidation to 'disallow' on Polish Wikipedia (T364769)
  • 07:04 moritzm: installing glib2.0 security updates
  • 06:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2185.codfw.wmnet with reason: host reimage
  • 06:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2185.codfw.wmnet with reason: host reimage
  • 06:35 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2185.codfw.wmnet with OS bookworm
  • 06:33 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host db2185.codfw.wmnet with OS bookworm
  • 06:33 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2185.codfw.wmnet with OS bookworm
  • 05:31 kart_: Updated cxserver to 2024-04-23-221507-production (T363263, T333969, T360303, T360310)
  • 05:25 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:24 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:22 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:22 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:19 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:19 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 05:15 kart_: Updated MinT to 2024-03-28-061726-production (T333969)
  • 05:08 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
  • 04:59 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
  • 04:33 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
  • 04:25 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
  • 04:18 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 04:14 kartik@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 04:00 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.43.0-wmf.5 refs T361399 (duration: 57m 45s)
  • 03:03 mwpresync@deploy1002: Started scap: testwikis wikis to 1.43.0-wmf.5 refs T361399
  • 02:34 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 02:34 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 02:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T352010)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240514-023316-ladsgroup.json
  • 02:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P62375 and previous config saved to /var/cache/conftool/dbconfig/20240514-021809-ladsgroup.json
  • 02:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P62374 and previous config saved to /var/cache/conftool/dbconfig/20240514-020301-ladsgroup.json
  • 01:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T352010)', diff saved to https://phabricator.wikimedia.org/P62373 and previous config saved to /var/cache/conftool/dbconfig/20240514-014753-ladsgroup.json
  • 01:18 ejegg: fundraising civicrm upgraded from c854dd3a to c7b0dfbb
  • 00:35 tstarling@deploy1002: Finished scap: Fix SecurePoll exception T209892 and CodeMirror 5 RTL T363752 (duration: 14m 56s)
  • 00:20 tstarling@deploy1002: Started scap: Fix SecurePoll exception T209892 and CodeMirror 5 RTL T363752
  • 00:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T364299)', diff saved to https://phabricator.wikimedia.org/P62372 and previous config saved to /var/cache/conftool/dbconfig/20240514-001956-marostegui.json
  • 00:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 00:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance

2024-05-13

  • 22:55 bking@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=elastic110[5|7]\.eqiad\.wmnet
  • 22:43 ryankemper@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: T363975 eqiad cluster restart - ryankemper@cumin2002 - T363975
  • 22:30 zabe: zabe@mwmaint1002:~$ mwscript cleanupTitles.php itwikivoyage # T298315
  • 22:27 bking@cumin2002: conftool action : set/weight=10:pooled=no; selector: name=elastic110[5|7]\.eqiad\.wmnet
  • 21:47 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: T363975 eqiad cluster restart - ryankemper@cumin2002 - T363975
  • 21:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
  • 21:39 ryankemper@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
  • 21:39 eileen: civicrm upgraded from 447e1472 to c854dd3a
  • 21:32 ryankemper@cumin2002: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:datahubsearch
  • 21:32 ebernhardson@deploy1002: Finished scap: Backport for Unbreak link buttons (T364062) (duration: 22m 00s)
  • 21:22 ryankemper@cumin2002: START - Cookbook sre.opensearch.roll-restart-reboot rolling restart_daemons on A:datahubsearch
  • 21:20 ebernhardson@deploy1002: jdlrobson and ebernhardson: Continuing with sync
  • 21:12 ebernhardson@deploy1002: jdlrobson and ebernhardson: Backport for Unbreak link buttons (T364062) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:10 ebernhardson@deploy1002: Started scap: Backport for Unbreak link buttons (T364062)
  • 20:57 ebernhardson@deploy1002: Finished scap: Backport for IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath (T361884) (duration: 17m 22s)
  • 20:45 ebernhardson@deploy1002: ebernhardson and tchanders: Continuing with sync
  • 20:42 ebernhardson@deploy1002: ebernhardson and tchanders: Backport for IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath (T361884) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:40 ebernhardson@deploy1002: Started scap: Backport for IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath (T361884)
  • 20:38 ebernhardson@deploy1002: Finished scap: Backport for Remove old CampaignEvents DB config (prod) (T348281) (duration: 21m 14s)
  • 20:29 eileen: civicrm upgraded from 4f55a7cf to 447e1472
  • 20:25 ebernhardson@deploy1002: ebernhardson and daimona: Continuing with sync
  • 20:19 ebernhardson@deploy1002: ebernhardson and daimona: Backport for Remove old CampaignEvents DB config (prod) (T348281) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:17 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/opentelemetry-collector: apply
  • 20:17 ebernhardson@deploy1002: Started scap: Backport for Remove old CampaignEvents DB config (prod) (T348281)
  • 19:57 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/opentelemetry-collector: apply
  • 19:47 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/opentelemetry-collector: apply
  • 19:27 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/opentelemetry-collector: apply
  • 19:26 cdanis@deploy1002: helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply
  • 19:26 cdanis@deploy1002: helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply
  • 18:49 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-eqiad
  • 18:49 ryankemper@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
  • 18:30 ryankemper@cumin2002: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
  • 18:24 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-eqiad
  • 18:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
  • 18:20 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply
  • 18:19 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply
  • 18:08 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply
  • 18:07 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply
  • 18:04 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
  • 18:03 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
  • 17:40 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:40 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for new linknets on codfw spines - cmooney@cumin1002"
  • 17:39 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for new linknets on codfw spines - cmooney@cumin1002"
  • 17:38 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:38 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:37 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 17:27 ryankemper: T363973 [Kafka] Restarting `jumbo-eqiad` brokers, followed by mirror maker
  • 17:27 ryankemper@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
  • 17:05 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
  • 17:02 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
  • 16:50 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster2005.codfw.wmnet with OS bullseye
  • 16:49 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to plain
  • 16:47 jayme@cumin1002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to plain
  • 16:47 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2004.codfw.wmnet to plain
  • 16:46 jayme@cumin1002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2004.codfw.wmnet to plain
  • 16:46 ejegg: fundraising civicrm upgraded from c0d2fa95 to 4f55a7cf
  • 16:46 jayme@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host kubestagemaster2005.codfw.wmnet
  • 16:46 jayme@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kubestagemaster2005.codfw.wmnet with OS bullseye
  • 16:34 brouberol@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: JVM restart - brouberol@cumin2002 - T363975
  • 16:16 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 13m 47s)
  • 16:13 ejegg: restarted fundraising scheduled jobs
  • 16:11 ejegg: fundraising civicrm rolled back from 3fef5849 to c0d2fa95
  • 16:02 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 14m 23s)
  • 15:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T352010)', diff saved to https://phabricator.wikimedia.org/P62370 and previous config saved to /var/cache/conftool/dbconfig/20240513-155911-ladsgroup.json
  • 15:59 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 15:58 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 15:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62369 and previous config saved to /var/cache/conftool/dbconfig/20240513-155849-ladsgroup.json
  • 15:55 ejegg: fundraising civicrm upgraded from c0d2fa95 to 3fef5849
  • 15:54 ejegg: disabled fundraising scheduled jobs for CiviCRM deploy
  • 15:49 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-codfw
  • 15:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P62368 and previous config saved to /var/cache/conftool/dbconfig/20240513-154341-ladsgroup.json
  • 15:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P62367 and previous config saved to /var/cache/conftool/dbconfig/20240513-152833-ladsgroup.json
  • 15:27 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 15:25 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-codfw
  • 15:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 15:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 15:18 brouberol@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: JVM restart - brouberol@cumin2002 - T363975
  • 15:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62366 and previous config saved to /var/cache/conftool/dbconfig/20240513-151325-ladsgroup.json
  • 14:55 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:50 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Include mw-jobrunner port in host header check (duration: 16m 04s)
  • 14:49 jayme@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host kubestagemaster2004.codfw.wmnet
  • 14:49 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster2004.codfw.wmnet with OS bullseye
  • 14:42 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
  • 14:39 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage
  • 14:37 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and hnowlan: Continuing with sync
  • 14:36 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and hnowlan: Backport for Include mw-jobrunner port in host header check synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:35 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2004.codfw.wmnet with reason: host reimage
  • 14:34 mutante: CI - switch over to other contint server finished - T334517
  • 14:34 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Include mw-jobrunner port in host header check
  • 14:32 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync
  • 14:32 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2004.codfw.wmnet with reason: host reimage
  • 14:32 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync
  • 14:25 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster2005.codfw.wmnet with OS bullseye
  • 14:22 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster2005.codfw.wmnet - jayme@cumin1002"
  • 14:19 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster2005.codfw.wmnet - jayme@cumin1002"
  • 14:19 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubestagemaster2005.codfw.wmnet on all recursors
  • 14:19 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache kubestagemaster2005.codfw.wmnet on all recursors
  • 14:19 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:19 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster2005.codfw.wmnet - jayme@cumin1002"
  • 14:18 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster2005.codfw.wmnet - jayme@cumin1002"
  • 14:18 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestagemaster2004.codfw.wmnet with OS bullseye
  • 14:17 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster2004.codfw.wmnet - jayme@cumin1002"
  • 14:16 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kubestagemaster2004.codfw.wmnet - jayme@cumin1002"
  • 14:16 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubestagemaster2004.codfw.wmnet on all recursors
  • 14:15 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 14:15 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache kubestagemaster2004.codfw.wmnet on all recursors
  • 14:15 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:15 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster2004.codfw.wmnet - jayme@cumin1002"
  • 14:15 mutante: CI - migration in progress - stopping jenkins and zuul (T334517)
  • 14:15 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kubestagemaster2004.codfw.wmnet - jayme@cumin1002"
  • 14:13 jayme@cumin1002: START - Cookbook sre.ganeti.makevm for new host kubestagemaster2005.codfw.wmnet
  • 14:12 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 14:12 jayme@cumin1002: START - Cookbook sre.ganeti.makevm for new host kubestagemaster2004.codfw.wmnet
  • 14:12 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Enable async upload-by-URL via jobqueue on testwiki (T295007) (duration: 25m 09s)
  • 14:09 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint1002.wikimedia.org with reason: T334517
  • 14:09 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on contint1002.wikimedia.org with reason: T334517
  • 14:09 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint2002.wikimedia.org with reason: T334517
  • 14:09 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on contint2002.wikimedia.org with reason: T334517
  • 14:05 brouberol@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: JVM restart - brouberol@cumin2002 - T363975
  • 14:00 logmsgbot: lucaswerkmeister-wmde@deploy1002 hnowlan and lucaswerkmeister-wmde: Continuing with sync
  • 13:59 brouberol@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: JVM restart - brouberol@cumin2002 - T363975
  • 13:56 brouberol@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: JVM restart - brouberol@cumin2002 - T363975
  • 13:56 brouberol@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: JVM restart - brouberol@cumin2002 - T363975
  • 13:49 logmsgbot: lucaswerkmeister-wmde@deploy1002 hnowlan and lucaswerkmeister-wmde: Backport for Enable async upload-by-URL via jobqueue on testwiki (T295007) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T360332)', diff saved to https://phabricator.wikimedia.org/P62363 and previous config saved to /var/cache/conftool/dbconfig/20240513-134852-arnaudb.json
  • 13:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T360332)', diff saved to https://phabricator.wikimedia.org/P62362 and previous config saved to /var/cache/conftool/dbconfig/20240513-134721-arnaudb.json
  • 13:47 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Enable async upload-by-URL via jobqueue on testwiki (T295007)
  • 13:47 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:45 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for specials: Fix "include templates" query builder for Special:Export (T364554), ArticleTarget: Fix return of getVisualDiffGeneratorPromise (T364635) (duration: 16m 04s)
  • 13:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:37 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P62361 and previous config saved to /var/cache/conftool/dbconfig/20240513-133345-arnaudb.json
  • 13:33 logmsgbot: lucaswerkmeister-wmde@deploy1002 umherirrender and lucaswerkmeister-wmde and matmarex: Continuing with sync
  • 13:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P62360 and previous config saved to /var/cache/conftool/dbconfig/20240513-133214-arnaudb.json
  • 13:32 logmsgbot: lucaswerkmeister-wmde@deploy1002 umherirrender and lucaswerkmeister-wmde and matmarex: Backport for specials: Fix "include templates" query builder for Special:Export (T364554), ArticleTarget: Fix return of getVisualDiffGeneratorPromise (T364635) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:29 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for specials: Fix "include templates" query builder for Special:Export (T364554), ArticleTarget: Fix return of getVisualDiffGeneratorPromise (T364635)
  • 13:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P62359 and previous config saved to /var/cache/conftool/dbconfig/20240513-131837-arnaudb.json
  • 13:17 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P62358 and previous config saved to /var/cache/conftool/dbconfig/20240513-131706-arnaudb.json
  • 13:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 13:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 13:11 filippo@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 13:11 filippo@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 13:07 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:07 filippo@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 13:07 filippo@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 13:05 filippo@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 13:05 filippo@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 13:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T360332)', diff saved to https://phabricator.wikimedia.org/P62357 and previous config saved to /var/cache/conftool/dbconfig/20240513-130329-arnaudb.json
  • 13:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T360332)', diff saved to https://phabricator.wikimedia.org/P62356 and previous config saved to /var/cache/conftool/dbconfig/20240513-130158-arnaudb.json
  • 13:01 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance
  • 13:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance
  • 13:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1181 (T360332)', diff saved to https://phabricator.wikimedia.org/P62355 and previous config saved to /var/cache/conftool/dbconfig/20240513-130049-arnaudb.json
  • 13:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 13:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 12:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2165 (T360332)', diff saved to https://phabricator.wikimedia.org/P62354 and previous config saved to /var/cache/conftool/dbconfig/20240513-125940-arnaudb.json
  • 12:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 12:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 12:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 12:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 12:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T364299)', diff saved to https://phabricator.wikimedia.org/P62353 and previous config saved to /var/cache/conftool/dbconfig/20240513-124752-marostegui.json
  • 12:44 brouberol@cumin2002: END (PASS) - Cookbook sre.apifeatureusage.roll-restart-reboot-logstash (exit_code=0) rolling restart_daemons on A:apifeatureusage
  • 12:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P62351 and previous config saved to /var/cache/conftool/dbconfig/20240513-121737-marostegui.json
  • 12:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T364299)', diff saved to https://phabricator.wikimedia.org/P62350 and previous config saved to /var/cache/conftool/dbconfig/20240513-120229-marostegui.json
  • 11:58 hashar: Restarted CI Jenkins to update the Parameterized build plugin | T336782
  • 11:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213 (T364299)', diff saved to https://phabricator.wikimedia.org/P62349 and previous config saved to /var/cache/conftool/dbconfig/20240513-113215-marostegui.json
  • 11:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 11:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 11:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T364299)', diff saved to https://phabricator.wikimedia.org/P62348 and previous config saved to /var/cache/conftool/dbconfig/20240513-113152-marostegui.json
  • 11:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P62347 and previous config saved to /var/cache/conftool/dbconfig/20240513-111644-marostegui.json
  • 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
  • 11:04 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
  • 11:04 moritzm: installing tomcat9 security updates
  • 11:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P62346 and previous config saved to /var/cache/conftool/dbconfig/20240513-110137-marostegui.json
  • 10:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T364299)', diff saved to https://phabricator.wikimedia.org/P62345 and previous config saved to /var/cache/conftool/dbconfig/20240513-104627-marostegui.json
  • 10:37 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
  • 10:32 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
  • 10:19 moritzm: installing expat security updates
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T364299)', diff saved to https://phabricator.wikimedia.org/P62343 and previous config saved to /var/cache/conftool/dbconfig/20240513-101748-marostegui.json
  • 10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T364299)', diff saved to https://phabricator.wikimedia.org/P62342 and previous config saved to /var/cache/conftool/dbconfig/20240513-101724-marostegui.json
  • 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P62341 and previous config saved to /var/cache/conftool/dbconfig/20240513-100216-marostegui.json
  • 09:47 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
  • 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P62340 and previous config saved to /var/cache/conftool/dbconfig/20240513-094709-marostegui.json
  • 09:46 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: sync
  • 09:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS bookworm
  • 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T364299)', diff saved to https://phabricator.wikimedia.org/P62338 and previous config saved to /var/cache/conftool/dbconfig/20240513-093200-marostegui.json
  • 09:28 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 09:27 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 09:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
  • 09:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
  • 09:05 jynus: deploy new stat grants at m1:dbbackups T362509
  • 09:03 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS bookworm
  • 09:02 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2184.codfw.wmnet with OS bookworm
  • 09:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T364299)', diff saved to https://phabricator.wikimedia.org/P62337 and previous config saved to /var/cache/conftool/dbconfig/20240513-090035-marostegui.json
  • 09:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 09:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 09:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T364299)', diff saved to https://phabricator.wikimedia.org/P62336 and previous config saved to /var/cache/conftool/dbconfig/20240513-090011-marostegui.json
  • 09:00 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts snapshot1009.eqiad.wmnet
  • 09:00 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:00 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: snapshot1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 08:58 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: snapshot1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 08:56 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 08:53 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS bookworm
  • 08:51 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts snapshot1009.eqiad.wmnet
  • 08:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P62335 and previous config saved to /var/cache/conftool/dbconfig/20240513-084503-marostegui.json
  • 08:45 marostegui@deploy1002: Finished scap: Backport for db-production.php: Enable writes on es6 and es7 (T364446) (duration: 44m 00s)
  • 08:32 marostegui@deploy1002: marostegui: Continuing with sync
  • 08:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P62334 and previous config saved to /var/cache/conftool/dbconfig/20240513-082956-marostegui.json
  • 08:24 moritzm: installing PHP 7.3 security updates
  • 08:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T364299)', diff saved to https://phabricator.wikimedia.org/P62333 and previous config saved to /var/cache/conftool/dbconfig/20240513-081448-marostegui.json
  • 08:03 marostegui@deploy1002: marostegui: Backport for db-production.php: Enable writes on es6 and es7 (T364446) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:01 marostegui@deploy1002: Started scap: Backport for db-production.php: Enable writes on es6 and es7 (T364446)
  • 08:00 moritzm: installing python2.7 security updates
  • 07:58 ladsgroup@deploy1002: Finished scap: Backport for Fix static cache access (T364693) (duration: 16m 54s)
  • 07:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 17451
  • 07:53 moritzm: installing libgd2 security updates
  • 07:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62332 and previous config saved to /var/cache/conftool/dbconfig/20240513-075256-root.json
  • 07:46 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 07:44 brouberol@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 07:44 ladsgroup@deploy1002: ladsgroup: Backport for Fix static cache access (T364693) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:41 ladsgroup@deploy1002: Started scap: Backport for Fix static cache access (T364693)
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T364299)', diff saved to https://phabricator.wikimedia.org/P62331 and previous config saved to /var/cache/conftool/dbconfig/20240513-074103-marostegui.json
  • 07:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 07:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T364299)', diff saved to https://phabricator.wikimedia.org/P62330 and previous config saved to /var/cache/conftool/dbconfig/20240513-074041-marostegui.json
  • 07:38 brouberol@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62329 and previous config saved to /var/cache/conftool/dbconfig/20240513-073750-root.json
  • 07:37 kartik@deploy1002: Finished scap: Backport for ContentTranslation: Update publishing setting for cswiki (T353049) (duration: 32m 03s)
  • 07:35 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 17451
  • 07:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62328 and previous config saved to /var/cache/conftool/dbconfig/20240513-073031-ladsgroup.json
  • 07:30 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 07:30 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 07:30 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 07:30 brouberol@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 07:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P62327 and previous config saved to /var/cache/conftool/dbconfig/20240513-072533-marostegui.json
  • 07:23 brouberol@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 07:23 kartik@deploy1002: kartik: Continuing with sync
  • 07:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62326 and previous config saved to /var/cache/conftool/dbconfig/20240513-072244-root.json
  • 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wmcs::openstack::eqiad1::instance_backups
  • 07:19 kartik@deploy1002: kartik: Backport for ContentTranslation: Update publishing setting for cswiki (T353049) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wmcs::openstack::eqiad1::instance_backups
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P62325 and previous config saved to /var/cache/conftool/dbconfig/20240513-071026-marostegui.json
  • 07:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cloudbackup1004.eqiad.wmnet
  • 07:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62324 and previous config saved to /var/cache/conftool/dbconfig/20240513-070738-root.json
  • 07:05 kartik@deploy1002: Started scap: Backport for ContentTranslation: Update publishing setting for cswiki (T353049)
  • 06:59 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host cloudbackup1004.eqiad.wmnet
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T364299)', diff saved to https://phabricator.wikimedia.org/P62323 and previous config saved to /var/cache/conftool/dbconfig/20240513-065518-marostegui.json
  • 06:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62322 and previous config saved to /var/cache/conftool/dbconfig/20240513-065230-root.json
  • 06:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS bookworm
  • 06:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62321 and previous config saved to /var/cache/conftool/dbconfig/20240513-063724-root.json
  • 06:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
  • 06:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
  • 06:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62320 and previous config saved to /var/cache/conftool/dbconfig/20240513-062219-root.json
  • 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1183 (T364299)', diff saved to https://phabricator.wikimedia.org/P62319 and previous config saved to /var/cache/conftool/dbconfig/20240513-062129-marostegui.json
  • 06:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 06:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T364299)', diff saved to https://phabricator.wikimedia.org/P62318 and previous config saved to /var/cache/conftool/dbconfig/20240513-062117-marostegui.json
  • 06:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: Reimage of the master
  • 06:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: Reimage of the master
  • 06:07 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS bookworm
  • 06:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: Reimage
  • 06:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: Reimage
  • 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P62317 and previous config saved to /var/cache/conftool/dbconfig/20240513-060610-marostegui.json
  • 06:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2213.codfw.wmnet with reason: Schema change
  • 06:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2213.codfw.wmnet with reason: Schema change
  • 05:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2213.codfw.wmnet with reason: Schema change
  • 05:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2213.codfw.wmnet with reason: Schema change
  • 05:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P62316 and previous config saved to /var/cache/conftool/dbconfig/20240513-055102-marostegui.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2213 T364703', diff saved to https://phabricator.wikimedia.org/P62315 and previous config saved to /var/cache/conftool/dbconfig/20240513-054841-root.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2123 to s5 primary T364703', diff saved to https://phabricator.wikimedia.org/P62314 and previous config saved to /var/cache/conftool/dbconfig/20240513-054802-root.json
  • 05:47 marostegui: Starting s5 codfw failover from db2213 to db2123 - T364703
  • 05:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T364299)', diff saved to https://phabricator.wikimedia.org/P62313 and previous config saved to /var/cache/conftool/dbconfig/20240513-053553-marostegui.json
  • 05:24 marostegui@cumin1002: dbctl commit (dc=all): 'Remove vslow from db2123 T364703', diff saved to https://phabricator.wikimedia.org/P62312 and previous config saved to /var/cache/conftool/dbconfig/20240513-052424-marostegui.json
  • 05:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Primary switchover s5 T364703
  • 05:23 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2123 with weight 0 T364703', diff saved to https://phabricator.wikimedia.org/P62311 and previous config saved to /var/cache/conftool/dbconfig/20240513-052304-root.json
  • 05:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Primary switchover s5 T364703
  • 05:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T364299)', diff saved to https://phabricator.wikimedia.org/P62310 and previous config saved to /var/cache/conftool/dbconfig/20240513-050237-marostegui.json
  • 05:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 05:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 05:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 05:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 03:21 cwhite: restart apache2 on phab1004
  • 01:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T352010)', diff saved to https://phabricator.wikimedia.org/P62309 and previous config saved to /var/cache/conftool/dbconfig/20240513-014623-ladsgroup.json
  • 01:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P62308 and previous config saved to /var/cache/conftool/dbconfig/20240513-013113-ladsgroup.json
  • 01:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P62307 and previous config saved to /var/cache/conftool/dbconfig/20240513-011605-ladsgroup.json
  • 01:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T352010)', diff saved to https://phabricator.wikimedia.org/P62306 and previous config saved to /var/cache/conftool/dbconfig/20240513-010055-ladsgroup.json
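
Note: the db2213 entries above raise the pooled percentage through 1, 5, 10, 25, 50 and 75 at roughly 15-minute intervals (06:22 to 07:37). The following is a minimal sketch of that timing only, assuming a fixed 15-minute step. The function name `repool_schedule` is hypothetical and this is not the tooling that produced the entries.

```python
from datetime import datetime, timedelta

def repool_schedule(start, percentages, interval_minutes=15):
    """Return (time, percentage) pairs for a staged repool.

    Hypothetical helper: it only models the timing visible in the log,
    it does not talk to dbctl or any other tool.
    """
    return [
        (start + timedelta(minutes=i * interval_minutes), pct)
        for i, pct in enumerate(percentages)
    ]

# Start time and percentages are taken from the db2213 entries of 2024-05-13.
steps = repool_schedule(datetime(2024, 5, 13, 6, 22), [1, 5, 10, 25, 50, 75])
for when, pct in steps:
    print(f"{when:%H:%M} pool at {pct}%")
```

The logged timestamps match this schedule to the minute; the seconds in the saved config file names drift slightly, so the fixed interval is an approximation.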

2024-05-12

  • 19:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2209 (T352010)', diff saved to https://phabricator.wikimedia.org/P62305 and previous config saved to /var/cache/conftool/dbconfig/20240512-195220-ladsgroup.json
  • 19:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 19:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 19:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62304 and previous config saved to /var/cache/conftool/dbconfig/20240512-195156-ladsgroup.json
  • 19:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P62303 and previous config saved to /var/cache/conftool/dbconfig/20240512-193645-ladsgroup.json
  • 19:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P62302 and previous config saved to /var/cache/conftool/dbconfig/20240512-192137-ladsgroup.json
  • 19:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62301 and previous config saved to /var/cache/conftool/dbconfig/20240512-190629-ladsgroup.json
  • 13:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62300 and previous config saved to /var/cache/conftool/dbconfig/20240512-134125-ladsgroup.json
  • 13:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 13:41 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 13:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T352010)', diff saved to https://phabricator.wikimedia.org/P62299 and previous config saved to /var/cache/conftool/dbconfig/20240512-134101-ladsgroup.json
  • 13:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P62298 and previous config saved to /var/cache/conftool/dbconfig/20240512-132554-ladsgroup.json
  • 13:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P62297 and previous config saved to /var/cache/conftool/dbconfig/20240512-131046-ladsgroup.json
  • 12:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T352010)', diff saved to https://phabricator.wikimedia.org/P62296 and previous config saved to /var/cache/conftool/dbconfig/20240512-125539-ladsgroup.json
  • 07:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T352010)', diff saved to https://phabricator.wikimedia.org/P62295 and previous config saved to /var/cache/conftool/dbconfig/20240512-072559-ladsgroup.json
  • 07:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 07:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 07:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T352010)', diff saved to https://phabricator.wikimedia.org/P62294 and previous config saved to /var/cache/conftool/dbconfig/20240512-072534-ladsgroup.json
  • 07:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P62293 and previous config saved to /var/cache/conftool/dbconfig/20240512-071026-ladsgroup.json
  • 06:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P62292 and previous config saved to /var/cache/conftool/dbconfig/20240512-065519-ladsgroup.json
  • 06:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T352010)', diff saved to https://phabricator.wikimedia.org/P62291 and previous config saved to /var/cache/conftool/dbconfig/20240512-064011-ladsgroup.json
  • 00:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T352010)', diff saved to https://phabricator.wikimedia.org/P62290 and previous config saved to /var/cache/conftool/dbconfig/20240512-000104-ladsgroup.json
  • 00:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 00:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 00:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T352010)', diff saved to https://phabricator.wikimedia.org/P62289 and previous config saved to /var/cache/conftool/dbconfig/20240512-000040-ladsgroup.json
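
Note: the dbctl commit entries above share one fixed shape (time, user@host, commit message, diff paste URL, saved config path). A minimal parsing sketch, assuming that shape holds; the names `DBCTL_RE` and `parse_dbctl` are hypothetical and the pattern is inferred from the entries in this log, not from dbctl itself.

```python
import re

# Pattern inferred from the log entries; an assumption, not a dbctl specification.
DBCTL_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}) (?P<user>[\w-]+)@(?P<host>\w+): "
    r"dbctl commit \(dc=(?P<dc>\w+)\): '(?P<message>[^']*)', "
    r"diff saved to (?P<diff>\S+) and previous config saved to (?P<backup>\S+)"
)

def parse_dbctl(line):
    """Return the fields of a dbctl commit entry as a dict, or None if no match."""
    match = DBCTL_RE.search(line)
    return match.groupdict() if match else None

line = (
    "00:01 ladsgroup@cumin1002: dbctl commit (dc=all): "
    "'Depooling db2177 (T352010)', diff saved to "
    "https://phabricator.wikimedia.org/P62290 and previous config saved to "
    "/var/cache/conftool/dbconfig/20240512-000104-ladsgroup.json"
)
entry = parse_dbctl(line)
print(entry["message"], entry["diff"])
```

Entries where the diff could not be uploaded do not carry a paste URL, so they do not match this pattern and `parse_dbctl` returns None for them.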

2024-05-11

  • 23:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P62288 and previous config saved to /var/cache/conftool/dbconfig/20240511-234532-ladsgroup.json
  • 23:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P62287 and previous config saved to /var/cache/conftool/dbconfig/20240511-233023-ladsgroup.json
  • 23:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T352010)', diff saved to https://phabricator.wikimedia.org/P62286 and previous config saved to /var/cache/conftool/dbconfig/20240511-231515-ladsgroup.json
  • 16:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T352010)', diff not saved (unable to send diff to phaste) and previous config saved to /var/cache/conftool/dbconfig/20240511-163653-ladsgroup.json
  • 16:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 16:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 16:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 16:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 16:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P62284 and previous config saved to /var/cache/conftool/dbconfig/20240511-163614-ladsgroup.json
  • 16:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P62283 and previous config saved to /var/cache/conftool/dbconfig/20240511-162106-ladsgroup.json
  • 16:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P62282 and previous config saved to /var/cache/conftool/dbconfig/20240511-160558-ladsgroup.json
  • 15:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P62281 and previous config saved to /var/cache/conftool/dbconfig/20240511-155050-ladsgroup.json
  • 13:20 Dreamy_Jazz: Running `foreachwiki userOptions.php --delete betafeatures-geonotahack --nowarn` - T300371
  • 13:17 Dreamy_Jazz: Running `foreachwiki userOptions.php --delete betafeatures-vector-compact-personal-bar --nowarn` - T300371
  • 13:14 Dreamy_Jazz: Running `foreachwiki userOptions.php --delete betafeatures-vector-typography-update --nowarn` - T300371
  • 13:11 Dreamy_Jazz: Running `foreachwiki userOptions.php --delete betafeatures-popup-disable` - T300371
  • 12:07 Dreamy_Jazz: Running `foreachwiki userOptions.php --delete templatewizard-betafeature` - T300371
  • 09:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P62280 and previous config saved to /var/cache/conftool/dbconfig/20240511-090631-ladsgroup.json
  • 09:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 09:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 01:14 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 01:14 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 01:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T352010)', diff saved to https://phabricator.wikimedia.org/P62279 and previous config saved to /var/cache/conftool/dbconfig/20240511-011416-ladsgroup.json
  • 00:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P62278 and previous config saved to /var/cache/conftool/dbconfig/20240511-005908-ladsgroup.json
  • 00:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P62277 and previous config saved to /var/cache/conftool/dbconfig/20240511-004400-ladsgroup.json
  • 00:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T352010)', diff saved to https://phabricator.wikimedia.org/P62276 and previous config saved to /var/cache/conftool/dbconfig/20240511-002853-ladsgroup.json

2024-05-10

  • 21:19 ryankemper@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons.
  • 21:08 ryankemper@cumin2002: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons.
  • 20:19 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:19 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 18:41 fab@deploy1002: Finished deploy [airflow-dags/research@75163c7]: (no justification provided) (duration: 00m 32s)
  • 18:41 fab@deploy1002: Started deploy [airflow-dags/research@75163c7]: (no justification provided)
  • 18:22 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:22 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for spine to spine links codfw - cmooney@cumin1002"
  • 17:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2127 (T352010)', diff saved to https://phabricator.wikimedia.org/P62275 and previous config saved to /var/cache/conftool/dbconfig/20240510-174044-ladsgroup.json
  • 17:40 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 17:40 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 17:13 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for spine to spine links codfw - cmooney@cumin1002"
  • 17:07 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 17:00 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
  • 16:59 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 19165
  • 16:58 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 19165
  • 16:58 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 26073
  • 16:58 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 26073
  • 16:53 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 15830
  • 16:52 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 15830
  • 16:48 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9269
  • 16:47 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 9269
  • 16:46 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 17451
  • 16:45 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 17451
  • 16:16 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:16 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns records for new codfw row c and d networks - cmooney@cumin1002"
  • 16:14 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns records for new codfw row c and d networks - cmooney@cumin1002"
  • 16:12 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:15 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 21574
  • 14:15 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 21574
  • 14:14 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 38565
  • 14:14 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 38565
  • 14:13 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 23473
  • 14:12 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 23473
  • 14:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 5769
  • 14:12 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 5769
  • 14:09 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 5418
  • 14:09 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 5418
  • 14:08 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 7337
  • 14:08 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 7337
  • 14:07 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 30640
  • 14:06 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 30640
  • 14:06 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
  • 13:59 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 64049
  • 12:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2002.wikimedia.org
  • 12:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2002.wikimedia.org
  • 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1002.wikimedia.org
  • 12:03 Dreamy_Jazz: Restarting MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 12:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1002.wikimedia.org
  • 11:36 moritzm: roll out debdeploy 0.0.99.14
  • 11:35 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 11:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 11:05 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 10:50 elukey: add amd-k8s-device-plugin_1.25.2.8 to bullseye-wikimedia
  • 10:32 moritzm: installing Linux 5.10.216 on Bullseye systems
  • 08:42 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 08:30 godog: restore SRE business hours oncall for EMEA - T350192
  • 07:55 moritzm: installing Linux 6.1.90 on Bookworm systems
  • 06:10 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade for T364481
  • 06:03 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade for T364481
  • 05:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 05:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 05:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T352010)', diff saved to https://phabricator.wikimedia.org/P62272 and previous config saved to /var/cache/conftool/dbconfig/20240510-050102-ladsgroup.json
  • 04:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P62271 and previous config saved to /var/cache/conftool/dbconfig/20240510-044554-ladsgroup.json
  • 04:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P62270 and previous config saved to /var/cache/conftool/dbconfig/20240510-043046-ladsgroup.json
  • 04:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T352010)', diff not saved (unable to send diff to phaste) and previous config saved to /var/cache/conftool/dbconfig/20240510-041534-ladsgroup.json
  • 00:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T352010)', diff saved to https://phabricator.wikimedia.org/P62269 and previous config saved to /var/cache/conftool/dbconfig/20240510-004703-ladsgroup.json
  • 00:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 00:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 00:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 00:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 00:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T352010)', diff saved to https://phabricator.wikimedia.org/P62268 and previous config saved to /var/cache/conftool/dbconfig/20240510-004621-ladsgroup.json
  • 00:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P62267 and previous config saved to /var/cache/conftool/dbconfig/20240510-003113-ladsgroup.json
  • 00:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P62266 and previous config saved to /var/cache/conftool/dbconfig/20240510-001605-ladsgroup.json
  • 00:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T352010)', diff saved to https://phabricator.wikimedia.org/P62265 and previous config saved to /var/cache/conftool/dbconfig/20240510-000058-ladsgroup.json
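
Note: cookbook END entries above record a result (PASS or FAIL) and an exit code, for example the failed sre.network.peering run at 16:53. A minimal sketch for pulling those fields out of an entry, assuming the format seen in this log; `END_RE` and `parse_cookbook_end` are hypothetical names, not part of the cookbook tooling.

```python
import re

# Pattern inferred from the END entries in this log; an assumption.
END_RE = re.compile(
    r"END \((?P<result>PASS|FAIL)\) - Cookbook (?P<cookbook>[\w.-]+) "
    r"\(exit_code=(?P<code>\d+)\)"
)

def parse_cookbook_end(line):
    """Return (cookbook, result, exit_code) for an END entry, or None."""
    match = END_RE.search(line)
    if match is None:
        return None
    return match["cookbook"], match["result"], int(match["code"])

line = (
    "16:53 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.peering "
    "(exit_code=99) with action 'configure' for AS: 15830"
)
failed = parse_cookbook_end(line)
print(failed)
```

START entries carry no result or exit code, so they return None here.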

2024-05-09

  • 23:06 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 23:06 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 23:06 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 22:28 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2006']
  • 22:27 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2006']
  • 21:47 ryankemper: [wdqs] Re-enabled puppet on `wdqs2023`
  • 21:41 ryankemper@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
  • 21:18 ryankemper@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
  • 21:11 jhuneidi@deploy1002: Finished scap: Backport for Skin: Fix UrlUtils calls (T364539) (duration: 23m 42s)
  • 20:58 jhuneidi@deploy1002: jhuneidi and lucaswerkmeister: Continuing with sync
  • 20:49 jhuneidi@deploy1002: jhuneidi and lucaswerkmeister: Backport for Skin: Fix UrlUtils calls (T364539) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:47 jhuneidi@deploy1002: Started scap: Backport for Skin: Fix UrlUtils calls (T364539)
  • 20:18 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.4 refs T361398
  • 19:59 jhuneidi@deploy1002: Finished scap: Backport for Revert "Migrate to IReadableDatabase::newSelectQueryBuilder" (T312418 T364499) (duration: 17m 37s)
  • 19:46 jhuneidi@deploy1002: jhuneidi and zabe: Continuing with sync
  • 19:44 jhuneidi@deploy1002: jhuneidi and zabe: Backport for Revert "Migrate to IReadableDatabase::newSelectQueryBuilder" (T312418 T364499) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:42 jhuneidi@deploy1002: Started scap: Backport for Revert "Migrate to IReadableDatabase::newSelectQueryBuilder" (T312418 T364499)
  • 19:29 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcontrol2001-dev.codfw.wmnet
  • 19:29 andrew@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:29 andrew@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2001-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 19:29 eileen: civicrm upgraded from 6256c944 to c0d2fa95
  • 19:28 andrew@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2001-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
  • 19:26 andrew@cumin1002: START - Cookbook sre.dns.netbox
  • 19:19 andrew@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudcontrol2001-dev.codfw.wmnet
  • 19:09 denisse: Restarting the `pyrra-filesystem-notify-thanos.path` unit and running `reset-failed` on the `thanos-rule-reload.service` unit on titan1001
  • 19:08 denisse: Ran `reset-failed` on the `pyrra-filesystem-notify-thanos.path` and `thanos-rule-reload.service` units on titan1001
  • 17:58 jforrester@deploy1002: Finished scap: Backport for Revert "Action APIs: Set most of our APIs to emit a cache header for 24 hours" (T364567) (duration: 17m 17s)
  • 17:45 jforrester@deploy1002: jforrester: Continuing with sync
  • 17:44 ejegg: SmashPig (standalone IPN listener) upgraded from 67db9d96 to 82392d54
  • 17:43 jforrester@deploy1002: jforrester: Backport for Revert "Action APIs: Set most of our APIs to emit a cache header for 24 hours" (T364567) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:41 jforrester@deploy1002: Started scap: Backport for Revert "Action APIs: Set most of our APIs to emit a cache header for 24 hours" (T364567)
  • 17:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T352010)', diff saved to https://phabricator.wikimedia.org/P62263 and previous config saved to /var/cache/conftool/dbconfig/20240509-173728-ladsgroup.json
  • 17:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 17:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 17:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T352010)', diff saved to https://phabricator.wikimedia.org/P62262 and previous config saved to /var/cache/conftool/dbconfig/20240509-173705-ladsgroup.json
  • 17:34 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet with OS bookworm
  • 17:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P62261 and previous config saved to /var/cache/conftool/dbconfig/20240509-172157-ladsgroup.json
  • 17:16 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol2006-dev.codfw.wmnet with reason: host reimage
  • 17:13 andrew@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol2006-dev.codfw.wmnet with reason: host reimage
  • 17:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P62260 and previous config saved to /var/cache/conftool/dbconfig/20240509-170649-ladsgroup.json
  • 16:56 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2010.codfw.wmnet with OS bullseye
  • 16:55 sukhe: sudo cumin -b30 'A:cp' 'run-puppet-agent --enable "merging CR 1029614"'
  • 16:53 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcontrol2006-dev.codfw.wmnet with OS bookworm
  • 16:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T352010)', diff saved to https://phabricator.wikimedia.org/P62259 and previous config saved to /var/cache/conftool/dbconfig/20240509-165141-ladsgroup.json
  • 16:49 sukhe: sudo cumin 'A:cp' 'disable-puppet "merging CR 1029614"'
  • 16:47 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2008.codfw.wmnet with OS bullseye
  • 16:35 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudcontrol2006-dev.private.codfw.wikimedia.cloud on all recursors
  • 16:35 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache cloudcontrol2006-dev.private.codfw.wikimedia.cloud on all recursors
  • 16:34 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 16:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2006.codfw.wmnet with OS bullseye
  • 16:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2007.codfw.wmnet with OS bullseye
  • 16:32 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:32 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add entries for new codfw cloudcontrol nodes - cmooney@cumin1002"
  • 16:31 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add entries for new codfw cloudcontrol nodes - cmooney@cumin1002"
  • 16:29 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 16:20 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 15:36 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye
  • 15:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2010']
  • 15:35 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2010']
  • 15:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-main2010']
  • 15:35 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2010']
  • 15:34 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:31 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1013.eqiad.wmnet with reason: Bootstrapping — T364422
  • 15:31 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on aqs1013.eqiad.wmnet with reason: Bootstrapping — T364422
  • 15:29 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aqs1013.eqiad.wmnet with OS bullseye
  • 15:27 eevans@deploy1002: Finished deploy [cassandra/logstash-logback-encoder@42653e6] (aqs): (no justification provided) (duration: 00m 33s)
  • 15:27 eevans@deploy1002: Started deploy [cassandra/logstash-logback-encoder@42653e6] (aqs): (no justification provided)
  • 15:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T364299)', diff saved to https://phabricator.wikimedia.org/P62258 and previous config saved to /var/cache/conftool/dbconfig/20240509-152501-marostegui.json
  • 15:23 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:22 dancy@deploy1002: Installation of scap version "4.83.0" completed for 307 hosts
  • 15:22 dancy@deploy1002: Installing scap version "4.83.0" for 307 hosts
  • 15:21 dancy@deploy1002: Installing scap version "4.83.0" for 308 hosts
  • 15:20 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:20 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding kafka-main2010 to codfw - jhancock@cumin2002"
  • 15:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding kafka-main2010 to codfw - jhancock@cumin2002"
  • 15:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 15:15 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1010.eqiad.wmnet
  • 15:15 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:15 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding kafka-main2010 to codfw - jhancock@cumin2002"
  • 15:14 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding kafka-main2010 to codfw - jhancock@cumin2002"
  • 15:14 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 15:14 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS bullseye
  • 15:14 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS bullseye
  • 15:14 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye
  • 15:11 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 15:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2008']
  • 15:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2009']
  • 15:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P62257 and previous config saved to /var/cache/conftool/dbconfig/20240509-150953-marostegui.json
  • 15:09 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2008']
  • 15:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2007']
  • 15:08 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2009']
  • 15:08 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host snapshot1010.eqiad.wmnet
  • 15:08 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2007']
  • 15:08 ladsgroup@deploy1002: Finished scap: Backport for Disable namespaceDupes again (T364546), Disable namespaceDupes again (T364546) (duration: 16m 02s)
  • 15:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2007.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:01 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:00 sukhe: sudo cumin 'A:cp' 'run-puppet-agent --enable "merging CR 1029570"'
  • 14:59 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:57 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2009.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:57 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1013.eqiad.wmnet with reason: host reimage
  • 14:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:56 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2007.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:55 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2006']
  • 14:55 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2006']
  • 14:55 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 14:55 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2007.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:54 ladsgroup@deploy1002: ladsgroup: Backport for Disable namespaceDupes again (T364546), Disable namespaceDupes again (T364546) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P62256 and previous config saved to /var/cache/conftool/dbconfig/20240509-145445-marostegui.json
  • 14:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:54 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:54 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1013.eqiad.wmnet with reason: host reimage
  • 14:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:52 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:52 ladsgroup@deploy1002: Started scap: Backport for Disable namespaceDupes again (T364546), Disable namespaceDupes again (T364546)
  • 14:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2009.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:44 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:43 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:43 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:41 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T364299)', diff saved to https://phabricator.wikimedia.org/P62255 and previous config saved to /var/cache/conftool/dbconfig/20240509-143938-marostegui.json
  • 14:34 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2009.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:34 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2009.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2009.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2007.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2009.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:32 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host kafka-main2010.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:29 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:29 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding kafka-main2006 to codfw - jhancock@cumin2002"
  • 14:28 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding kafka-main2006 to codfw - jhancock@cumin2002"
  • 14:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:18 eevans@cumin1002: START - Cookbook sre.hosts.reimage for host aqs1013.eqiad.wmnet with OS bullseye
  • 14:16 eevans@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host aqs1013.eqiad.wmnet with OS bullseye
  • 14:15 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62254 and previous config saved to /var/cache/conftool/dbconfig/20240509-141526-root.json
  • 14:09 denisse: Restarting envoyproxy on titan* hosts as part of the CFSSL migration - T360414
  • 14:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2211 (T364299)', diff saved to https://phabricator.wikimedia.org/P62253 and previous config saved to /var/cache/conftool/dbconfig/20240509-140858-marostegui.json
  • 14:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 14:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 14:06 TheresNoTime: ftr, ran `[samtar@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki quwiki --fix` for T355129; cancelled before completion due to the outage
  • 14:00 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62252 and previous config saved to /var/cache/conftool/dbconfig/20240509-140020-root.json
  • 13:57 eevans@cumin1002: START - Cookbook sre.hosts.reimage for host aqs1013.eqiad.wmnet with OS bullseye
  • 13:47 samtar@deploy1002: Finished scap: Backport for quwiki: Set MetaNamespaceName to Wikipidiya (T355129) (duration: 19m 41s)
  • 13:45 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62250 and previous config saved to /var/cache/conftool/dbconfig/20240509-134514-root.json
  • 13:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 13:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 13:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T364299)', diff saved to https://phabricator.wikimedia.org/P62249 and previous config saved to /var/cache/conftool/dbconfig/20240509-134412-marostegui.json
  • 13:34 samtar@deploy1002: dreamrimmer and samtar: Continuing with sync
  • 13:30 samtar@deploy1002: dreamrimmer and samtar: Backport for quwiki: Set MetaNamespaceName to Wikipidiya (T355129) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62248 and previous config saved to /var/cache/conftool/dbconfig/20240509-133009-root.json
  • 13:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P62247 and previous config saved to /var/cache/conftool/dbconfig/20240509-132905-marostegui.json
  • 13:27 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2005-dev.codfw.wmnet with OS bookworm
  • 13:27 samtar@deploy1002: Started scap: Backport for quwiki: Set MetaNamespaceName to Wikipidiya (T355129)
  • 13:23 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2014.codfw.wmnet
  • 13:19 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2014.codfw.wmnet
  • 13:17 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2013.codfw.wmnet
  • 13:15 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62246 and previous config saved to /var/cache/conftool/dbconfig/20240509-131501-root.json
  • 13:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P62245 and previous config saved to /var/cache/conftool/dbconfig/20240509-131355-marostegui.json
  • 13:13 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2013.codfw.wmnet
  • 13:12 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2012.codfw.wmnet
  • 13:08 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2012.codfw.wmnet
  • 13:07 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2011.codfw.wmnet
  • 13:06 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2005-dev.codfw.wmnet with reason: host reimage
  • 13:04 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2011.codfw.wmnet
  • 13:03 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2010.codfw.wmnet
  • 13:03 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2005-dev.codfw.wmnet with reason: host reimage
  • 12:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62244 and previous config saved to /var/cache/conftool/dbconfig/20240509-125955-root.json
  • 12:59 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2010.codfw.wmnet
  • 12:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T364299)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-125843-marostegui.json
  • 12:58 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2009.codfw.wmnet
  • 12:52 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2009.codfw.wmnet
  • 12:50 elukey: depool/upgrade/repool ms-fe20[09-14] to upgrade envoy to TLS PKI certs
  • 12:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS bookworm
  • 12:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
  • 12:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
  • 12:21 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 12:20 ladsgroup@deploy1002: ladsgroup: Backport for Return array from LocalAuth::getCentralLists (T364538) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:18 ladsgroup@deploy1002: Started scap: Backport for Return array from LocalAuth::getCentralLists (T364538)
  • 12:16 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2008-dev.codfw.wmnet with reason: host reimage
  • 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P62240 and previous config saved to /var/cache/conftool/dbconfig/20240509-121433-marostegui.json
  • 12:12 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2007-dev.codfw.wmnet with reason: host reimage
  • 12:11 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS bookworm
  • 12:10 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2008-dev.codfw.wmnet with reason: host reimage
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1192', diff saved to https://phabricator.wikimedia.org/P62239 and previous config saved to /var/cache/conftool/dbconfig/20240509-120955-root.json
  • 12:09 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2007-dev.codfw.wmnet with reason: host reimage
  • 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P62237 and previous config saved to /var/cache/conftool/dbconfig/20240509-115925-marostegui.json
  • 11:51 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudnet2008-dev.codfw.wmnet with OS bookworm
  • 11:50 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudnet2007-dev.codfw.wmnet with OS bookworm
  • 11:50 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
  • 11:50 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
  • 11:49 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
  • 11:49 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
  • 11:48 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 11:47 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 11:46 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 11:45 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 11:45 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 11:45 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T364299)', diff saved to https://phabricator.wikimedia.org/P62236 and previous config saved to /var/cache/conftool/dbconfig/20240509-114417-marostegui.json
  • 11:44 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 11:43 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62235 and previous config saved to /var/cache/conftool/dbconfig/20240509-113443-root.json
  • 11:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62234 and previous config saved to /var/cache/conftool/dbconfig/20240509-111936-root.json
  • 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T364299)', diff saved to https://phabricator.wikimedia.org/P62233 and previous config saved to /var/cache/conftool/dbconfig/20240509-111100-marostegui.json
  • 11:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 11:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 11:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T364299)', diff saved to https://phabricator.wikimedia.org/P62232 and previous config saved to /var/cache/conftool/dbconfig/20240509-111037-marostegui.json
  • 11:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62231 and previous config saved to /var/cache/conftool/dbconfig/20240509-110430-root.json
  • 10:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P62230 and previous config saved to /var/cache/conftool/dbconfig/20240509-105527-marostegui.json
  • 10:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62229 and previous config saved to /var/cache/conftool/dbconfig/20240509-104922-root.json
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P62228 and previous config saved to /var/cache/conftool/dbconfig/20240509-104019-marostegui.json
  • 10:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62227 and previous config saved to /var/cache/conftool/dbconfig/20240509-103417-root.json
  • 10:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T364299)', diff saved to https://phabricator.wikimedia.org/P62226 and previous config saved to /var/cache/conftool/dbconfig/20240509-102512-marostegui.json
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62225 and previous config saved to /var/cache/conftool/dbconfig/20240509-101911-root.json
  • 10:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1167.eqiad.wmnet with OS bookworm
  • 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62224 and previous config saved to /var/cache/conftool/dbconfig/20240509-100405-root.json
  • 10:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T352010)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-100006-ladsgroup.json
  • 10:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 09:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 09:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T352010)', diff saved to https://phabricator.wikimedia.org/P62222 and previous config saved to /var/cache/conftool/dbconfig/20240509-095943-ladsgroup.json
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171 (T364299)', diff saved to https://phabricator.wikimedia.org/P62221 and previous config saved to /var/cache/conftool/dbconfig/20240509-095313-marostegui.json
  • 09:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 09:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T364299)', diff saved to https://phabricator.wikimedia.org/P62220 and previous config saved to /var/cache/conftool/dbconfig/20240509-095249-marostegui.json
  • 09:52 jforrester@deploy1002: Finished scap: Backport for Disable ParserMigration on commonswiki (T364228) (duration: 16m 17s)
  • 09:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
  • 09:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-094431-ladsgroup.json
  • 09:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
  • 09:39 jforrester@deploy1002: lucaswerkmeister-wmde and jforrester: Continuing with sync
  • 09:38 jforrester@deploy1002: lucaswerkmeister-wmde and jforrester: Backport for Disable ParserMigration on commonswiki (T364228) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P62219 and previous config saved to /var/cache/conftool/dbconfig/20240509-093742-marostegui.json
  • 09:36 jforrester@deploy1002: Started scap: Backport for Disable ParserMigration on commonswiki (T364228)
  • 09:31 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: upgrade to 10.6
  • 09:31 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1171.eqiad.wmnet with reason: upgrade to 10.6
  • 09:31 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: upgrade to 10.6
  • 09:31 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1150.eqiad.wmnet with reason: upgrade to 10.6
  • 09:29 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1167.eqiad.wmnet with OS bookworm
  • 09:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P62218 and previous config saved to /var/cache/conftool/dbconfig/20240509-092921-ladsgroup.json
  • 09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1167', diff saved to https://phabricator.wikimedia.org/P62217 and previous config saved to /var/cache/conftool/dbconfig/20240509-092757-root.json
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P62216 and previous config saved to /var/cache/conftool/dbconfig/20240509-092234-marostegui.json
  • 09:14 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62215 and previous config saved to /var/cache/conftool/dbconfig/20240509-091445-root.json
  • 09:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T352010)', diff saved to https://phabricator.wikimedia.org/P62214 and previous config saved to /var/cache/conftool/dbconfig/20240509-091413-ladsgroup.json
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T364299)', diff saved to https://phabricator.wikimedia.org/P62213 and previous config saved to /var/cache/conftool/dbconfig/20240509-090726-marostegui.json
  • 09:04 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 08:59 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62212 and previous config saved to /var/cache/conftool/dbconfig/20240509-085939-root.json
  • 08:54 btullis@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 08:53 jynus: deploy new grants for es6, es7 backups T363812
  • 08:44 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62211 and previous config saved to /var/cache/conftool/dbconfig/20240509-084433-root.json
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T364299)', diff saved to https://phabricator.wikimedia.org/P62210 and previous config saved to /var/cache/conftool/dbconfig/20240509-083705-marostegui.json
  • 08:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 08:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T364299)', diff saved to https://phabricator.wikimedia.org/P62209 and previous config saved to /var/cache/conftool/dbconfig/20240509-083643-marostegui.json
  • 08:30 godog: set batphone oncall for May 9th only for EMEA, not Americas - T350192
  • 08:29 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62208 and previous config saved to /var/cache/conftool/dbconfig/20240509-082927-root.json
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P62207 and previous config saved to /var/cache/conftool/dbconfig/20240509-082135-marostegui.json
  • 08:14 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62206 and previous config saved to /var/cache/conftool/dbconfig/20240509-081422-root.json
  • 08:13 godog: set batphone oncall for May 9th - T350192
  • 08:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62205 and previous config saved to /var/cache/conftool/dbconfig/20240509-080936-root.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P62204 and previous config saved to /var/cache/conftool/dbconfig/20240509-080627-marostegui.json
  • 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62203 and previous config saved to /var/cache/conftool/dbconfig/20240509-080549-root.json
  • 08:02 zabe@deploy1002: Finished scap: Backport for Fix error when marking a new page for translations (T364522) (duration: 19m 28s)
  • 07:59 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62202 and previous config saved to /var/cache/conftool/dbconfig/20240509-075914-root.json
  • 07:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62201 and previous config saved to /var/cache/conftool/dbconfig/20240509-075429-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T364299)', diff saved to https://phabricator.wikimedia.org/P62200 and previous config saved to /var/cache/conftool/dbconfig/20240509-075118-marostegui.json
  • 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62199 and previous config saved to /var/cache/conftool/dbconfig/20240509-075043-root.json
  • 07:50 zabe@deploy1002: zabe and abi: Continuing with sync
  • 07:45 zabe@deploy1002: zabe and abi: Backport for Fix error when marking a new page for translations (T364522) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:44 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62198 and previous config saved to /var/cache/conftool/dbconfig/20240509-074408-root.json
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Fully repool db1172', diff saved to https://phabricator.wikimedia.org/P62197 and previous config saved to /var/cache/conftool/dbconfig/20240509-074355-marostegui.json
  • 07:43 zabe@deploy1002: Started scap: Backport for Fix error when marking a new page for translations (T364522)
  • 07:42 zabe@deploy1002: Finished scap: Backport for Move wgGroupsAddToSelf and wgGroupsRemoveFromSelf to core-Permissions (duration: 17m 37s)
  • 07:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62196 and previous config saved to /var/cache/conftool/dbconfig/20240509-073922-root.json
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62195 and previous config saved to /var/cache/conftool/dbconfig/20240509-073537-root.json
  • 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62194 and previous config saved to /var/cache/conftool/dbconfig/20240509-073311-root.json
  • 07:29 zabe@deploy1002: zabe: Continuing with sync
  • 07:28 zabe@deploy1002: zabe: Backport for Move wgGroupsAddToSelf and wgGroupsRemoveFromSelf to core-Permissions synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:24 zabe@deploy1002: Started scap: Backport for Move wgGroupsAddToSelf and wgGroupsRemoveFromSelf to core-Permissions
  • 07:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 25%: Repooling', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-072411-root.json
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62192 and previous config saved to /var/cache/conftool/dbconfig/20240509-072032-root.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62191 and previous config saved to /var/cache/conftool/dbconfig/20240509-071805-root.json
  • 07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T364299)', diff saved to https://phabricator.wikimedia.org/P62190 and previous config saved to /var/cache/conftool/dbconfig/20240509-071527-marostegui.json
  • 07:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 07:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 07:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 07:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 07:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T364299)', diff saved to https://phabricator.wikimedia.org/P62189 and previous config saved to /var/cache/conftool/dbconfig/20240509-071449-marostegui.json
  • 07:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2020.codfw.wmnet with OS bookworm
  • 07:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62188 and previous config saved to /var/cache/conftool/dbconfig/20240509-070905-root.json
  • 07:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62187 and previous config saved to /var/cache/conftool/dbconfig/20240509-070526-root.json
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62186 and previous config saved to /var/cache/conftool/dbconfig/20240509-070300-root.json
  • 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P62185 and previous config saved to /var/cache/conftool/dbconfig/20240509-065941-marostegui.json
  • 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 5%: Repooling', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-065355-root.json
  • 06:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2020.codfw.wmnet with reason: host reimage
  • 06:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62183 and previous config saved to /var/cache/conftool/dbconfig/20240509-065020-root.json
  • 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1172 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62182 and previous config saved to /var/cache/conftool/dbconfig/20240509-064754-root.json
  • 06:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2020.codfw.wmnet with reason: host reimage
  • 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P62181 and previous config saved to /var/cache/conftool/dbconfig/20240509-064434-marostegui.json
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62180 and previous config saved to /var/cache/conftool/dbconfig/20240509-063845-root.json
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62179 and previous config saved to /var/cache/conftool/dbconfig/20240509-063832-root.json
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62178 and previous config saved to /var/cache/conftool/dbconfig/20240509-063514-root.json
  • 06:35 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1180.eqiad.wmnet onto db1231.eqiad.wmnet
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1172 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62177 and previous config saved to /var/cache/conftool/dbconfig/20240509-063248-root.json
  • 06:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T364299)', diff saved to https://phabricator.wikimedia.org/P62176 and previous config saved to /var/cache/conftool/dbconfig/20240509-062926-marostegui.json
  • 06:24 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2020.codfw.wmnet with OS bookworm
  • 06:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62175 and previous config saved to /var/cache/conftool/dbconfig/20240509-062327-root.json
  • 06:20 marostegui@cumin1002: dbctl commit (dc=all): 'Give some weight to es4 codfw master', diff saved to https://phabricator.wikimedia.org/P62174 and previous config saved to /var/cache/conftool/dbconfig/20240509-062027-marostegui.json
  • 06:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2020 T364451', diff saved to https://phabricator.wikimedia.org/P62173 and previous config saved to /var/cache/conftool/dbconfig/20240509-061957-root.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2021 to es4 primary and set section read-write T364451', diff saved to https://phabricator.wikimedia.org/P62172 and previous config saved to /var/cache/conftool/dbconfig/20240509-061904-marostegui.json
  • 06:18 marostegui: Starting es4 codfw failover from es2020 to es2021 - T364451
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1172 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62171 and previous config saved to /var/cache/conftool/dbconfig/20240509-061742-root.json
  • 06:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T364451
  • 06:15 marostegui@cumin1002: dbctl commit (dc=all): 'Set es2021 with weight 0 T364451', diff saved to https://phabricator.wikimedia.org/P62170 and previous config saved to /var/cache/conftool/dbconfig/20240509-061500-root.json
  • 06:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T364451
  • 06:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS bookworm
  • 06:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62169 and previous config saved to /var/cache/conftool/dbconfig/20240509-060821-root.json
  • 05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2123 (T364299)', diff saved to https://phabricator.wikimedia.org/P62168 and previous config saved to /var/cache/conftool/dbconfig/20240509-055429-marostegui.json
  • 05:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 05:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62167 and previous config saved to /var/cache/conftool/dbconfig/20240509-055314-root.json
  • 05:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
  • 05:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
  • 05:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 05:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 05:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 10%: Repooling', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-053804-root.json
  • 05:37 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS bookworm
  • 05:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1172 T363792', diff saved to https://phabricator.wikimedia.org/P62166 and previous config saved to /var/cache/conftool/dbconfig/20240509-053442-marostegui.json
  • 05:32 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db1180.eqiad.wmnet onto db1231.eqiad.wmnet
  • 05:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1231', diff saved to https://phabricator.wikimedia.org/P62165 and previous config saved to /var/cache/conftool/dbconfig/20240509-052912-root.json
  • 05:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 05:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 05:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62164 and previous config saved to /var/cache/conftool/dbconfig/20240509-052258-root.json
  • 05:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 05:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 05:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62163 and previous config saved to /var/cache/conftool/dbconfig/20240509-050752-root.json
  • 05:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 05:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 04:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 04:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 04:52 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1231 with weight 0 T364067', diff saved to https://phabricator.wikimedia.org/P62162 and previous config saved to /var/cache/conftool/dbconfig/20240509-045216-marostegui.json
  • 04:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s6 T364067
  • 04:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s6 T364067
  • 04:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1235 (T361627)', diff saved to https://phabricator.wikimedia.org/P62161 and previous config saved to /var/cache/conftool/dbconfig/20240509-043908-marostegui.json
  • 04:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 04:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 04:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T361627)', diff saved to https://phabricator.wikimedia.org/P62160 and previous config saved to /var/cache/conftool/dbconfig/20240509-043845-marostegui.json
  • 04:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P62159 and previous config saved to /var/cache/conftool/dbconfig/20240509-042337-marostegui.json
  • 04:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P62158 and previous config saved to /var/cache/conftool/dbconfig/20240509-040830-marostegui.json
  • 03:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T361627)', diff saved to https://phabricator.wikimedia.org/P62157 and previous config saved to /var/cache/conftool/dbconfig/20240509-035320-marostegui.json
  • 03:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T361627)', diff saved to https://phabricator.wikimedia.org/P62156 and previous config saved to /var/cache/conftool/dbconfig/20240509-034128-marostegui.json
  • 03:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 03:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 03:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T361627)', diff saved to https://phabricator.wikimedia.org/P62155 and previous config saved to /var/cache/conftool/dbconfig/20240509-034105-marostegui.json
  • 03:32 eileen: civicrm upgraded from 3c8a3095 to 6256c944
  • 03:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-032552-marostegui.json
  • 03:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P62153 and previous config saved to /var/cache/conftool/dbconfig/20240509-031045-marostegui.json
  • 02:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T361627)', diff saved to https://phabricator.wikimedia.org/P62152 and previous config saved to /var/cache/conftool/dbconfig/20240509-025537-marostegui.json
  • 02:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T352010)', diff saved to https://phabricator.wikimedia.org/P62151 and previous config saved to /var/cache/conftool/dbconfig/20240509-024531-ladsgroup.json
  • 02:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 02:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 02:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T352010)', diff saved to https://phabricator.wikimedia.org/P62150 and previous config saved to /var/cache/conftool/dbconfig/20240509-024508-ladsgroup.json
  • 02:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T361627)', diff saved to https://phabricator.wikimedia.org/P62149 and previous config saved to /var/cache/conftool/dbconfig/20240509-024455-marostegui.json
  • 02:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 02:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 02:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T361627)', diff saved to https://phabricator.wikimedia.org/P62148 and previous config saved to /var/cache/conftool/dbconfig/20240509-024432-marostegui.json
  • 02:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P62147 and previous config saved to /var/cache/conftool/dbconfig/20240509-023000-ladsgroup.json
  • 02:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P62146 and previous config saved to /var/cache/conftool/dbconfig/20240509-022925-marostegui.json
  • 02:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P62145 and previous config saved to /var/cache/conftool/dbconfig/20240509-021452-ladsgroup.json
  • 02:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P62144 and previous config saved to /var/cache/conftool/dbconfig/20240509-021417-marostegui.json
  • 01:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T352010)', diff saved to https://phabricator.wikimedia.org/P62143 and previous config saved to /var/cache/conftool/dbconfig/20240509-015942-ladsgroup.json
  • 01:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T361627)', diff saved to https://phabricator.wikimedia.org/P62142 and previous config saved to /var/cache/conftool/dbconfig/20240509-015909-marostegui.json
  • 01:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1228 (T361627)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240509-014836-marostegui.json
  • 01:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 01:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 01:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T361627)', diff saved to https://phabricator.wikimedia.org/P62140 and previous config saved to /var/cache/conftool/dbconfig/20240509-014814-marostegui.json
  • 01:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P62139 and previous config saved to /var/cache/conftool/dbconfig/20240509-013305-marostegui.json
  • 01:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P62138 and previous config saved to /var/cache/conftool/dbconfig/20240509-011758-marostegui.json
  • 01:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T361627)', diff saved to https://phabricator.wikimedia.org/P62137 and previous config saved to /var/cache/conftool/dbconfig/20240509-010250-marostegui.json
  • 00:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T361627)', diff saved to https://phabricator.wikimedia.org/P62136 and previous config saved to /var/cache/conftool/dbconfig/20240509-005146-marostegui.json
  • 00:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 00:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 00:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T361627)', diff saved to https://phabricator.wikimedia.org/P62135 and previous config saved to /var/cache/conftool/dbconfig/20240509-005122-marostegui.json
  • 00:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P62134 and previous config saved to /var/cache/conftool/dbconfig/20240509-003614-marostegui.json
  • 00:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P62133 and previous config saved to /var/cache/conftool/dbconfig/20240509-002105-marostegui.json
  • 00:14 eileen: civicrm upgraded from bf49ecdc to 3c8a3095
  • 00:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T361627)', diff saved to https://phabricator.wikimedia.org/P62132 and previous config saved to /var/cache/conftool/dbconfig/20240509-000554-marostegui.json

2024-05-08

  • 23:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T361627)', diff saved to https://phabricator.wikimedia.org/P62131 and previous config saved to /var/cache/conftool/dbconfig/20240508-235350-marostegui.json
  • 23:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 23:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 23:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T361627)', diff saved to https://phabricator.wikimedia.org/P62130 and previous config saved to /var/cache/conftool/dbconfig/20240508-235327-marostegui.json
  • 23:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P62129 and previous config saved to /var/cache/conftool/dbconfig/20240508-233820-marostegui.json
  • 23:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240508-232308-marostegui.json
  • 23:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T361627)', diff saved to https://phabricator.wikimedia.org/P62127 and previous config saved to /var/cache/conftool/dbconfig/20240508-230800-marostegui.json
  • 22:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T361627)', diff saved to https://phabricator.wikimedia.org/P62126 and previous config saved to /var/cache/conftool/dbconfig/20240508-225652-marostegui.json
  • 22:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 22:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 22:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T361627)', diff saved to https://phabricator.wikimedia.org/P62125 and previous config saved to /var/cache/conftool/dbconfig/20240508-225628-marostegui.json
  • 22:53 mutante: contint1003 - systemctl start wmf_auto_restart_envoyproxy T364510 T358237
  • 22:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P62124 and previous config saved to /var/cache/conftool/dbconfig/20240508-224120-marostegui.json
  • 22:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P62123 and previous config saved to /var/cache/conftool/dbconfig/20240508-222613-marostegui.json
  • 22:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T361627)', diff saved to https://phabricator.wikimedia.org/P62122 and previous config saved to /var/cache/conftool/dbconfig/20240508-221105-marostegui.json
  • 21:22 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T361627)', diff saved to https://phabricator.wikimedia.org/P62121 and previous config saved to /var/cache/conftool/dbconfig/20240508-212242-marostegui.json
  • 21:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 21:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 21:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T361627)', diff saved to https://phabricator.wikimedia.org/P62119 and previous config saved to /var/cache/conftool/dbconfig/20240508-212219-marostegui.json
  • 21:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P62118 and previous config saved to /var/cache/conftool/dbconfig/20240508-210711-marostegui.json
  • 20:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P62117 and previous config saved to /var/cache/conftool/dbconfig/20240508-205203-marostegui.json
  • 20:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T361627)', diff saved to https://phabricator.wikimedia.org/P62116 and previous config saved to /var/cache/conftool/dbconfig/20240508-203655-marostegui.json
  • 20:25 ebernhardson@deploy1002: Finished scap: Backport for cirrus: Shift remaining public wikis in codfw to replacement updater (T363475) (duration: 16m 00s)
  • 20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T361627)', diff saved to https://phabricator.wikimedia.org/P62115 and previous config saved to /var/cache/conftool/dbconfig/20240508-202516-marostegui.json
  • 20:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 20:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 20:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 20:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 20:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T361627)', diff saved to https://phabricator.wikimedia.org/P62114 and previous config saved to /var/cache/conftool/dbconfig/20240508-202446-marostegui.json
  • 20:12 ebernhardson@deploy1002: ebernhardson: Continuing with sync
  • 20:12 ebernhardson@deploy1002: ebernhardson: Backport for cirrus: Shift remaining public wikis in codfw to replacement updater (T363475) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:09 ebernhardson@deploy1002: Started scap: Backport for cirrus: Shift remaining public wikis in codfw to replacement updater (T363475)
  • 20:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P62113 and previous config saved to /var/cache/conftool/dbconfig/20240508-200935-marostegui.json
  • 19:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P62112 and previous config saved to /var/cache/conftool/dbconfig/20240508-195428-marostegui.json
  • 19:51 taavi@deploy1002: Finished scap: Backport for cawiki: Restore normal logo (T363057) (duration: 15m 29s)
  • 19:49 ebernhardson@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:48 ebernhardson@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T361627)', diff saved to https://phabricator.wikimedia.org/P62111 and previous config saved to /var/cache/conftool/dbconfig/20240508-193920-marostegui.json
  • 19:38 taavi@deploy1002: taavi: Continuing with sync
  • 19:38 taavi@deploy1002: taavi: Backport for cawiki: Restore normal logo (T363057) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T352010)', diff saved to https://phabricator.wikimedia.org/P62110 and previous config saved to /var/cache/conftool/dbconfig/20240508-193624-ladsgroup.json
  • 19:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 19:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 19:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T352010)', diff saved to https://phabricator.wikimedia.org/P62109 and previous config saved to /var/cache/conftool/dbconfig/20240508-193601-ladsgroup.json
  • 19:36 taavi@deploy1002: Started scap: Backport for cawiki: Restore normal logo (T363057)
  • 19:33 ladsgroup@deploy1002: Finished scap: Backport for FlaggedRevsStats: Fix migration to query builder (duration: 16m 39s)
  • 19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T361627)', diff saved to https://phabricator.wikimedia.org/P62108 and previous config saved to /var/cache/conftool/dbconfig/20240508-192743-marostegui.json
  • 19:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 19:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T361627)', diff saved to https://phabricator.wikimedia.org/P62107 and previous config saved to /var/cache/conftool/dbconfig/20240508-192715-marostegui.json
  • 19:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P62106 and previous config saved to /var/cache/conftool/dbconfig/20240508-192054-ladsgroup.json
  • 19:20 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 19:20 ladsgroup@deploy1002: ladsgroup: Backport for FlaggedRevsStats: Fix migration to query builder synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:16 ladsgroup@deploy1002: Started scap: Backport for FlaggedRevsStats: Fix migration to query builder
  • 19:16 ladsgroup@deploy1002: Finished scap: Backport for Revert "logos: Add fawiki logo for 1,000,000 article" (duration: 16m 18s)
  • 19:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P62105 and previous config saved to /var/cache/conftool/dbconfig/20240508-191207-marostegui.json
  • 19:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P62104 and previous config saved to /var/cache/conftool/dbconfig/20240508-190546-ladsgroup.json
  • 19:03 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 19:02 ladsgroup@deploy1002: ladsgroup: Backport for Revert "logos: Add fawiki logo for 1,000,000 article" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:59 ladsgroup@deploy1002: Started scap: Backport for Revert "logos: Add fawiki logo for 1,000,000 article"
  • 18:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P62103 and previous config saved to /var/cache/conftool/dbconfig/20240508-185700-marostegui.json
  • 18:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T352010)', diff saved to https://phabricator.wikimedia.org/P62102 and previous config saved to /var/cache/conftool/dbconfig/20240508-185038-ladsgroup.json
  • 18:49 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.43.0-wmf.4 refs T361398
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T361627)', diff saved to https://phabricator.wikimedia.org/P62101 and previous config saved to /var/cache/conftool/dbconfig/20240508-184152-marostegui.json
  • 18:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T361627)', diff saved to https://phabricator.wikimedia.org/P62100 and previous config saved to /var/cache/conftool/dbconfig/20240508-183014-marostegui.json
  • 18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 18:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T361627)', diff saved to https://phabricator.wikimedia.org/P62099 and previous config saved to /var/cache/conftool/dbconfig/20240508-182951-marostegui.json
  • 18:24 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.4 refs T361398
  • 18:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P62098 and previous config saved to /var/cache/conftool/dbconfig/20240508-181443-marostegui.json
  • 17:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P62097 and previous config saved to /var/cache/conftool/dbconfig/20240508-175936-marostegui.json
  • 17:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T361627)', diff saved to https://phabricator.wikimedia.org/P62096 and previous config saved to /var/cache/conftool/dbconfig/20240508-174428-marostegui.json
  • 17:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T361627)', diff saved to https://phabricator.wikimedia.org/P62095 and previous config saved to /var/cache/conftool/dbconfig/20240508-173353-marostegui.json
  • 17:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 17:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 17:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 17:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 17:04 sfaci@deploy1002: Finished deploy [airflow-dags/analytics@1f72038]: (no justification provided) (duration: 00m 29s)
  • 17:03 sfaci@deploy1002: Started deploy [airflow-dags/analytics@1f72038]: (no justification provided)
  • 16:45 sfaci@deploy1002: Finished deploy [analytics/refinery@1c45ef4] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@1c45ef4d] (duration: 02m 52s)
  • 16:45 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:45 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 16:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 16:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T361627)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240508-164322-marostegui.json
  • 16:43 sfaci@deploy1002: Started deploy [analytics/refinery@1c45ef4] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@1c45ef4d]
  • 16:42 sfaci@deploy1002: Finished deploy [analytics/refinery@1c45ef4] (thin): Regular analytics weekly train THIN [analytics/refinery@1c45ef4d] (duration: 03m 53s)
  • 16:38 sfaci@deploy1002: Started deploy [analytics/refinery@1c45ef4] (thin): Regular analytics weekly train THIN [analytics/refinery@1c45ef4d]
  • 16:38 sfaci@deploy1002: Finished deploy [analytics/refinery@1c45ef4]: Regular analytics weekly train [analytics/refinery@1c45ef4d] (duration: 16m 37s)
  • 16:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P62094 and previous config saved to /var/cache/conftool/dbconfig/20240508-162812-marostegui.json
  • 16:25 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 16:24 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 16:21 sfaci@deploy1002: Started deploy [analytics/refinery@1c45ef4]: Regular analytics weekly train [analytics/refinery@1c45ef4d]
  • 16:21 sfaci: Deploying refinery
  • 16:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P62093 and previous config saved to /var/cache/conftool/dbconfig/20240508-161305-marostegui.json
  • 16:06 klausman@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 16:03 jelto@cumin1002: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 15:58 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 15:58 klausman@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 15:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T361627)', diff saved to https://phabricator.wikimedia.org/P62092 and previous config saved to /var/cache/conftool/dbconfig/20240508-155757-marostegui.json
  • 15:56 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 15:50 vgutierrez: tested fifo-log-demux 0.7.3 on cp4052, downgraded to 0.6.5
  • 15:38 moritzm: imported tomcat9 9.0.43-2~deb11u10+wmf12u1 to component/tomcat9 for bookworm-wikimedia (rebasing our forward port to the latest security update)
  • 15:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T361627)', diff saved to https://phabricator.wikimedia.org/P62091 and previous config saved to /var/cache/conftool/dbconfig/20240508-153738-marostegui.json
  • 15:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 15:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 15:35 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 15:35 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 15:28 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 15:28 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 15:22 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 15:21 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 15:21 jelto: bump apt package gitlab-ce to 16.9.7-ce.0
  • 15:17 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 15:16 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 15:09 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 15:08 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 15:08 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 15:06 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 15:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 15:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 15:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T361627)', diff saved to https://phabricator.wikimedia.org/P62090 and previous config saved to /var/cache/conftool/dbconfig/20240508-150611-marostegui.json
  • 15:05 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 15:05 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 14:58 klausman@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:58 klausman@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P62089 and previous config saved to /var/cache/conftool/dbconfig/20240508-145100-marostegui.json
  • 14:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1157 (T352010)', diff saved to https://phabricator.wikimedia.org/P62088 and previous config saved to /var/cache/conftool/dbconfig/20240508-144501-ladsgroup.json
  • 14:44 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 14:44 moritzm: installing Java 8 security updates
  • 14:44 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 14:38 moritzm: installing Java 11 security updates
  • 14:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P62087 and previous config saved to /var/cache/conftool/dbconfig/20240508-143552-marostegui.json
  • 14:23 jiji@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:22 jiji@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T361627)', diff saved to https://phabricator.wikimedia.org/P62086 and previous config saved to /var/cache/conftool/dbconfig/20240508-142045-marostegui.json
  • 14:18 jiji@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:17 jiji@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:17 moritzm: installing libgd2 security updates
  • 14:15 jiji@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 14:15 jiji@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 14:14 jiji@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:13 jiji@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 14:09 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 14:08 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 13:59 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 13:57 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 13:57 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:55 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 13:54 zabe@deploy1002: Finished scap: Backport for Enable 'flood' user group at en.wikiquote (T351250), Remove wmgCollectionArticleNamespaces config for enWS (T361422) (duration: 19m 22s)
  • 13:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T361627)', diff saved to https://phabricator.wikimedia.org/P62084 and previous config saved to /var/cache/conftool/dbconfig/20240508-135314-marostegui.json
  • 13:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T361627)', diff saved to https://phabricator.wikimedia.org/P62083 and previous config saved to /var/cache/conftool/dbconfig/20240508-135250-marostegui.json
  • 13:52 moritzm: installing Java 17 security updates
  • 13:47 jiji@deploy1002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 13:45 jiji@deploy1002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 13:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1236.eqiad.wmnet
  • 13:43 vgutierrez: update to tcp-mss-clamper 0.5 on ncredir6001
  • 13:41 zabe@deploy1002: zabe and dreamrimmer: Continuing with sync
  • 13:41 vgutierrez: uploaded tcp-mss-clamper 0.5 (bullseye|bookworm)-wikimedia (apt.wm.o)
  • 13:39 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:38 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1236.eqiad.wmnet
  • 13:37 zabe@deploy1002: zabe and dreamrimmer: Backport for Enable 'flood' user group at en.wikiquote (T351250), Remove wmgCollectionArticleNamespaces config for enWS (T361422) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1227.eqiad.wmnet
  • 13:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P62082 and previous config saved to /var/cache/conftool/dbconfig/20240508-133742-marostegui.json
  • 13:35 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1014.eqiad.wmnet
  • 13:35 zabe@deploy1002: Started scap: Backport for Enable 'flood' user group at en.wikiquote (T351250), Remove wmgCollectionArticleNamespaces config for enWS (T361422)
  • 13:32 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1014.eqiad.wmnet
  • 13:31 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1013.eqiad.wmnet
  • 13:28 zabe@deploy1002: Finished scap: Backport for Add tm: as alias to template: on English Wikipedia (T363757), [ruwiki] Limit the use of the ContentTranslation tool (T362440) (duration: 21m 36s)
  • 13:28 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 13:27 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 13:27 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 13:27 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 13:27 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1013.eqiad.wmnet
  • 13:27 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1227.eqiad.wmnet
  • 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1202.eqiad.wmnet
  • 13:23 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1012.eqiad.wmnet
  • 13:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P62081 and previous config saved to /var/cache/conftool/dbconfig/20240508-132235-marostegui.json
  • 13:21 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:17 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1012.eqiad.wmnet
  • 13:16 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1202.eqiad.wmnet
  • 13:15 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1011.eqiad.wmnet
  • 13:14 zabe@deploy1002: zabe and dreamrimmer: Continuing with sync
  • 13:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1191.eqiad.wmnet
  • 13:11 zabe@deploy1002: zabe and dreamrimmer: Backport for Add tm: as alias to template: on English Wikipedia (T363757), [ruwiki] Limit the use of the ContentTranslation tool (T362440) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:10 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1011.eqiad.wmnet
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T361627)', diff saved to https://phabricator.wikimedia.org/P62080 and previous config saved to /var/cache/conftool/dbconfig/20240508-130727-marostegui.json
  • 13:06 zabe@deploy1002: Started scap: Backport for Add tm: as alias to template: on English Wikipedia (T363757), [ruwiki] Limit the use of the ContentTranslation tool (T362440)
  • 13:05 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1010.eqiad.wmnet
  • 12:58 elukey: depool/deploy/repool every node in the range ms-fe10[10-14] to upgrade envoy to PKI TLS certs
  • 12:57 klausman@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 12:57 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1010.eqiad.wmnet
  • 12:56 klausman@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 12:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1191.eqiad.wmnet
  • 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1181.eqiad.wmnet
  • 12:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P62076 and previous config saved to /var/cache/conftool/dbconfig/20240508-122631-marostegui.json
  • 12:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1174.eqiad.wmnet
  • 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1170.eqiad.wmnet
  • 12:16 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2396.codfw.wmnet|mw2397.codfw.wmnet|mw2398.codfw.wmnet|mw2399.codfw.wmnet|mw2401.codfw.wmnet|mw2402.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P62075 and previous config saved to /var/cache/conftool/dbconfig/20240508-121123-marostegui.json
  • 12:08 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1170.eqiad.wmnet
  • 11:57 moritzm: installing tomcat security updates
  • 11:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T361627)', diff saved to https://phabricator.wikimedia.org/P62074 and previous config saved to /var/cache/conftool/dbconfig/20240508-115616-marostegui.json
  • 11:37 hnowlan: running homer commit for new codfw appservers
  • 11:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T361627)', diff saved to https://phabricator.wikimedia.org/P62073 and previous config saved to /var/cache/conftool/dbconfig/20240508-113048-marostegui.json
  • 11:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 11:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 11:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T361627)', diff saved to https://phabricator.wikimedia.org/P62072 and previous config saved to /var/cache/conftool/dbconfig/20240508-113025-marostegui.json
  • 11:24 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62071 and previous config saved to /var/cache/conftool/dbconfig/20240508-112439-root.json
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62070 and previous config saved to /var/cache/conftool/dbconfig/20240508-112054-root.json
  • 11:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1015.eqiad.wmnet
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P62069 and previous config saved to /var/cache/conftool/dbconfig/20240508-111518-marostegui.json
  • 11:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1015.eqiad.wmnet
  • 11:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2397.codfw.wmnet with OS bullseye
  • 11:09 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62068 and previous config saved to /var/cache/conftool/dbconfig/20240508-110933-root.json
  • 11:08 volans@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for sretest1003.eqiad.wmnet: Renew puppet certificate - volans@cumin1002
  • 11:06 volans@cumin1002: START - Cookbook sre.puppet.renew-cert for sretest1003.eqiad.wmnet: Renew puppet certificate - volans@cumin1002
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1011.eqiad.wmnet
  • 11:06 volans@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for sretest1002.eqiad.wmnet: Renew puppet certificate - volans@cumin1002
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62067 and previous config saved to /var/cache/conftool/dbconfig/20240508-110545-root.json
  • 11:05 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2399.codfw.wmnet with OS bullseye
  • 11:03 volans@cumin1002: START - Cookbook sre.puppet.renew-cert for sretest1002.eqiad.wmnet: Renew puppet certificate - volans@cumin1002
  • 11:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2402.codfw.wmnet with OS bullseye
  • 11:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2398.codfw.wmnet with OS bullseye
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P62066 and previous config saved to /var/cache/conftool/dbconfig/20240508-110010-marostegui.json
  • 10:59 volans@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for sretest1001.eqiad.wmnet: Renew puppet certificate - volans@cumin1002
  • 10:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2401.codfw.wmnet with OS bullseye
  • 10:57 volans@cumin1002: START - Cookbook sre.puppet.renew-cert for sretest1001.eqiad.wmnet: Renew puppet certificate - volans@cumin1002
  • 10:55 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2396.codfw.wmnet with OS bullseye
  • 10:54 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62065 and previous config saved to /var/cache/conftool/dbconfig/20240508-105428-root.json
  • 10:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1011.eqiad.wmnet
  • 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62064 and previous config saved to /var/cache/conftool/dbconfig/20240508-105039-root.json
  • 10:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2397.codfw.wmnet with reason: host reimage
  • 10:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2220.codfw.wmnet
  • 10:48 ladsgroup@deploy1002: Finished scap: Backport for pager: Use SelectQueryBuilder::rawTables in IndexPager (T364428) (duration: 15m 42s)
  • 10:46 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2399.codfw.wmnet with reason: host reimage
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T361627)', diff saved to https://phabricator.wikimedia.org/P62063 and previous config saved to /var/cache/conftool/dbconfig/20240508-104503-marostegui.json
  • 10:44 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2402.codfw.wmnet with reason: host reimage
  • 10:41 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2398.codfw.wmnet with reason: host reimage
  • 10:39 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62062 and previous config saved to /var/cache/conftool/dbconfig/20240508-103922-root.json
  • 10:39 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2220.codfw.wmnet
  • 10:38 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2401.codfw.wmnet with reason: host reimage
  • 10:36 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2396.codfw.wmnet with reason: host reimage
  • 10:36 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 10:35 ladsgroup@deploy1002: ladsgroup: Backport for pager: Use SelectQueryBuilder::rawTables in IndexPager (T364428) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62061 and previous config saved to /var/cache/conftool/dbconfig/20240508-103531-root.json
  • 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2218.codfw.wmnet
  • 10:34 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62060 and previous config saved to /var/cache/conftool/dbconfig/20240508-103410-root.json
  • 10:33 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2398.codfw.wmnet with reason: host reimage
  • 10:33 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2397.codfw.wmnet with reason: host reimage
  • 10:33 ladsgroup@deploy1002: Started scap: Backport for pager: Use SelectQueryBuilder::rawTables in IndexPager (T364428)
  • 10:33 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2399.codfw.wmnet with reason: host reimage
  • 10:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2402.codfw.wmnet with reason: host reimage
  • 10:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2401.codfw.wmnet with reason: host reimage
  • 10:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2396.codfw.wmnet with reason: host reimage
  • 10:24 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62059 and previous config saved to /var/cache/conftool/dbconfig/20240508-102416-root.json
  • 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62058 and previous config saved to /var/cache/conftool/dbconfig/20240508-102023-root.json
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1193 (T361627)', diff saved to https://phabricator.wikimedia.org/P62057 and previous config saved to /var/cache/conftool/dbconfig/20240508-101946-marostegui.json
  • 10:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 10:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T361627)', diff saved to https://phabricator.wikimedia.org/P62056 and previous config saved to /var/cache/conftool/dbconfig/20240508-101923-marostegui.json
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62055 and previous config saved to /var/cache/conftool/dbconfig/20240508-101905-root.json
  • 10:19 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1011.eqiad.wmnet with OS bullseye
  • 10:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2399.codfw.wmnet with OS bullseye
  • 10:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2398.codfw.wmnet with OS bullseye
  • 10:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2402.codfw.wmnet with OS bullseye
  • 10:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2401.codfw.wmnet with OS bullseye
  • 10:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2397.codfw.wmnet with OS bullseye
  • 10:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2396.codfw.wmnet with OS bullseye
  • 10:11 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2218.codfw.wmnet
  • 10:09 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62054 and previous config saved to /var/cache/conftool/dbconfig/20240508-100910-root.json
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62053 and previous config saved to /var/cache/conftool/dbconfig/20240508-100517-root.json
  • 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P62052 and previous config saved to /var/cache/conftool/dbconfig/20240508-100416-marostegui.json
  • 10:03 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62051 and previous config saved to /var/cache/conftool/dbconfig/20240508-100359-root.json
  • 09:58 hnowlan: depooling 6 codfw api appservers in advance of reimaging to k8s workers
  • 09:56 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1011.eqiad.wmnet with reason: host reimage
  • 09:54 marostegui@cumin1002: dbctl commit (dc=all): 'es1022 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62050 and previous config saved to /var/cache/conftool/dbconfig/20240508-095405-root.json
  • 09:53 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1011.eqiad.wmnet with reason: host reimage
  • 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1177 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62049 and previous config saved to /var/cache/conftool/dbconfig/20240508-095011-root.json
  • 09:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1022.eqiad.wmnet with OS bookworm
  • 09:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P62048 and previous config saved to /var/cache/conftool/dbconfig/20240508-094905-marostegui.json
  • 09:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1177.eqiad.wmnet with OS bookworm
  • 09:48 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62047 and previous config saved to /var/cache/conftool/dbconfig/20240508-094853-root.json
  • 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2208.codfw.wmnet
  • 09:41 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host snapshot1011.eqiad.wmnet with OS bullseye
  • 09:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T361627)', diff saved to https://phabricator.wikimedia.org/P62046 and previous config saved to /var/cache/conftool/dbconfig/20240508-093350-marostegui.json
  • 09:33 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62045 and previous config saved to /var/cache/conftool/dbconfig/20240508-093347-root.json
  • 09:29 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62044 and previous config saved to /var/cache/conftool/dbconfig/20240508-092944-root.json
  • 09:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
  • 09:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1022.eqiad.wmnet with reason: host reimage
  • 09:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
  • 09:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1022.eqiad.wmnet with reason: host reimage
  • 09:18 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62043 and previous config saved to /var/cache/conftool/dbconfig/20240508-091841-root.json
  • 09:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62042 and previous config saved to /var/cache/conftool/dbconfig/20240508-091434-root.json
  • 09:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1177.eqiad.wmnet with OS bookworm
  • 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1177 T363792', diff saved to https://phabricator.wikimedia.org/P62041 and previous config saved to /var/cache/conftool/dbconfig/20240508-090925-root.json
  • 09:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T361627)', diff saved to https://phabricator.wikimedia.org/P62040 and previous config saved to /var/cache/conftool/dbconfig/20240508-090817-marostegui.json
  • 09:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 09:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T361627)', diff saved to https://phabricator.wikimedia.org/P62039 and previous config saved to /var/cache/conftool/dbconfig/20240508-090754-marostegui.json
  • 09:07 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bookworm
  • 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1022 T364289', diff saved to https://phabricator.wikimedia.org/P62038 and previous config saved to /var/cache/conftool/dbconfig/20240508-090621-root.json
  • 09:03 marostegui@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62037 and previous config saved to /var/cache/conftool/dbconfig/20240508-090334-root.json
  • 08:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62036 and previous config saved to /var/cache/conftool/dbconfig/20240508-085929-root.json
  • 08:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2023.codfw.wmnet with OS bookworm
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P62035 and previous config saved to /var/cache/conftool/dbconfig/20240508-085246-marostegui.json
  • 08:44 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62034 and previous config saved to /var/cache/conftool/dbconfig/20240508-084422-root.json
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P62033 and previous config saved to /var/cache/conftool/dbconfig/20240508-083739-marostegui.json
  • 08:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2208.codfw.wmnet
  • 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2182.codfw.wmnet
  • 08:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2023.codfw.wmnet with reason: host reimage
  • 08:32 klausman@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2023.codfw.wmnet with reason: host reimage
  • 08:31 klausman@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 08:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 08:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 08:29 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62032 and previous config saved to /var/cache/conftool/dbconfig/20240508-082917-root.json
  • 08:24 klausman@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 08:23 klausman@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 08:22 klausman@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T361627)', diff saved to https://phabricator.wikimedia.org/P62031 and previous config saved to /var/cache/conftool/dbconfig/20240508-082231-marostegui.json
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62030 and previous config saved to /var/cache/conftool/dbconfig/20240508-082202-root.json
  • 08:21 klausman@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 08:21 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2182.codfw.wmnet
  • 08:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2168.codfw.wmnet
  • 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62029 and previous config saved to /var/cache/conftool/dbconfig/20240508-081633-root.json
  • 08:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62028 and previous config saved to /var/cache/conftool/dbconfig/20240508-081412-root.json
  • 08:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2023.codfw.wmnet with OS bookworm
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Give some weight to es5 master', diff saved to https://phabricator.wikimedia.org/P62027 and previous config saved to /var/cache/conftool/dbconfig/20240508-080848-marostegui.json
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2023 T364443', diff saved to https://phabricator.wikimedia.org/P62026 and previous config saved to /var/cache/conftool/dbconfig/20240508-080812-root.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62025 and previous config saved to /var/cache/conftool/dbconfig/20240508-080656-root.json
  • 08:06 marostegui: Starting es5 codfw failover from es2023 to es2024 - T364443
  • 08:03 marostegui@cumin1002: dbctl commit (dc=all): 'Set es2024 with weight 0 T364443', diff saved to https://phabricator.wikimedia.org/P62024 and previous config saved to /var/cache/conftool/dbconfig/20240508-080312-root.json
  • 08:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es5 T364443
  • 08:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es5 T364443
  • 08:01 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62023 and previous config saved to /var/cache/conftool/dbconfig/20240508-080128-root.json
  • 07:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1178 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62022 and previous config saved to /var/cache/conftool/dbconfig/20240508-075906-root.json
  • 07:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2168.codfw.wmnet
  • 07:57 Emperor: depool/restart/repool ms-fe1012
  • 07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T361627)', diff saved to https://phabricator.wikimedia.org/P62021 and previous config saved to /var/cache/conftool/dbconfig/20240508-075635-marostegui.json
  • 07:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 07:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T361627)', diff saved to https://phabricator.wikimedia.org/P62020 and previous config saved to /var/cache/conftool/dbconfig/20240508-075610-marostegui.json
  • 07:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2150.codfw.wmnet
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62019 and previous config saved to /var/cache/conftool/dbconfig/20240508-075150-root.json
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62018 and previous config saved to /var/cache/conftool/dbconfig/20240508-074620-root.json
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P62017 and previous config saved to /var/cache/conftool/dbconfig/20240508-074102-marostegui.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62016 and previous config saved to /var/cache/conftool/dbconfig/20240508-073644-root.json
  • 07:33 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2150.codfw.wmnet
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62015 and previous config saved to /var/cache/conftool/dbconfig/20240508-073109-root.json
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P62014 and previous config saved to /var/cache/conftool/dbconfig/20240508-072554-marostegui.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62012 and previous config saved to /var/cache/conftool/dbconfig/20240508-072138-root.json
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62011 and previous config saved to /var/cache/conftool/dbconfig/20240508-071604-root.json
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T361627)', diff saved to https://phabricator.wikimedia.org/P62010 and previous config saved to /var/cache/conftool/dbconfig/20240508-071047-marostegui.json
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62009 and previous config saved to /var/cache/conftool/dbconfig/20240508-070632-root.json
  • 07:02 moritzm: uninstalling git-fat on buster hosts T364373
  • 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62008 and previous config saved to /var/cache/conftool/dbconfig/20240508-070058-root.json
  • 06:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62007 and previous config saved to /var/cache/conftool/dbconfig/20240508-065127-root.json
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'es2025 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62006 and previous config saved to /var/cache/conftool/dbconfig/20240508-064552-root.json
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T361627)', diff saved to https://phabricator.wikimedia.org/P62005 and previous config saved to /var/cache/conftool/dbconfig/20240508-064523-marostegui.json
  • 06:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 06:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 06:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es5 T364443
  • 06:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es5 T364443
  • 06:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2022.codfw.wmnet with OS bookworm
  • 06:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 06:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 06:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T361627)', diff saved to https://phabricator.wikimedia.org/P62004 and previous config saved to /var/cache/conftool/dbconfig/20240508-062012-marostegui.json
  • 06:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2022.codfw.wmnet with reason: host reimage
  • 06:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2022.codfw.wmnet with reason: host reimage
  • 06:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P62003 and previous config saved to /var/cache/conftool/dbconfig/20240508-060501-marostegui.json
  • 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'Give more weight to es2021', diff saved to https://phabricator.wikimedia.org/P62002 and previous config saved to /var/cache/conftool/dbconfig/20240508-060312-marostegui.json
  • 05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2025.codfw.wmnet with OS bookworm
  • 05:50 marostegui@cumin1002: dbctl commit (dc=all): 'Give more weight to es2021', diff saved to https://phabricator.wikimedia.org/P62001 and previous config saved to /var/cache/conftool/dbconfig/20240508-055023-marostegui.json
  • 05:50 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2022.codfw.wmnet with OS bookworm
  • 05:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P62000 and previous config saved to /var/cache/conftool/dbconfig/20240508-054953-marostegui.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'Give more weight to es2021', diff saved to https://phabricator.wikimedia.org/P61999 and previous config saved to /var/cache/conftool/dbconfig/20240508-054825-marostegui.json
  • 05:47 marostegui@cumin1002: dbctl commit (dc=all): 'Give more weight to es2021', diff saved to https://phabricator.wikimedia.org/P61998 and previous config saved to /var/cache/conftool/dbconfig/20240508-054742-marostegui.json
  • 05:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2022', diff saved to https://phabricator.wikimedia.org/P61997 and previous config saved to /var/cache/conftool/dbconfig/20240508-054705-root.json
  • 05:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61996 and previous config saved to /var/cache/conftool/dbconfig/20240508-053445-marostegui.json
  • 05:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2025.codfw.wmnet with reason: host reimage
  • 05:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS bookworm
  • 05:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2025.codfw.wmnet with reason: host reimage
  • 05:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
  • 05:05 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2025.codfw.wmnet with OS bookworm
  • 05:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61995 and previous config saved to /var/cache/conftool/dbconfig/20240508-050419-marostegui.json
  • 05:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
  • 05:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2025', diff saved to https://phabricator.wikimedia.org/P61994 and previous config saved to /var/cache/conftool/dbconfig/20240508-050408-root.json
  • 05:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 05:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 05:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 05:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 04:52 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS bookworm
  • 04:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2203.codfw.wmnet with reason: Maintenance
  • 04:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2203.codfw.wmnet with reason: Maintenance
  • 02:16 eileen: civicrm upgraded from 867c3a0d to bf49ecdc

2024-05-07

  • 23:21 eileen: civicrm upgraded from aee07c4e to 867c3a0d
  • 22:50 eileen: civicrm upgraded from 80ae4543 to aee07c4e
  • 21:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T352010)', diff saved to https://phabricator.wikimedia.org/P61992 and previous config saved to /var/cache/conftool/dbconfig/20240507-215122-ladsgroup.json
  • 21:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P61991 and previous config saved to /var/cache/conftool/dbconfig/20240507-213614-ladsgroup.json
  • 21:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P61990 and previous config saved to /var/cache/conftool/dbconfig/20240507-213227-ladsgroup.json
  • 21:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P61989 and previous config saved to /var/cache/conftool/dbconfig/20240507-212103-ladsgroup.json
  • 21:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P61988 and previous config saved to /var/cache/conftool/dbconfig/20240507-211717-ladsgroup.json
  • 21:15 zabe@deploy1002: Finished scap: Backport for Use OpenSSL for PBKDF2 password hashing (T320929) (duration: 17m 14s)
  • 21:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T352010)', diff saved to https://phabricator.wikimedia.org/P61987 and previous config saved to /var/cache/conftool/dbconfig/20240507-210556-ladsgroup.json
  • 21:03 zabe@deploy1002: zabe and ki: Continuing with sync
  • 21:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P61986 and previous config saved to /var/cache/conftool/dbconfig/20240507-210209-ladsgroup.json
  • 21:01 zabe@deploy1002: zabe and ki: Backport for Use OpenSSL for PBKDF2 password hashing (T320929) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:58 zabe@deploy1002: Started scap: Backport for Use OpenSSL for PBKDF2 password hashing (T320929)
  • 20:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P61985 and previous config saved to /var/cache/conftool/dbconfig/20240507-204701-ladsgroup.json
  • 20:40 zabe@deploy1002: Finished scap: Backport for Avoid empty insert in SqlScoreStorage::storeScores (T364218) (duration: 16m 01s)
  • 20:27 zabe@deploy1002: zabe: Continuing with sync
  • 20:26 zabe@deploy1002: zabe: Backport for Avoid empty insert in SqlScoreStorage::storeScores (T364218) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:24 zabe@deploy1002: Started scap: Backport for Avoid empty insert in SqlScoreStorage::storeScores (T364218)
  • 20:19 jhuneidi@deploy1002: Finished scap: testwikis wikis to 1.43.0-wmf.4 refs T361398 (duration: 15m 03s)
  • 20:17 denisse: Deleting the kibana and kibana-combined certificates from the private repository - T360414
  • 20:09 denisse: Restarting envoyproxy and opensearch-dashboards services on the Logstash hosts that serve OpenSearch dashboards to migrate to CFSSL certificates - T360414
  • 20:06 denisse: Enabling Puppet on the Logstash hosts that serve OpenSearch dashboards to migrate to CFSSL certificates - T360414
  • 20:04 jhuneidi@deploy1002: Started scap: testwikis wikis to 1.43.0-wmf.4 refs T361398
  • 19:59 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.4 refs T361398
  • 19:57 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 12 hosts with reason: Downtiming the Logstash hosts serving OpenSearch Dashboards as part of the cergen to CFSSL migration - T360414
  • 19:57 denisse@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on 12 hosts with reason: Downtiming the Logstash hosts serving OpenSearch Dashboards as part of the cergen to CFSSL migration - T360414
  • 19:46 denisse: disabling Puppet on the Logstash hosts that serve OpenSearch dashboards to test the CFSSL certificates - T360414
  • 19:34 jhuneidi@deploy1002: Finished scap: Backport for Partial cherry-pick of I9d8409fdbd757e (T361398 T362566) (duration: 15m 39s)
  • 19:21 jhuneidi@deploy1002: ladsgroup and jhuneidi: Continuing with sync
  • 19:21 jhuneidi@deploy1002: ladsgroup and jhuneidi: Backport for Partial cherry-pick of I9d8409fdbd757e (T361398 T362566) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:18 jhuneidi@deploy1002: Started scap: Backport for Partial cherry-pick of I9d8409fdbd757e (T361398 T362566)
  • 18:40 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1013.eqiad.wmnet with reason: Decommissioning — T364422
  • 18:40 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on aqs1013.eqiad.wmnet with reason: Decommissioning — T364422
  • 17:33 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
  • 17:32 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/apertium: apply
  • 17:21 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
  • 17:20 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/apertium: apply
  • 17:14 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/apertium: apply
  • 17:13 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/apertium: apply
  • 16:48 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
  • 16:39 elukey@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
  • 16:34 zabe@deploy1002: Finished scap: T363825 (duration: 07m 42s)
  • 16:26 zabe@deploy1002: Started scap: T363825
  • 16:08 zabe@deploy1002: sync-world aborted: (no justification provided) (duration: 00m 00s)
  • 16:08 zabe@deploy1002: Started scap: (no justification provided)
  • 16:05 ladsgroup@deploy1002: Finished scap: Backport for Stop writing to old columns of pagelinks in most wikis (T352010 T299947) (duration: 32m 29s)
  • 15:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P61983 and previous config saved to /var/cache/conftool/dbconfig/20240507-155822-ladsgroup.json
  • 15:58 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 15:58 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 15:52 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 15:38 ladsgroup@deploy1002: ladsgroup: Backport for Stop writing to old columns of pagelinks in most wikis (T352010 T299947) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:34 ejegg: switched Adyen IPN format to JSON in merchant console
  • 15:32 ladsgroup@deploy1002: Started scap: Backport for Stop writing to old columns of pagelinks in most wikis (T352010 T299947)
  • 15:31 ejegg: SmashPig (standalone IPN listener) upgraded from 71b9be53 to 67db9d96
  • 15:29 hnowlan: depooling 5 eqiad api appservers in advance of reimaging to k8s workers
  • 15:19 moritzm: imported nodejs 20.5.1-deb-1nodesource1 to thirdparty/node20 T362681
  • 15:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2122.codfw.wmnet
  • 15:13 godog: remove accidentally set site!=magru silence, add site=magru silence instead - T364016
  • 15:12 elukey: repool ms-fe1009's envoy with PKI TLS cert
  • 15:12 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1009.eqiad.wmnet
  • 14:55 elukey: depool ms-fe1009's nginx (swift proxy) to safely apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1026927
  • 14:54 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1009.eqiad.wmnet
  • 14:53 sukhe: A:cp and A:magru: running haproxy-restart
  • 14:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2122.codfw.wmnet
  • 14:53 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2305.codfw.wmnet|mw2325.codfw.wmnet|mw2338.codfw.wmnet|mw2359.codfw.wmnet|mw2390.codfw.wmnet|mw2407.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 14:52 moritzm: installing mariadb-10.5 security updates (as packaged in Debian, not the wmf-mariadb packages)
  • 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2121.codfw.wmnet
  • 14:50 godog: silence site=magru alerts during prometheus7001 - T364016
  • 14:44 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2121.codfw.wmnet
  • 14:41 hnowlan: running homer 'cr*codfw*' commit to configure BGP for new k8s codfw workers
  • 14:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2338.codfw.wmnet with OS bullseye
  • 14:33 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2325.codfw.wmnet with OS bullseye
  • 14:31 filippo@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host prometheus7001.magru.wmnet
  • 14:31 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host prometheus7001.magru.wmnet with OS bullseye
  • 14:30 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2305.codfw.wmnet with OS bullseye
  • 14:28 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2359.codfw.wmnet with OS bullseye
  • 14:23 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2407.codfw.wmnet with OS bullseye
  • 14:22 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:20 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2390.codfw.wmnet with OS bullseye
  • 14:19 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2338.codfw.wmnet with reason: host reimage
  • 14:16 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus7001.magru.wmnet with reason: host reimage
  • 14:13 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2325.codfw.wmnet with reason: host reimage
  • 14:13 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus7001.magru.wmnet with reason: host reimage
  • 14:12 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@b543b85]: (no justification provided) (duration: 00m 24s)
  • 14:11 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@b543b85]: (no justification provided)
  • 14:10 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2305.codfw.wmnet with reason: host reimage
  • 14:08 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2359.codfw.wmnet with reason: host reimage
  • 14:04 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2407.codfw.wmnet with reason: host reimage
  • 14:03 btullis@deploy1002: Finished deploy [airflow-dags/analytics@6be7efd]: (no justification provided) (duration: 00m 27s)
  • 14:03 btullis@deploy1002: Started deploy [airflow-dags/analytics@6be7efd]: (no justification provided)
  • 14:01 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2390.codfw.wmnet with reason: host reimage
  • 13:57 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2338.codfw.wmnet with reason: host reimage
  • 13:56 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2305.codfw.wmnet with reason: host reimage
  • 13:56 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2325.codfw.wmnet with reason: host reimage
  • 13:56 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2359.codfw.wmnet with reason: host reimage
  • 13:56 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2407.codfw.wmnet with reason: host reimage
  • 13:56 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2390.codfw.wmnet with reason: host reimage
  • 13:53 filippo@cumin1002: START - Cookbook sre.hosts.reimage for host prometheus7001.magru.wmnet with OS bullseye
  • 13:51 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 13:50 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 13:50 filippo@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus7001.magru.wmnet on all recursors
  • 13:50 filippo@cumin1002: START - Cookbook sre.dns.wipe-cache prometheus7001.magru.wmnet on all recursors
  • 13:50 filippo@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:50 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 13:49 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 13:48 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@ad4934c]: (no justification provided) (duration: 00m 32s)
  • 13:47 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@ad4934c]: (no justification provided)
  • 13:44 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 13:44 filippo@cumin1002: START - Cookbook sre.ganeti.makevm for new host prometheus7001.magru.wmnet
  • 13:41 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus7001.magru.wmnet
  • 13:40 filippo@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:40 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1002"
  • 13:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2359.codfw.wmnet with OS bullseye
  • 13:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2390.codfw.wmnet with OS bullseye
  • 13:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2407.codfw.wmnet with OS bullseye
  • 13:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2338.codfw.wmnet with OS bullseye
  • 13:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2305.codfw.wmnet with OS bullseye
  • 13:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2325.codfw.wmnet with OS bullseye
  • 13:40 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1002"
  • 13:36 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 13:31 filippo@cumin1002: START - Cookbook sre.hosts.decommission for hosts prometheus7001.magru.wmnet
  • 13:31 filippo@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus7001.magru.wmnet
  • 13:31 filippo@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 13:29 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 13:29 filippo@cumin1002: START - Cookbook sre.ganeti.makevm for new host prometheus7001.magru.wmnet
  • 13:25 filippo@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus7001.magru.wmnet
  • 13:25 filippo@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus7001.magru.wmnet with OS bullseye
  • 13:21 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 13:19 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 13:17 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 13:14 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 13:05 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 13:04 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 12:48 klausman@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 12:46 klausman@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 12:10 filippo@cumin1002: START - Cookbook sre.hosts.reimage for host prometheus7001.magru.wmnet with OS bullseye
  • 12:09 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 12:08 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 12:08 filippo@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus7001.magru.wmnet on all recursors
  • 12:08 filippo@cumin1002: START - Cookbook sre.dns.wipe-cache prometheus7001.magru.wmnet on all recursors
  • 12:08 filippo@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:08 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 12:07 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 12:05 btullis@deploy1002: Finished deploy [airflow-dags/analytics@e5ba870]: (no justification provided) (duration: 00m 32s)
  • 12:05 btullis@deploy1002: Started deploy [airflow-dags/analytics@e5ba870]: (no justification provided)
  • 12:03 moritzm: installing ruby3.1 security updates
  • 12:02 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 12:02 filippo@cumin1002: START - Cookbook sre.ganeti.makevm for new host prometheus7001.magru.wmnet
  • 11:22 jforrester@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 11:19 jforrester@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 11:19 jforrester@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 11:17 jforrester@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 11:16 jforrester@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 11:15 jforrester@deploy1002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 11:05 hnowlan: depooling 6 codfw appservers in advance of reimaging
  • 10:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1223.eqiad.wmnet
  • 10:28 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus7001.magru.wmnet
  • 10:28 filippo@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:28 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1002"
  • 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot-master (exit_code=0) rolling restart_daemons on A:maps-master
  • 10:25 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1223.eqiad.wmnet
  • 10:25 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot-master rolling restart_daemons on A:maps-master
  • 10:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1198.eqiad.wmnet
  • 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
  • 10:16 jnuche@deploy1002: Finished scap: testwikis wikis to 1.43.0-wmf.4 refs T361398 (duration: 19m 22s)
  • 10:15 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
  • 10:14 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
  • 10:09 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
  • 10:08 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1198.eqiad.wmnet
  • 10:01 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
  • 09:58 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
  • 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1189.eqiad.wmnet
  • 09:57 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1002"
  • 09:56 jnuche@deploy1002: Started scap: testwikis wikis to 1.43.0-wmf.4 refs T361398
  • 09:55 jnuche@deploy1002: sync-world aborted: testwikis wikis to 1.43.0-wmf.4 refs T361398 (duration: 43m 38s)
  • 09:54 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 09:49 filippo@cumin1002: START - Cookbook sre.hosts.decommission for hosts prometheus7001.magru.wmnet
  • 09:41 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1189.eqiad.wmnet
  • 09:40 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1166.eqiad.wmnet
  • 09:39 brouberol@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 09:38 brouberol@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 09:38 brouberol@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 09:37 brouberol@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 09:37 brouberol@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:36 brouberol@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:36 brouberol@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:36 brouberol@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2165 (T352010)', diff saved to https://phabricator.wikimedia.org/P61981 and previous config saved to /var/cache/conftool/dbconfig/20240507-093302-ladsgroup.json
  • 09:32 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 09:32 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 09:31 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1166.eqiad.wmnet
  • 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1157.eqiad.wmnet
  • 09:21 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1157.eqiad.wmnet
  • 09:11 jnuche@deploy1002: Started scap: testwikis wikis to 1.43.0-wmf.4 refs T361398
  • 09:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install7001.wikimedia.org
  • 09:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install7001.wikimedia.org with OS bullseye
  • 09:03 jayme@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:02 jayme@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 09:02 jayme@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 09:01 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 09:01 jayme@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:01 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:00 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:00 jayme@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 08:59 jayme@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:59 jayme@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:53 taavi@deploy1002: Finished scap: Backport for wikitech: Also disable password changes when logged-in (duration: 15m 50s)
  • 08:41 taavi@deploy1002: taavi: Continuing with sync
  • 08:40 taavi@deploy1002: taavi: Backport for wikitech: Also disable password changes when logged-in synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:37 taavi@deploy1002: Started scap: Backport for wikitech: Also disable password changes when logged-in
  • 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install7001.wikimedia.org with reason: host reimage
  • 08:35 moritzm: installing glibc security updates on buster
  • 08:34 zabe@deploy1002: Finished scap: Backport for Use OpenSSL for PBKDF2 password hashing on testwiki (T320929) (duration: 17m 22s)
  • 08:34 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install7001.wikimedia.org with reason: host reimage
  • 08:22 zabe@deploy1002: zabe: Continuing with sync
  • 08:19 zabe@deploy1002: zabe: Backport for Use OpenSSL for PBKDF2 password hashing on testwiki (T320929) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:17 zabe@deploy1002: Started scap: Backport for Use OpenSSL for PBKDF2 password hashing on testwiki (T320929)
  • 08:15 zabe@deploy1002: Finished scap: Backport for Stop setting wgPasswordDefault (duration: 15m 24s)
  • 08:13 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install7001.wikimedia.org with OS bullseye
  • 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install7001.wikimedia.org - jmm@cumin2002"
  • 08:07 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install7001.wikimedia.org - jmm@cumin2002"
  • 08:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install7001.wikimedia.org on all recursors
  • 08:07 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install7001.wikimedia.org on all recursors
  • 08:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install7001.wikimedia.org - jmm@cumin2002"
  • 08:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install7001.wikimedia.org - jmm@cumin2002"
  • 08:03 zabe@deploy1002: zabe: Continuing with sync
  • 08:02 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 08:02 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install7001.wikimedia.org
  • 08:02 zabe@deploy1002: zabe: Backport for Stop setting wgPasswordDefault synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install7001.wikimedia.org
  • 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install7001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:00 zabe@deploy1002: Started scap: Backport for Stop setting wgPasswordDefault
  • 07:50 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install7001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 07:45 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 07:40 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts install7001.wikimedia.org
  • 07:39 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install7001.wikimedia.org with OS bullseye
  • 07:31 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install7001.wikimedia.org with OS bullseye
  • 04:04 mwpresync@deploy1002: Pruned MediaWiki: 1.43.0-wmf.1, 1.43.0-wmf.2 (duration: 04m 50s)
  • 00:47 denisse: Reverting debug changes to their previous state - T364354
  • 00:42 denisse: Writing output to `/tmp/benthos_output.txt` shows that the grok processor's output is being parsed correctly - T364354
  • 00:17 denisse: Adding a logger processor to the `parse_ncredir_log_format` on `ncredir2001` to examine the JSON structure - T364354

2024-05-06

  • 22:22 dzahn@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 22:20 dzahn@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 22:20 dzahn@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 22:18 dzahn@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 22:14 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 22:13 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 21:29 dancy@deploy1002: Installation of scap version "4.82.0" completed for 320 hosts
  • 21:28 dancy@deploy1002: Installing scap version "4.82.0" for 320 hosts
  • 20:47 jdrewniak@deploy1002: Finished scap: Backport for [Vector 2022] Deploy larger font-size and appearance menu to pilot wikis (T362147) (duration: 15m 10s)
  • 20:34 jdrewniak@deploy1002: jdrewniak: Continuing with sync
  • 20:34 jdrewniak@deploy1002: jdrewniak: Backport for [Vector 2022] Deploy larger font-size and appearance menu to pilot wikis (T362147) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:32 jdrewniak@deploy1002: Started scap: Backport for [Vector 2022] Deploy larger font-size and appearance menu to pilot wikis (T362147)
  • 20:27 jdrewniak@deploy1002: Finished scap: Backport for Revert "Revert "Release DT visual enhancements to all except Wikipedia/Commons/Wikidata"" (T352087) (duration: 21m 33s)
  • 20:14 jdrewniak@deploy1002: esanders and jdrewniak: Continuing with sync
  • 20:09 jdrewniak@deploy1002: esanders and jdrewniak: Backport for Revert "Revert "Release DT visual enhancements to all except Wikipedia/Commons/Wikidata"" (T352087) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:05 jdrewniak@deploy1002: Started scap: Backport for Revert "Revert "Release DT visual enhancements to all except Wikipedia/Commons/Wikidata"" (T352087)
  • 19:46 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s7 userOptions.php --delete wlenhancedfilters-seen-tour # T364269
  • 19:25 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s5 userOptions.php --delete rcenhancedfilters-seen-highlight-button-counter # T364269
  • 19:23 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript userOptions.php --wiki=enwiki --delete wlenhancedfilters-seen-tour # T364269
  • 19:21 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s6 userOptions.php --delete rcenhancedfilters-seen-highlight-button-counter # T364269
  • 19:17 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s6 userOptions.php --delete rcenhancedfilters-tried-highlight # T364269
  • 19:17 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s5 userOptions.php --delete rcenhancedfilters-tried-highlight # T364269
  • 19:15 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s3 userOptions.php --delete wlenhancedfilters-seen-tour # T364269
  • 18:20 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s6 userOptions.php --delete wlenhancedfilters-seen-tour # T364269
  • 18:20 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s5 userOptions.php --delete wlenhancedfilters-seen-tour # T364269
  • 18:20 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s4 userOptions.php --delete wlenhancedfilters-seen-tour # T364269
  • 18:20 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s2 userOptions.php --delete wlenhancedfilters-seen-tour # T364269
  • 18:18 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript userOptions.php --wiki=loginwiki --delete wlenhancedfilters-seen-tour # T364269
  • 18:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T361627)', diff saved to https://phabricator.wikimedia.org/P61979 and previous config saved to /var/cache/conftool/dbconfig/20240506-181706-marostegui.json
  • 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P61978 and previous config saved to /var/cache/conftool/dbconfig/20240506-180158-marostegui.json
  • 17:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P61977 and previous config saved to /var/cache/conftool/dbconfig/20240506-174651-marostegui.json
  • 17:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T361627)', diff saved to https://phabricator.wikimedia.org/P61976 and previous config saved to /var/cache/conftool/dbconfig/20240506-173143-marostegui.json
  • 17:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2216 (T361627)', diff saved to https://phabricator.wikimedia.org/P61975 and previous config saved to /var/cache/conftool/dbconfig/20240506-172126-marostegui.json
  • 17:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2216.codfw.wmnet with reason: Maintenance
  • 17:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2216.codfw.wmnet with reason: Maintenance
  • 17:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T361627)', diff saved to https://phabricator.wikimedia.org/P61974 and previous config saved to /var/cache/conftool/dbconfig/20240506-172103-marostegui.json
  • 17:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P61973 and previous config saved to /var/cache/conftool/dbconfig/20240506-170556-marostegui.json
  • 16:52 sukhe: sudo cumin 'A:ncredir' 'run-puppet-agent --enable "merging CR 1028514"'
  • 16:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P61972 and previous config saved to /var/cache/conftool/dbconfig/20240506-165048-marostegui.json
  • 16:45 sukhe: disable puppet on A:ncredir to merge CR 1028514
  • 16:38 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2382.codfw.wmnet
  • 16:38 jayme@cumin1002: START - Cookbook sre.hosts.remove-downtime for mw2382.codfw.wmnet
  • 16:37 jayme@cumin1002: conftool action : set/pooled=yes; selector: name=mw2382.codfw.wmnet
  • 16:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T361627)', diff saved to https://phabricator.wikimedia.org/P61971 and previous config saved to /var/cache/conftool/dbconfig/20240506-163540-marostegui.json
  • 16:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2212 (T361627)', diff saved to https://phabricator.wikimedia.org/P61970 and previous config saved to /var/cache/conftool/dbconfig/20240506-162528-marostegui.json
  • 16:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2212.codfw.wmnet with reason: Maintenance
  • 16:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2212.codfw.wmnet with reason: Maintenance
  • 16:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2202.codfw.wmnet with reason: Maintenance
  • 16:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2202.codfw.wmnet with reason: Maintenance
  • 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T361627)', diff saved to https://phabricator.wikimedia.org/P61969 and previous config saved to /var/cache/conftool/dbconfig/20240506-161624-marostegui.json
  • 16:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P61968 and previous config saved to /var/cache/conftool/dbconfig/20240506-160116-marostegui.json
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61967 and previous config saved to /var/cache/conftool/dbconfig/20240506-155420-root.json
  • 15:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P61966 and previous config saved to /var/cache/conftool/dbconfig/20240506-154608-marostegui.json
  • 15:39 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61965 and previous config saved to /var/cache/conftool/dbconfig/20240506-153914-root.json
  • 15:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T361627)', diff saved to https://phabricator.wikimedia.org/P61964 and previous config saved to /var/cache/conftool/dbconfig/20240506-153101-marostegui.json
  • 15:25 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
  • 15:24 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
  • 15:24 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61963 and previous config saved to /var/cache/conftool/dbconfig/20240506-152408-root.json
  • 15:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T361627)', diff saved to https://phabricator.wikimedia.org/P61962 and previous config saved to /var/cache/conftool/dbconfig/20240506-152040-marostegui.json
  • 15:20 urbanecm@deploy1002: Finished scap: Backport for userOptions.php: Actually batch deletion (T364311) (duration: 16m 51s)
  • 15:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 15:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 15:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T361627)', diff saved to https://phabricator.wikimedia.org/P61961 and previous config saved to /var/cache/conftool/dbconfig/20240506-152016-marostegui.json
  • 15:17 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s{3-8} userOptions.php --delete rcenhancedfilters-seen-tour # T364269
  • 15:16 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s2 userOptions.php --delete rcenhancedfilters-seen-tour # T364269
  • 15:16 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript userOptions.php --wiki=enwiki --delete rcenhancedfilters-seen-tour # T364269
  • 15:12 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
  • 15:11 brouberol@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster cloudelastic: restart to pick up new JDK - brouberol@cumin2002 - T363975
  • 15:09 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/mathoid: apply
  • 15:09 urbanecm: mwmaint1002: mwscript userOptions.php --wiki=loginwiki --delete rcenhancedfilters-seen-tour # T364269
  • 15:09 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61960 and previous config saved to /var/cache/conftool/dbconfig/20240506-150902-root.json
  • 15:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P61959 and previous config saved to /var/cache/conftool/dbconfig/20240506-150508-marostegui.json
  • 15:03 urbanecm@deploy1002: Started scap: Backport for userOptions.php: Actually batch deletion (T364311)
  • 15:03 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
  • 15:01 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/mathoid: apply
  • 14:55 moritzm: installing less security updates
  • 14:54 filippo@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host prometheus7001.magru.wmnet
  • 14:54 filippo@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host prometheus7001.magru.wmnet with OS bullseye
  • 14:53 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61958 and previous config saved to /var/cache/conftool/dbconfig/20240506-145356-root.json
  • 14:51 brouberol@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster cloudelastic: restart to pick up new JDK - brouberol@cumin2002 - T363975
  • 14:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P61957 and previous config saved to /var/cache/conftool/dbconfig/20240506-145001-marostegui.json
  • 14:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2209.codfw.wmnet
  • 14:45 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:45 urbanecm@deploy1002: Finished scap: Backport for Revert "Release DT visual enhancements to all except Wikipedia/Commons/Wikidata" (duration: 18m 11s)
  • 14:45 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61956 and previous config saved to /var/cache/conftool/dbconfig/20240506-143850-root.json
  • 14:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T361627)', diff saved to https://phabricator.wikimedia.org/P61955 and previous config saved to /var/cache/conftool/dbconfig/20240506-143453-marostegui.json
  • 14:32 urbanecm@deploy1002: urbanecm: Continuing with sync
  • 14:32 urbanecm@deploy1002: urbanecm: Backport for Revert "Release DT visual enhancements to all except Wikipedia/Commons/Wikidata" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:28 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:28 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:27 urbanecm@deploy1002: Started scap: Backport for Revert "Release DT visual enhancements to all except Wikipedia/Commons/Wikidata"
  • 14:25 urbanecm@deploy1002: Sync cancelled.
  • 14:23 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61954 and previous config saved to /var/cache/conftool/dbconfig/20240506-142344-root.json
  • 14:23 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:23 filippo@cumin1002: START - Cookbook sre.hosts.reimage for host prometheus7001.magru.wmnet with OS bullseye
  • 14:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T361627)', diff saved to https://phabricator.wikimedia.org/P61953 and previous config saved to /var/cache/conftool/dbconfig/20240506-142316-marostegui.json
  • 14:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 14:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 14:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61952 and previous config saved to /var/cache/conftool/dbconfig/20240506-142253-marostegui.json
  • 14:21 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 14:20 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 14:20 filippo@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus7001.magru.wmnet on all recursors
  • 14:20 filippo@cumin1002: START - Cookbook sre.dns.wipe-cache prometheus7001.magru.wmnet on all recursors
  • 14:20 filippo@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:20 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 14:19 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus7001.magru.wmnet - filippo@cumin1002"
  • 14:17 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 14:17 filippo@cumin1002: START - Cookbook sre.ganeti.makevm for new host prometheus7001.magru.wmnet
  • 14:16 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1004.eqiad.wmnet with OS bookworm
  • 14:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2021.codfw.wmnet with OS bookworm
  • 14:11 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus7001.magru.wmnet
  • 14:11 filippo@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:11 filippo@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1002"
  • 14:11 filippo@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1002"
  • 14:08 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 14:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P61951 and previous config saved to /var/cache/conftool/dbconfig/20240506-140745-marostegui.json
  • 14:04 filippo@cumin1002: START - Cookbook sre.hosts.decommission for hosts prometheus7001.magru.wmnet
  • 13:54 filippo@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus7001.magru.wmnet
  • 13:54 filippo@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus7001.magru.wmnet with OS bullseye
  • 13:53 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudbackup1004.eqiad.wmnet with reason: host reimage
  • 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P61950 and previous config saved to /var/cache/conftool/dbconfig/20240506-135238-marostegui.json
  • 13:51 andrew@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudbackup1004.eqiad.wmnet with reason: host reimage
  • 13:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2021.codfw.wmnet with reason: host reimage
  • 13:49 urbanecm@deploy1002: esanders and urbanecm: Backport for Release DT visual enhancements to all except Wikipedia/Commons/Wikidata (T352087) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2021.codfw.wmnet with reason: host reimage
  • 13:45 urbanecm@deploy1002: Started scap: Backport for Release DT visual enhancements to all except Wikipedia/Commons/Wikidata (T352087)
  • 13:44 urbanecm@deploy1002: Finished scap: Backport for eswiki, commonswiki wikidatawiki: lift IP cap for edit-a-thon (T364039) (duration: 17m 14s)
  • 13:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61949 and previous config saved to /var/cache/conftool/dbconfig/20240506-133728-marostegui.json
  • 13:35 urbanecm: Run `mwscript userOptions.php --wiki=testwiki --delete` for "rcenhancedfilters-seen-tour", "wlenhancedfilters-seen-tour", "rcenhancedfilters-tried-highlight", "rcenhancedfilters-seen-highlight-button-counter" (T364269)
  • 13:33 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudbackup1004.eqiad.wmnet with OS bookworm
  • 13:27 elukey@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=inference,name=eqiad
  • 13:27 urbanecm@deploy1002: Started scap: Backport for eswiki, commonswiki wikidatawiki: lift IP cap for edit-a-thon (T364039)
  • 13:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61948 and previous config saved to /var/cache/conftool/dbconfig/20240506-132635-marostegui.json
  • 13:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 13:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 13:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T361627)', diff saved to https://phabricator.wikimedia.org/P61947 and previous config saved to /var/cache/conftool/dbconfig/20240506-132612-marostegui.json
  • 13:25 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2021.codfw.wmnet with OS bookworm
  • 13:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2021', diff saved to https://phabricator.wikimedia.org/P61946 and previous config saved to /var/cache/conftool/dbconfig/20240506-132424-root.json
  • 13:16 urbanecm@deploy1002: Finished scap: Backport for iglwiki: Enable GrowthExperiments (T364130), Backport several WikimediaMessages patches (T217451 T362538 T364213 T315774 T364269) (duration: 24m 01s)
  • 13:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P61945 and previous config saved to /var/cache/conftool/dbconfig/20240506-131104-marostegui.json
  • 13:09 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 13:07 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61944 and previous config saved to /var/cache/conftool/dbconfig/20240506-130712-root.json
  • 13:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 13:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 13:04 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 13:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 13:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 13:02 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 13:01 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 13:00 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 12:59 sukhe: running authdns-update for removing depooling magru geoip/*
  • 12:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2209.codfw.wmnet
  • 12:57 urbanecm@deploy1002: urbanecm: Continuing with sync
  • 12:57 urbanecm@deploy1002: urbanecm: Backport for iglwiki: Enable GrowthExperiments (T364130), Backport several WikimediaMessages patches (T217451 T362538 T364213 T315774 T364269) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:56 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 12:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P61943 and previous config saved to /var/cache/conftool/dbconfig/20240506-125556-marostegui.json
  • 12:54 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 12:53 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 12:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2205.codfw.wmnet
  • 12:52 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61942 and previous config saved to /var/cache/conftool/dbconfig/20240506-125206-root.json
  • 12:52 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 12:52 urbanecm@deploy1002: Started scap: Backport for iglwiki: Enable GrowthExperiments (T364130), Backport several WikimediaMessages patches (T217451 T362538 T364213 T315774 T364269)
  • 12:51 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 12:51 urbanecm@deploy1002: Sync cancelled.
  • 12:45 urbanecm@deploy1002: urbanecm: Backport for iglwiki: Enable GrowthExperiments (T364130), Backport several WikimediaMessages patches (T217451 T362538 T364213 T315774 T364269) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:27 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=iglwiki growthexperiments # T364130
  • 12:27 elukey@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=inference,name=eqiad
  • 12:26 filippo@cumin1002: START - Cookbook sre.dns.netbox
  • 12:26 filippo@cumin1002: START - Cookbook sre.ganeti.makevm for new host prometheus7001.magru.wmnet
  • 12:25 urbanecm@deploy1002: Started scap: Backport for iglwiki: Enable GrowthExperiments (T364130), Backport several WikimediaMessages patches (T217451 T362538 T364213 T315774 T364269)
  • 12:21 urbanecm: [urbanecm@deploy1002 ~]$ sudo /usr/local/sbin/fix-staging-perms # fixing permissions
  • 12:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61937 and previous config saved to /var/cache/conftool/dbconfig/20240506-122154-root.json
  • 12:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P61936 and previous config saved to /var/cache/conftool/dbconfig/20240506-121515-marostegui.json
  • 12:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61935 and previous config saved to /var/cache/conftool/dbconfig/20240506-120648-root.json
  • 12:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P61934 and previous config saved to /var/cache/conftool/dbconfig/20240506-120007-marostegui.json
  • 11:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61933 and previous config saved to /var/cache/conftool/dbconfig/20240506-115142-root.json
  • 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
  • 11:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61932 and previous config saved to /var/cache/conftool/dbconfig/20240506-114459-marostegui.json
  • 11:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
  • 11:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61931 and previous config saved to /var/cache/conftool/dbconfig/20240506-113636-root.json
  • 11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61930 and previous config saved to /var/cache/conftool/dbconfig/20240506-113511-marostegui.json
  • 11:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 11:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T361627)', diff saved to https://phabricator.wikimedia.org/P61929 and previous config saved to /var/cache/conftool/dbconfig/20240506-113448-marostegui.json
  • 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2002.wikimedia.org
  • 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2194.codfw.wmnet
  • 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test2002.wikimedia.org
  • 11:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P61928 and previous config saved to /var/cache/conftool/dbconfig/20240506-111940-marostegui.json
  • 11:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2024.codfw.wmnet with OS bookworm
  • 11:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P61927 and previous config saved to /var/cache/conftool/dbconfig/20240506-110433-marostegui.json
  • 11:03 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2194.codfw.wmnet
  • 10:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T361627)', diff saved to https://phabricator.wikimedia.org/P61926 and previous config saved to /var/cache/conftool/dbconfig/20240506-104925-marostegui.json
  • 10:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2024.codfw.wmnet with reason: host reimage
  • 10:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2024.codfw.wmnet with reason: host reimage
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T361627)', diff saved to https://phabricator.wikimedia.org/P61925 and previous config saved to /var/cache/conftool/dbconfig/20240506-103848-marostegui.json
  • 10:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 10:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T361627)', diff saved to https://phabricator.wikimedia.org/P61924 and previous config saved to /var/cache/conftool/dbconfig/20240506-103825-marostegui.json
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61923 and previous config saved to /var/cache/conftool/dbconfig/20240506-103814-root.json
  • 10:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2190.codfw.wmnet
  • 10:31 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1178.eqiad.wmnet with OS bookworm
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P61922 and previous config saved to /var/cache/conftool/dbconfig/20240506-102317-marostegui.json
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61921 and previous config saved to /var/cache/conftool/dbconfig/20240506-102307-root.json
  • 10:21 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2024.codfw.wmnet with OS bookworm
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'Give some weight to es2023', diff saved to https://phabricator.wikimedia.org/P61920 and previous config saved to /var/cache/conftool/dbconfig/20240506-101934-marostegui.json
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2024', diff saved to https://phabricator.wikimedia.org/P61919 and previous config saved to /var/cache/conftool/dbconfig/20240506-101911-root.json
  • 10:11 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2190.codfw.wmnet
  • 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2177.codfw.wmnet
  • 10:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P61918 and previous config saved to /var/cache/conftool/dbconfig/20240506-100809-marostegui.json
  • 10:08 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61917 and previous config saved to /var/cache/conftool/dbconfig/20240506-100801-root.json
  • 10:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2177.codfw.wmnet
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T361627)', diff saved to https://phabricator.wikimedia.org/P61916 and previous config saved to /var/cache/conftool/dbconfig/20240506-095302-marostegui.json
  • 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61915 and previous config saved to /var/cache/conftool/dbconfig/20240506-095255-root.json
  • 09:43 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:43 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2146 (T361627)', diff saved to https://phabricator.wikimedia.org/P61914 and previous config saved to /var/cache/conftool/dbconfig/20240506-094158-marostegui.json
  • 09:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 09:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 09:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T361627)', diff saved to https://phabricator.wikimedia.org/P61913 and previous config saved to /var/cache/conftool/dbconfig/20240506-094135-marostegui.json
  • 09:40 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:40 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:39 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:37 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61912 and previous config saved to /var/cache/conftool/dbconfig/20240506-093749-root.json
  • 09:30 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61911 and previous config saved to /var/cache/conftool/dbconfig/20240506-093047-root.json
  • 09:29 moritzm: uploaded openjdk-8 8u412-ga-1~deb10u1 to buster-wikimedia (forward port of latest Java 8 security updates)
  • 09:28 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 09:27 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 09:26 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 09:26 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 09:26 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
  • 09:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P61910 and previous config saved to /var/cache/conftool/dbconfig/20240506-092627-marostegui.json
  • 09:26 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 09:26 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 09:26 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 09:25 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
  • 09:25 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 09:25 jayme@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 09:24 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61909 and previous config saved to /var/cache/conftool/dbconfig/20240506-092244-root.json
  • 09:22 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 09:21 jayme@deploy1002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 09:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2149.codfw.wmnet
  • 09:15 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61908 and previous config saved to /var/cache/conftool/dbconfig/20240506-091541-root.json
  • 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P61907 and previous config saved to /var/cache/conftool/dbconfig/20240506-091120-marostegui.json
  • 09:11 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS bookworm
  • 09:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1178', diff saved to https://phabricator.wikimedia.org/P61906 and previous config saved to /var/cache/conftool/dbconfig/20240506-090759-root.json
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'es1025 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61905 and previous config saved to /var/cache/conftool/dbconfig/20240506-090736-root.json
  • 09:00 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61904 and previous config saved to /var/cache/conftool/dbconfig/20240506-090035-root.json
  • 08:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2149.codfw.wmnet
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T361627)', diff saved to https://phabricator.wikimedia.org/P61903 and previous config saved to /var/cache/conftool/dbconfig/20240506-085612-marostegui.json
  • 08:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1025.eqiad.wmnet with OS bookworm
  • 08:45 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61902 and previous config saved to /var/cache/conftool/dbconfig/20240506-084530-root.json
  • 08:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T361627)', diff saved to https://phabricator.wikimedia.org/P61901 and previous config saved to /var/cache/conftool/dbconfig/20240506-084422-marostegui.json
  • 08:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 08:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61900 and previous config saved to /var/cache/conftool/dbconfig/20240506-083657-root.json
  • 08:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 08:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 08:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T361627)', diff saved to https://phabricator.wikimedia.org/P61899 and previous config saved to /var/cache/conftool/dbconfig/20240506-083507-marostegui.json
  • 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2127.codfw.wmnet
  • 08:30 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61898 and previous config saved to /var/cache/conftool/dbconfig/20240506-083024-root.json
  • 08:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61897 and previous config saved to /var/cache/conftool/dbconfig/20240506-082426-root.json
  • 08:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1025.eqiad.wmnet with reason: host reimage
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61896 and previous config saved to /var/cache/conftool/dbconfig/20240506-082151-root.json
  • 08:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1025.eqiad.wmnet with reason: host reimage
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P61895 and previous config saved to /var/cache/conftool/dbconfig/20240506-082000-marostegui.json
  • 08:15 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61894 and previous config saved to /var/cache/conftool/dbconfig/20240506-081518-root.json
  • 08:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2127.codfw.wmnet
  • 08:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61893 and previous config saved to /var/cache/conftool/dbconfig/20240506-080920-root.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61892 and previous config saved to /var/cache/conftool/dbconfig/20240506-080645-root.json
  • 08:05 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1025.eqiad.wmnet with OS bookworm
  • 08:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P61891 and previous config saved to /var/cache/conftool/dbconfig/20240506-080452-marostegui.json
  • 08:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1025 T364289', diff saved to https://phabricator.wikimedia.org/P61890 and previous config saved to /var/cache/conftool/dbconfig/20240506-080423-root.json
  • 08:00 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61889 and previous config saved to /var/cache/conftool/dbconfig/20240506-080012-root.json
  • 07:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1020.eqiad.wmnet with OS bookworm
  • 07:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61888 and previous config saved to /var/cache/conftool/dbconfig/20240506-075414-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61887 and previous config saved to /var/cache/conftool/dbconfig/20240506-075139-root.json
  • 07:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T361627)', diff saved to https://phabricator.wikimedia.org/P61886 and previous config saved to /var/cache/conftool/dbconfig/20240506-074945-marostegui.json
  • 07:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61885 and previous config saved to /var/cache/conftool/dbconfig/20240506-073909-root.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2130 (T361627)', diff saved to https://phabricator.wikimedia.org/P61884 and previous config saved to /var/cache/conftool/dbconfig/20240506-073826-marostegui.json
  • 07:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 07:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T361627)', diff saved to https://phabricator.wikimedia.org/P61883 and previous config saved to /var/cache/conftool/dbconfig/20240506-073803-marostegui.json
  • 07:37 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) webproxy on magru recursors
  • 07:37 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache webproxy on magru recursors
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61882 and previous config saved to /var/cache/conftool/dbconfig/20240506-073633-root.json
  • 07:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1020.eqiad.wmnet with reason: host reimage
  • 07:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1020.eqiad.wmnet with reason: host reimage
  • 07:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61881 and previous config saved to /var/cache/conftool/dbconfig/20240506-072403-root.json
  • 07:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P61880 and previous config saved to /var/cache/conftool/dbconfig/20240506-072255-marostegui.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61879 and previous config saved to /var/cache/conftool/dbconfig/20240506-072127-root.json
  • 07:13 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1020.eqiad.wmnet with OS bookworm
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1020', diff saved to https://phabricator.wikimedia.org/P61878 and previous config saved to /var/cache/conftool/dbconfig/20240506-071051-root.json
  • 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61877 and previous config saved to /var/cache/conftool/dbconfig/20240506-070857-root.json
  • 07:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P61876 and previous config saved to /var/cache/conftool/dbconfig/20240506-070748-marostegui.json
  • 07:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS bookworm
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1193 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61875 and previous config saved to /var/cache/conftool/dbconfig/20240506-070621-root.json
  • 06:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS bookworm
  • 06:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61874 and previous config saved to /var/cache/conftool/dbconfig/20240506-065351-root.json
  • 06:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T361627)', diff saved to https://phabricator.wikimedia.org/P61873 and previous config saved to /var/cache/conftool/dbconfig/20240506-065239-marostegui.json
  • 06:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
  • 06:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
  • 06:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T361627)', diff saved to https://phabricator.wikimedia.org/P61872 and previous config saved to /var/cache/conftool/dbconfig/20240506-064121-marostegui.json
  • 06:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 06:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 06:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
  • 06:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
  • 06:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS bookworm
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1193', diff saved to https://phabricator.wikimedia.org/P61871 and previous config saved to /var/cache/conftool/dbconfig/20240506-062814-root.json
  • 06:17 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 06:17 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS bookworm
  • 06:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2165 T363977', diff saved to https://phabricator.wikimedia.org/P61870 and previous config saved to /var/cache/conftool/dbconfig/20240506-061416-root.json
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2161 to s8 primary T363977', diff saved to https://phabricator.wikimedia.org/P61869 and previous config saved to /var/cache/conftool/dbconfig/20240506-061311-marostegui.json
  • 06:12 marostegui: Starting s8 codfw failover from db2165 to db2161 - T363977
  • 06:07 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 05:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T363977
  • 05:50 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2161 with weight 0 T363977', diff saved to https://phabricator.wikimedia.org/P61868 and previous config saved to /var/cache/conftool/dbconfig/20240506-055013-root.json
  • 05:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T363977
  • 05:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 05:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2165.codfw.wmnet with reason: Maintenance

2024-05-05

  • 11:09 brennen@deploy1002: Finished deploy [phabricator/deployment@dd53761]: test deploy phab1004 for T364271 (duration: 00m 32s)
  • 11:08 brennen@deploy1002: Started deploy [phabricator/deployment@dd53761]: test deploy phab1004 for T364271
  • 11:08 brennen@deploy1002: Finished deploy [phabricator/deployment@dd53761]: test deploy phab2002 for T364271 (duration: 00m 32s)
  • 11:07 brennen@deploy1002: Started deploy [phabricator/deployment@dd53761]: test deploy phab2002 for T364271
  • 11:04 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab.wmfusercontent.org with reason: brennen is deploying things
  • 11:03 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab.wmfusercontent.org with reason: brennen is deploying things
  • 11:03 taavi@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on phabricator.wikimedia.org with reason: brennen is deploying things
  • 11:03 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phabricator.wikimedia.org with reason: brennen is deploying things
  • 11:03 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: brennen is deploying things
  • 11:03 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: brennen is deploying things
  • 08:42 taavi: taavi@gerrit1003 ~ $ sudo systemctl restart apache2

2024-05-04

  • 13:41 jayme: doubled the number of eventgate-main replicas in eqiad to 16
  • 07:39 taavi@cumin1002: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 07:33 taavi@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 03:07 denisse: Restarting `status curator_actions_cluster_wide.service` to log with DEBUG level on logstash2026 - T364190
  • 03:06 denisse: Enable log level DEBUG for curator on logstash2026 - T364190
  • 01:33 bblack@cumin1002: conftool action : set/weight=100; selector: name=dns7.*
  • 01:24 bblack: lvs7001 - restart pybal
  • 01:23 bblack: lvs7003 - restart pybal

2024-05-03

  • 21:38 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on wdqs2023.codfw.wmnet with reason: T362920
  • 21:38 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 6 days, 0:00:00 on wdqs2023.codfw.wmnet with reason: T362920
  • 21:27 ryankemper: T362920 [wdqs] Depooled `wdqs2023` in preparation to switch it to a graph split host
  • 19:02 sukhe: cleaning up stale confd template files for magru related reimaging
  • 18:44 brett@cumin2002: conftool action : set/pooled=yes; selector: name=ncredir7002.magru.wmnet,service=nginx
  • 18:43 brett@cumin2002: conftool action : set/pooled=yes; selector: name=ncredir7001.magru.wmnet,service=nginx
  • 18:38 brett@cumin2002: conftool action : set/pooled=no; selector: name=ncredir7001.magru.wmnet,service=nginx
  • 18:38 brett@cumin2002: conftool action : set/pooled=no; selector: name=ncredir7002.magru.wmnet,service=nginx
  • 18:29 brett@cumin2002: conftool action : set/pooled=yes; selector: name=ncredir7002.magru.wmnet,service=nginx
  • 18:29 brett@cumin2002: conftool action : set/weight=1; selector: name=ncredir7002.magru.wmnet,service=nginx
  • 18:29 brett@cumin2002: conftool action : set/pooled=yes; selector: name=ncredir7001.magru.wmnet,service=nginx
  • 18:28 brett@cumin2002: conftool action : set/weight=1; selector: name=ncredir7001.magru.wmnet,service=nginx
  • 17:45 dcausse: repooling wdqs1012
  • 17:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 17:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 17:14 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir7002.magru.wmnet
  • 17:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir7002.magru.wmnet with OS bookworm
  • 17:13 denisse: Run `sudo mdadm --add /dev/md1 /dev/sdg` on `centrallog1002` - T363660
  • 17:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 17:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 17:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T361627)', diff saved to https://phabricator.wikimedia.org/P61862 and previous config saved to /var/cache/conftool/dbconfig/20240503-170054-marostegui.json
  • 16:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir7002.magru.wmnet with reason: host reimage
  • 16:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P61860 and previous config saved to /var/cache/conftool/dbconfig/20240503-164546-marostegui.json
  • 16:44 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir7002.magru.wmnet with reason: host reimage
  • 16:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P61859 and previous config saved to /var/cache/conftool/dbconfig/20240503-163039-marostegui.json
  • 16:18 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir7002.magru.wmnet with OS bookworm
  • 16:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T361627)', diff saved to https://phabricator.wikimedia.org/P61858 and previous config saved to /var/cache/conftool/dbconfig/20240503-161531-marostegui.json
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T361627)', diff saved to https://phabricator.wikimedia.org/P61857 and previous config saved to /var/cache/conftool/dbconfig/20240503-155432-marostegui.json
  • 15:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 15:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T361627)', diff saved to https://phabricator.wikimedia.org/P61856 and previous config saved to /var/cache/conftool/dbconfig/20240503-155409-marostegui.json
  • 15:42 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir7002.magru.wmnet - brett@cumin2002"
  • 15:41 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir7002.magru.wmnet - brett@cumin2002"
  • 15:40 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir7002.magru.wmnet on all recursors
  • 15:40 brett@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir7002.magru.wmnet on all recursors
  • 15:40 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:40 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir7002.magru.wmnet - brett@cumin2002"
  • 15:39 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir7002.magru.wmnet - brett@cumin2002"
  • 15:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P61855 and previous config saved to /var/cache/conftool/dbconfig/20240503-153901-marostegui.json
  • 15:34 brett@cumin2002: START - Cookbook sre.dns.netbox
  • 15:34 brett@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir7002.magru.wmnet
  • 15:26 dcausse: depooled wdqs1012 (lagged)
  • 15:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P61854 and previous config saved to /var/cache/conftool/dbconfig/20240503-152354-marostegui.json
  • 15:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T361627)', diff saved to https://phabricator.wikimedia.org/P61853 and previous config saved to /var/cache/conftool/dbconfig/20240503-150846-marostegui.json
  • 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add install7001 - jmm@cumin2002"
  • 14:44 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@5d3a06d] (releasing): update plugins to address vulnerabilities (duration: 00m 39s)
  • 14:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T361627)', diff saved to https://phabricator.wikimedia.org/P61852 and previous config saved to /var/cache/conftool/dbconfig/20240503-144419-marostegui.json
  • 14:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 14:44 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@5d3a06d] (releasing): update plugins to address vulnerabilities
  • 14:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61851 and previous config saved to /var/cache/conftool/dbconfig/20240503-144356-marostegui.json
  • 14:39 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@5d3a06d] (releasing): test plugin update in secondary host (duration: 00m 22s)
  • 14:39 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@5d3a06d] (releasing): test plugin update in secondary host
  • 14:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P61850 and previous config saved to /var/cache/conftool/dbconfig/20240503-142848-marostegui.json
  • 14:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add install7001 - jmm@cumin2002"
  • 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install7001.wikimedia.org
  • 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install7001.wikimedia.org with OS bookworm
  • 14:16 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:15 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 14:14 sukhe: sudo homer asw*magru* commit "add durum and doh hosts in magru"
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P61849 and previous config saved to /var/cache/conftool/dbconfig/20240503-141341-marostegui.json
  • 14:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install7001.wikimedia.org with reason: host reimage
  • 14:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install7001.wikimedia.org with reason: host reimage
  • 14:07 herron: alert1001:~# systemctl restart prometheus-alertmanager.service
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61848 and previous config saved to /var/cache/conftool/dbconfig/20240503-135834-marostegui.json
  • 13:43 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install7001.wikimedia.org with OS bookworm
  • 13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61847 and previous config saved to /var/cache/conftool/dbconfig/20240503-133601-marostegui.json
  • 13:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T361627)', diff saved to https://phabricator.wikimedia.org/P61846 and previous config saved to /var/cache/conftool/dbconfig/20240503-133538-marostegui.json
  • 13:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install7001.wikimedia.org - jmm@cumin2002"
  • 13:29 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install7001.wikimedia.org - jmm@cumin2002"
  • 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install7001.wikimedia.org on all recursors
  • 13:28 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install7001.wikimedia.org on all recursors
  • 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install7001.wikimedia.org - jmm@cumin2002"
  • 13:26 elukey: restart karma on alert1001 to verify if probe down alerts shown are stale
  • 13:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install7001.wikimedia.org - jmm@cumin2002"
  • 13:23 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:22 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 13:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P61845 and previous config saved to /var/cache/conftool/dbconfig/20240503-132030-marostegui.json
  • 13:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P61844 and previous config saved to /var/cache/conftool/dbconfig/20240503-130523-marostegui.json
  • 13:04 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:03 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 12:51 cmooney@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 12:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T361627)', diff saved to https://phabricator.wikimedia.org/P61843 and previous config saved to /var/cache/conftool/dbconfig/20240503-125015-marostegui.json
  • 12:47 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 12:26 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61841 and previous config saved to /var/cache/conftool/dbconfig/20240503-122659-root.json
  • 12:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T361627)', diff saved to https://phabricator.wikimedia.org/P61840 and previous config saved to /var/cache/conftool/dbconfig/20240503-122510-marostegui.json
  • 12:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 12:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T361627)', diff saved to https://phabricator.wikimedia.org/P61839 and previous config saved to /var/cache/conftool/dbconfig/20240503-122446-marostegui.json
  • 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61838 and previous config saved to /var/cache/conftool/dbconfig/20240503-121153-root.json
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P61837 and previous config saved to /var/cache/conftool/dbconfig/20240503-120938-marostegui.json
  • 12:06 topranks: removing entries for lsw1-a1-codfw switch and private1-a1-codfw vlan from puppet T364097
  • 12:02 sukhe@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh7002.wikimedia.org
  • 12:02 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh7002.wikimedia.org with OS bookworm
  • 12:01 moritzm: uploaded wmf-sre-laptop 0.5.10 to apt.wikimedia.org
  • 11:57 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:57 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove lsw1-a1-codfw phyiscal link dns - cmooney@cumin1002"
  • 11:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61835 and previous config saved to /var/cache/conftool/dbconfig/20240503-115647-root.json
  • 11:55 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove lsw1-a1-codfw phyiscal link dns - cmooney@cumin1002"
  • 11:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P61834 and previous config saved to /var/cache/conftool/dbconfig/20240503-115431-marostegui.json
  • 11:53 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 11:45 sukhe@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum7002.magru.wmnet
  • 11:45 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum7002.magru.wmnet with OS bookworm
  • 11:44 topranks: Removing connections from ssw1-a1-codfw and ssw1-a8-codfw to lsw1-a1-codfw T364097
  • 11:41 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh7002.wikimedia.org with reason: host reimage
  • 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61833 and previous config saved to /var/cache/conftool/dbconfig/20240503-114141-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T361627)', diff saved to https://phabricator.wikimedia.org/P61832 and previous config saved to /var/cache/conftool/dbconfig/20240503-113924-marostegui.json
  • 11:38 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh7002.wikimedia.org with reason: host reimage
  • 11:27 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum7002.magru.wmnet with reason: host reimage
  • 11:26 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61831 and previous config saved to /var/cache/conftool/dbconfig/20240503-112635-root.json
  • 11:23 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum7002.magru.wmnet with reason: host reimage
  • 11:19 taavi@cumin1002: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 11:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM durum7001.magru.wmnet
  • 11:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS bookworm
  • 11:16 taavi@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 11:16 taavi@cumin1002: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=93)
  • 11:15 taavi@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T361627)', diff saved to https://phabricator.wikimedia.org/P61830 and previous config saved to /var/cache/conftool/dbconfig/20240503-111415-marostegui.json
  • 11:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T361627)', diff saved to https://phabricator.wikimedia.org/P61829 and previous config saved to /var/cache/conftool/dbconfig/20240503-111337-marostegui.json
  • 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM durum7001.magru.wmnet
  • 11:11 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host doh7002.wikimedia.org with OS bookworm
  • 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61828 and previous config saved to /var/cache/conftool/dbconfig/20240503-111129-root.json
  • 11:11 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh7002.wikimedia.org - sukhe@cumin1002"
  • 11:10 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh7002.wikimedia.org - sukhe@cumin1002"
  • 11:09 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh7002.wikimedia.org on all recursors
  • 11:09 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache doh7002.wikimedia.org on all recursors
  • 11:09 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:09 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh7002.wikimedia.org - sukhe@cumin1002"
  • 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM doh7001.wikimedia.org
  • 11:08 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh7002.wikimedia.org - sukhe@cumin1002"
  • 11:06 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 11:06 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host doh7002.wikimedia.org
  • 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM doh7001.wikimedia.org
  • 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ncredir7001.magru.wmnet
  • 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ncredir7001.magru.wmnet
  • 10:58 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host durum7002.magru.wmnet with OS bookworm
  • 10:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240503-105824-marostegui.json
  • 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM netflow7001.magru.wmnet
  • 10:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
  • 10:54 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum7002.magru.wmnet - sukhe@cumin1002"
  • 10:53 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum7002.magru.wmnet - sukhe@cumin1002"
  • 10:53 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum7002.magru.wmnet on all recursors
  • 10:53 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache durum7002.magru.wmnet on all recursors
  • 10:53 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:53 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum7002.magru.wmnet - sukhe@cumin1002"
  • 10:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
  • 10:52 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum7002.magru.wmnet - sukhe@cumin1002"
  • 10:51 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM netflow7001.magru.wmnet
  • 10:50 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 10:50 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host durum7002.magru.wmnet
  • 10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P61827 and previous config saved to /var/cache/conftool/dbconfig/20240503-104317-marostegui.json
  • 10:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS bookworm
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1203', diff saved to https://phabricator.wikimedia.org/P61826 and previous config saved to /var/cache/conftool/dbconfig/20240503-103814-root.json
  • 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add bast7001 - jmm@cumin2002 - T364016"
  • 10:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add bast7001 - jmm@cumin2002 - T364016"
  • 10:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T361627)', diff saved to https://phabricator.wikimedia.org/P61825 and previous config saved to /var/cache/conftool/dbconfig/20240503-102809-marostegui.json
  • 10:27 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lsw1-a1-codfw,lsw1-a1-codfw IPv6,lsw1-a1-codfw.mgmt with reason: device being decommed and renamed, downtiming as a precaution first
  • 10:27 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on lsw1-a1-codfw,lsw1-a1-codfw IPv6,lsw1-a1-codfw.mgmt with reason: device being decommed and renamed, downtiming as a precaution first
  • 10:15 moritzm: installing Java 17 security updates on idp-test
  • 10:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T361627)', diff saved to https://phabricator.wikimedia.org/P61823 and previous config saved to /var/cache/conftool/dbconfig/20240503-100335-marostegui.json
  • 10:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 10:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 10:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T361627)', diff saved to https://phabricator.wikimedia.org/P61822 and previous config saved to /var/cache/conftool/dbconfig/20240503-100313-marostegui.json
  • 09:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P61821 and previous config saved to /var/cache/conftool/dbconfig/20240503-094805-marostegui.json
  • 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P61820 and previous config saved to /var/cache/conftool/dbconfig/20240503-093257-marostegui.json
  • 09:26 pfischer@deploy1002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T361627)', diff saved to https://phabricator.wikimedia.org/P61818 and previous config saved to /var/cache/conftool/dbconfig/20240503-091750-marostegui.json
  • 09:11 pfischer@deploy1002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T361627)', diff saved to https://phabricator.wikimedia.org/P61817 and previous config saved to /var/cache/conftool/dbconfig/20240503-085234-marostegui.json
  • 08:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host bast7001.wikimedia.org
  • 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast7001.wikimedia.org with OS bookworm
  • 08:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T361627)', diff saved to https://phabricator.wikimedia.org/P61816 and previous config saved to /var/cache/conftool/dbconfig/20240503-085211-marostegui.json
  • 08:48 XioNoX: restart turnilo
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P61815 and previous config saved to /var/cache/conftool/dbconfig/20240503-083703-marostegui.json
  • 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast7001.wikimedia.org with reason: host reimage
  • 08:33 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast7001.wikimedia.org with reason: host reimage
  • 08:30 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P61814 and previous config saved to /var/cache/conftool/dbconfig/20240503-082156-marostegui.json
  • 08:20 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 08:11 moritzm: installing emacs security updates
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T361627)', diff saved to https://phabricator.wikimedia.org/P61813 and previous config saved to /var/cache/conftool/dbconfig/20240503-080649-marostegui.json
  • 08:05 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host bast7001.wikimedia.org with OS bookworm
  • 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast7001.wikimedia.org - jmm@cumin2002"
  • 08:00 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast7001.wikimedia.org - jmm@cumin2002"
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) bast7001.wikimedia.org on all recursors
  • 07:59 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache bast7001.wikimedia.org on all recursors
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast7001.wikimedia.org - jmm@cumin2002"
  • 07:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast7001.wikimedia.org - jmm@cumin2002"
  • 07:53 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 07:53 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host bast7001.wikimedia.org
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T361627)', diff saved to https://phabricator.wikimedia.org/P61812 and previous config saved to /var/cache/conftool/dbconfig/20240503-074135-marostegui.json
  • 07:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 07:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T361627)', diff saved to https://phabricator.wikimedia.org/P61811 and previous config saved to /var/cache/conftool/dbconfig/20240503-074112-marostegui.json
  • 07:33 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7004.magru.wmnet to cluster magru02 and group B4
  • 07:32 zabe: zabe@mwmaint1002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=metawiki --logwiki=metawiki 'Arnadh2011' 'User435211' # T363654
  • 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti7004.magru.wmnet to cluster magru02 and group B4
  • 07:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P61810 and previous config saved to /var/cache/conftool/dbconfig/20240503-072604-marostegui.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61809 and previous config saved to /var/cache/conftool/dbconfig/20240503-071853-root.json
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P61808 and previous config saved to /var/cache/conftool/dbconfig/20240503-071057-marostegui.json
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61807 and previous config saved to /var/cache/conftool/dbconfig/20240503-070347-root.json
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T361627)', diff saved to https://phabricator.wikimedia.org/P61806 and previous config saved to /var/cache/conftool/dbconfig/20240503-065547-marostegui.json
  • 06:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61805 and previous config saved to /var/cache/conftool/dbconfig/20240503-064842-root.json
  • 06:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61804 and previous config saved to /var/cache/conftool/dbconfig/20240503-063336-root.json
  • 06:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T361627)', diff saved to https://phabricator.wikimedia.org/P61803 and previous config saved to /var/cache/conftool/dbconfig/20240503-063048-marostegui.json
  • 06:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T361627)', diff saved to https://phabricator.wikimedia.org/P61802 and previous config saved to /var/cache/conftool/dbconfig/20240503-063025-marostegui.json
  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61801 and previous config saved to /var/cache/conftool/dbconfig/20240503-061830-root.json
  • 06:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P61800 and previous config saved to /var/cache/conftool/dbconfig/20240503-061517-marostegui.json
  • 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61799 and previous config saved to /var/cache/conftool/dbconfig/20240503-060324-root.json
  • 06:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P61798 and previous config saved to /var/cache/conftool/dbconfig/20240503-060010-marostegui.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61797 and previous config saved to /var/cache/conftool/dbconfig/20240503-054818-root.json
  • 05:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS bookworm
  • 05:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T361627)', diff saved to https://phabricator.wikimedia.org/P61796 and previous config saved to /var/cache/conftool/dbconfig/20240503-054502-marostegui.json
  • 05:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
  • 05:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T361627)', diff saved to https://phabricator.wikimedia.org/P61795 and previous config saved to /var/cache/conftool/dbconfig/20240503-052430-marostegui.json
  • 05:24 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 05:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
  • 05:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 05:11 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS bookworm
  • 05:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1214', diff saved to https://phabricator.wikimedia.org/P61794 and previous config saved to /var/cache/conftool/dbconfig/20240503-050947-root.json
  • 04:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 04:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 01:04 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-eqiad: Apply updated JDK 8 - eevans@cumin1002
  • 01:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 01:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 01:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T361627)', diff saved to https://phabricator.wikimedia.org/P61793 and previous config saved to /var/cache/conftool/dbconfig/20240503-010330-marostegui.json
  • 00:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P61792 and previous config saved to /var/cache/conftool/dbconfig/20240503-004821-marostegui.json
  • 00:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P61791 and previous config saved to /var/cache/conftool/dbconfig/20240503-003313-marostegui.json
  • 00:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T361627)', diff saved to https://phabricator.wikimedia.org/P61790 and previous config saved to /var/cache/conftool/dbconfig/20240503-001805-marostegui.json
  • 00:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1249 (T361627)', diff saved to https://phabricator.wikimedia.org/P61789 and previous config saved to /var/cache/conftool/dbconfig/20240503-000614-marostegui.json
  • 00:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 00:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 00:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T361627)', diff saved to https://phabricator.wikimedia.org/P61788 and previous config saved to /var/cache/conftool/dbconfig/20240503-000602-marostegui.json

2024-05-02

  • 23:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P61787 and previous config saved to /var/cache/conftool/dbconfig/20240502-235053-marostegui.json
  • 23:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P61786 and previous config saved to /var/cache/conftool/dbconfig/20240502-233545-marostegui.json
  • 23:33 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-eqiad: Apply updated JDK 8 - eevans@cumin1002
  • 23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T361627)', diff saved to https://phabricator.wikimedia.org/P61785 and previous config saved to /var/cache/conftool/dbconfig/20240502-232037-marostegui.json
  • 22:44 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-codfw: Apply updated JDK 8 - eevans@cumin1002
  • 22:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1248 (T361627)', diff saved to https://phabricator.wikimedia.org/P61784 and previous config saved to /var/cache/conftool/dbconfig/20240502-224227-marostegui.json
  • 22:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 22:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 22:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T361627)', diff saved to https://phabricator.wikimedia.org/P61783 and previous config saved to /var/cache/conftool/dbconfig/20240502-224204-marostegui.json
  • 22:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P61782 and previous config saved to /var/cache/conftool/dbconfig/20240502-222656-marostegui.json
  • 22:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P61781 and previous config saved to /var/cache/conftool/dbconfig/20240502-221149-marostegui.json
  • 21:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T361627)', diff saved to https://phabricator.wikimedia.org/P61780 and previous config saved to /var/cache/conftool/dbconfig/20240502-215641-marostegui.json
  • 21:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir7001.magru.wmnet with OS bookworm
  • 21:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1247 (T361627)', diff saved to https://phabricator.wikimedia.org/P61779 and previous config saved to /var/cache/conftool/dbconfig/20240502-214435-marostegui.json
  • 21:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 21:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 21:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 21:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 21:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T361627)', diff saved to https://phabricator.wikimedia.org/P61778 and previous config saved to /var/cache/conftool/dbconfig/20240502-213631-marostegui.json
  • 21:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir7001.magru.wmnet with reason: host reimage
  • 21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P61777 and previous config saved to /var/cache/conftool/dbconfig/20240502-212123-marostegui.json
  • 21:19 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir7001.magru.wmnet with reason: host reimage
  • 21:12 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Apply updated JDK 8 - eevans@cumin1002
  • 21:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P61776 and previous config saved to /var/cache/conftool/dbconfig/20240502-210613-marostegui.json
  • 20:53 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Apply updated JDK 8 - eevans@cumin1002
  • 20:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T361627)', diff saved to https://phabricator.wikimedia.org/P61775 and previous config saved to /var/cache/conftool/dbconfig/20240502-205105-marostegui.json
  • 20:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1244 (T361627)', diff saved to https://phabricator.wikimedia.org/P61774 and previous config saved to /var/cache/conftool/dbconfig/20240502-204208-marostegui.json
  • 20:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 20:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 20:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T361627)', diff saved to https://phabricator.wikimedia.org/P61773 and previous config saved to /var/cache/conftool/dbconfig/20240502-204146-marostegui.json
  • 20:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir7001.magru.wmnet with OS bookworm
  • 20:32 jdrewniak@deploy1002: Sync cancelled.
  • 20:30 jdrewniak@deploy1002: jdrewniak: Backport for Revert "Deploy Vector appearance menu and increased font-size to plwiki" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P61772 and previous config saved to /var/cache/conftool/dbconfig/20240502-202638-marostegui.json
  • 20:25 jdrewniak@deploy1002: Started scap: Backport for Revert "Deploy Vector appearance menu and increased font-size to plwiki"
  • 20:21 jdrewniak@deploy1002: Sync cancelled.
  • 20:14 jdrewniak@deploy1002: bwang and jdrewniak: Backport for Update wgVectorClientPrefs to wgVectorAppearance (T362808), Deploy Vector appearance menu and increased font-size to plwiki (T362147) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P61771 and previous config saved to /var/cache/conftool/dbconfig/20240502-201131-marostegui.json
  • 20:09 jdrewniak@deploy1002: Started scap: Backport for Update wgVectorClientPrefs to wgVectorAppearance (T362808), Deploy Vector appearance menu and increased font-size to plwiki (T362147)
  • 20:04 cdanis@deploy1002: Finished scap: Backport for probenet: add magru measurement endpoint (T362902) (duration: 18m 19s)
  • 19:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T361627)', diff saved to https://phabricator.wikimedia.org/P61770 and previous config saved to /var/cache/conftool/dbconfig/20240502-195623-marostegui.json
  • 19:50 cdanis@deploy1002: cdanis: Continuing with sync
  • 19:50 cdanis@deploy1002: cdanis: Backport for probenet: add magru measurement endpoint (T362902) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:49 brett@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host ncredir7001.magru.wmnet
  • 19:49 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ncredir7001.magru.wmnet with OS bookworm
  • 19:45 cdanis@deploy1002: Started scap: Backport for probenet: add magru measurement endpoint (T362902)
  • 19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1243 (T361627)', diff saved to https://phabricator.wikimedia.org/P61769 and previous config saved to /var/cache/conftool/dbconfig/20240502-194513-marostegui.json
  • 19:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 19:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 19:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T361627)', diff saved to https://phabricator.wikimedia.org/P61768 and previous config saved to /var/cache/conftool/dbconfig/20240502-194450-marostegui.json
  • 19:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1181 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P61767 and previous config saved to /var/cache/conftool/dbconfig/20240502-194127-ladsgroup.json
  • 19:36 sukhe@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh7001.wikimedia.org
  • 19:36 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh7001.wikimedia.org with OS bookworm
  • 19:33 amastilovic@deploy1002: Finished deploy [airflow-dags/analytics@4edc35c]: (no justification provided) (duration: 00m 38s)
  • 19:32 amastilovic@deploy1002: Started deploy [airflow-dags/analytics@4edc35c]: (no justification provided)
  • 19:31 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 19:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P61766 and previous config saved to /var/cache/conftool/dbconfig/20240502-192942-marostegui.json
  • 19:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1181 (re)pooling @ 75%: Maint over', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20240502-192621-ladsgroup.json
  • 19:21 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P61765 and previous config saved to /var/cache/conftool/dbconfig/20240502-191434-marostegui.json
  • 19:11 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh7001.wikimedia.org with reason: host reimage
  • 19:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1181 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P61764 and previous config saved to /var/cache/conftool/dbconfig/20240502-191115-ladsgroup.json
  • 19:08 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh7001.wikimedia.org with reason: host reimage
  • 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T361627)', diff saved to https://phabricator.wikimedia.org/P61763 and previous config saved to /var/cache/conftool/dbconfig/20240502-185926-marostegui.json
  • 18:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1181 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P61762 and previous config saved to /var/cache/conftool/dbconfig/20240502-185609-ladsgroup.json
  • 18:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1242 (T361627)', diff saved to https://phabricator.wikimedia.org/P61761 and previous config saved to /var/cache/conftool/dbconfig/20240502-184710-marostegui.json
  • 18:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 18:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 18:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T361627)', diff saved to https://phabricator.wikimedia.org/P61760 and previous config saved to /var/cache/conftool/dbconfig/20240502-184658-marostegui.json
  • 18:41 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host doh7001.wikimedia.org with OS bookworm
  • 18:40 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:35 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Apply updated JDK 8 - eevans@cumin1002
  • 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P61759 and previous config saved to /var/cache/conftool/dbconfig/20240502-183151-marostegui.json
  • 18:24 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:23 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh7001.wikimedia.org on all recursors
  • 18:23 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache doh7001.wikimedia.org on all recursors
  • 18:23 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:23 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:22 sukhe: sudo cumin -b1 -s900 "A:dnsbox" "systemctl restart ntp.service"
  • 18:22 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:20 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir7001.magru.wmnet with OS bookworm
  • 18:19 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir7001.magru.wmnet - brett@cumin2002"
  • 18:18 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir7001.magru.wmnet - brett@cumin2002"
  • 18:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir7001.magru.wmnet on all recursors
  • 18:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir7001.magru.wmnet on all recursors
  • 18:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:18 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir7001.magru.wmnet - brett@cumin2002"
  • 18:17 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir7001.magru.wmnet - brett@cumin2002"
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P61758 and previous config saved to /var/cache/conftool/dbconfig/20240502-181643-marostegui.json
  • 18:11 sukhe: magru: setting weights on cp servers and pooling
  • 18:10 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 18:10 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host doh7001.wikimedia.org
  • 18:09 sukhe@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh7001.wikimedia.org
  • 18:09 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh7001.wikimedia.org on all recursors
  • 18:09 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache doh7001.wikimedia.org on all recursors
  • 18:09 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:09 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:08 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:05 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Apply updated JDK 8 - eevans@cumin1002
  • 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T361627)', diff saved to https://phabricator.wikimedia.org/P61756 and previous config saved to /var/cache/conftool/dbconfig/20240502-180136-marostegui.json
  • 17:58 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 17:55 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 17:55 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh7001.wikimedia.org on all recursors
  • 17:55 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache doh7001.wikimedia.org on all recursors
  • 17:55 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:53 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 17:53 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 17:52 sukhe@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 17:50 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 17:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1241 (T361627)', diff saved to https://phabricator.wikimedia.org/P61755 and previous config saved to /var/cache/conftool/dbconfig/20240502-174920-marostegui.json
  • 17:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 17:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T361627)', diff saved to https://phabricator.wikimedia.org/P61754 and previous config saved to /var/cache/conftool/dbconfig/20240502-174856-marostegui.json
  • 17:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P61753 and previous config saved to /var/cache/conftool/dbconfig/20240502-173349-marostegui.json
  • 17:24 brett@cumin2002: START - Cookbook sre.dns.netbox
  • 17:24 brett@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir7001.magru.wmnet
  • 17:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P61752 and previous config saved to /var/cache/conftool/dbconfig/20240502-171840-marostegui.json
  • 17:15 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 17:15 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 17:05 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 17:05 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host doh7001.wikimedia.org
  • 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T361627)', diff saved to https://phabricator.wikimedia.org/P61751 and previous config saved to /var/cache/conftool/dbconfig/20240502-170332-marostegui.json
  • 16:53 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:52 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1221 (T361627)', diff saved to https://phabricator.wikimedia.org/P61750 and previous config saved to /var/cache/conftool/dbconfig/20240502-165211-marostegui.json
  • 16:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T361627)', diff saved to https://phabricator.wikimedia.org/P61749 and previous config saved to /var/cache/conftool/dbconfig/20240502-165129-marostegui.json
  • 16:40 sukhe@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum7001.magru.wmnet
  • 16:40 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum7001.magru.wmnet with OS bookworm
  • 16:39 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:38 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P61748 and previous config saved to /var/cache/conftool/dbconfig/20240502-163622-marostegui.json
  • 16:21 amastilovic@deploy1002: Finished deploy [airflow-dags/analytics@7513bfa]: (no justification provided) (duration: 00m 44s)
  • 16:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P61747 and previous config saved to /var/cache/conftool/dbconfig/20240502-162114-marostegui.json
  • 16:20 amastilovic@deploy1002: Started deploy [airflow-dags/analytics@7513bfa]: (no justification provided)
  • 16:16 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum7001.magru.wmnet with reason: host reimage
  • 16:15 sukhe: running authdns-update once again to confirm state of dns700[12]
  • 16:14 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:14 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: force update dns7x - sukhe@cumin1002"
  • 16:13 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum7001.magru.wmnet with reason: host reimage
  • 16:12 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: force update dns7x - sukhe@cumin1002"
  • 16:12 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:12 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:11 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 16:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T361627)', diff saved to https://phabricator.wikimedia.org/P61746 and previous config saved to /var/cache/conftool/dbconfig/20240502-160606-marostegui.json
  • 16:05 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:03 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 15:56 sukhe: running authdns-update
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1199 (T361627)', diff saved to https://phabricator.wikimedia.org/P61744 and previous config saved to /var/cache/conftool/dbconfig/20240502-155359-marostegui.json
  • 15:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 15:53 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Apply updated JDK 8 - eevans@cumin1002
  • 15:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 15:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T361627)', diff saved to https://phabricator.wikimedia.org/P61743 and previous config saved to /var/cache/conftool/dbconfig/20240502-155336-marostegui.json
  • 15:51 moritzm: installing postgresql-15 security updates
  • 15:51 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns7002.wikimedia.org,service=(authdns-update|recdns|ntp)
  • 15:51 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns7001.wikimedia.org,service=(authdns-update|recdns|ntp)
  • 15:44 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host durum7001.magru.wmnet with OS bookworm
  • 15:43 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum7001.magru.wmnet - sukhe@cumin1002"
  • 15:43 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum7001.magru.wmnet - sukhe@cumin1002"
  • 15:42 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum7001.magru.wmnet on all recursors
  • 15:42 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache durum7001.magru.wmnet on all recursors
  • 15:42 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:42 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum7001.magru.wmnet - sukhe@cumin1002"
  • 15:41 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum7001.magru.wmnet - sukhe@cumin1002"
  • 15:39 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 15:39 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host durum7001.magru.wmnet
  • 15:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P61741 and previous config saved to /var/cache/conftool/dbconfig/20240502-153828-marostegui.json
  • 15:34 elukey@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore1*: Move to PKI Truststore - elukey@cumin1002
  • 15:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netflow7001.magru.wmnet
  • 15:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netflow7001.magru.wmnet with OS bookworm
  • 15:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P61740 and previous config saved to /var/cache/conftool/dbconfig/20240502-152319-marostegui.json
  • 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new VIP for ganeti/magru02 - jmm@cumin2002"
  • 15:15 elukey@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore1*: Move to PKI Truststore - elukey@cumin1002
  • 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61739 and previous config saved to /var/cache/conftool/dbconfig/20240502-151407-root.json
  • 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61738 and previous config saved to /var/cache/conftool/dbconfig/20240502-151403-root.json
  • 15:13 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 15:12 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 15:12 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 15:12 elukey@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore200[5,6]*: Move to PKI Truststore - elukey@cumin1002
  • 15:12 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 15:12 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 15:11 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:10 hnowlan: Move mw-on-k8s traffic percentage from 80% to 85%
  • 15:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T361627)', diff saved to https://phabricator.wikimedia.org/P61737 and previous config saved to /var/cache/conftool/dbconfig/20240502-150812-marostegui.json
  • 15:03 elukey@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=inference,name=codfw
  • 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
  • 15:00 elukey@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore200[5,6]*: Move to PKI Truststore - elukey@cumin1002
  • 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61736 and previous config saved to /var/cache/conftool/dbconfig/20240502-145901-root.json
  • 14:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61735 and previous config saved to /var/cache/conftool/dbconfig/20240502-145856-root.json
  • 14:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new VIP for ganeti/magru02 - jmm@cumin2002"
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1190 (T361627)', diff saved to https://phabricator.wikimedia.org/P61734 and previous config saved to /var/cache/conftool/dbconfig/20240502-145632-marostegui.json
  • 14:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 14:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T361627)', diff saved to https://phabricator.wikimedia.org/P61733 and previous config saved to /var/cache/conftool/dbconfig/20240502-145609-marostegui.json
  • 14:56 elukey@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore2004*: Move to PKI Truststore - elukey@cumin1002
  • 14:55 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 14:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
  • 14:50 elukey@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore2004*: Move to PKI Truststore - elukey@cumin1002
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61732 and previous config saved to /var/cache/conftool/dbconfig/20240502-144356-root.json
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61731 and previous config saved to /var/cache/conftool/dbconfig/20240502-144350-root.json
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61730 and previous config saved to /var/cache/conftool/dbconfig/20240502-144300-root.json
  • 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow7001.magru.wmnet with reason: host reimage
  • 14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P61729 and previous config saved to /var/cache/conftool/dbconfig/20240502-144101-marostegui.json
  • 14:38 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow7001.magru.wmnet with reason: host reimage
  • 14:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61728 and previous config saved to /var/cache/conftool/dbconfig/20240502-142850-root.json
  • 14:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61727 and previous config saved to /var/cache/conftool/dbconfig/20240502-142844-root.json
  • 14:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61726 and previous config saved to /var/cache/conftool/dbconfig/20240502-142754-root.json
  • 14:26 hnowlan@deploy1002: Finished scap: (no justification provided) (duration: 03m 16s)
  • 14:26 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:26 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:26 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:26 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P61725 and previous config saved to /var/cache/conftool/dbconfig/20240502-142554-marostegui.json
  • 14:25 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:25 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:23 hnowlan@deploy1002: Started scap: (no justification provided)
  • 14:22 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.3 refs T361397
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61724 and previous config saved to /var/cache/conftool/dbconfig/20240502-141344-root.json
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61723 and previous config saved to /var/cache/conftool/dbconfig/20240502-141339-root.json
  • 14:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61722 and previous config saved to /var/cache/conftool/dbconfig/20240502-141248-root.json
  • 14:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host netflow7001.magru.wmnet with OS bookworm
  • 14:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T361627)', diff saved to https://phabricator.wikimedia.org/P61721 and previous config saved to /var/cache/conftool/dbconfig/20240502-141046-marostegui.json
  • 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow7001.magru.wmnet - jmm@cumin2002"
  • 14:07 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow7001.magru.wmnet - jmm@cumin2002"
  • 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow7001.magru.wmnet on all recursors
  • 14:07 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow7001.magru.wmnet on all recursors
  • 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow7001.magru.wmnet - jmm@cumin2002"
  • 14:04 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1371.eqiad.wmnet|mw1399.eqiad.wmnet|mw1405.eqiad.wmnet|mw1409.eqiad.wmnet|mw1435.eqiad.wmnet),cluster=kubernetes,service=kubesvc
  • 14:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow7001.magru.wmnet - jmm@cumin2002"
  • 13:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1160 (T361627)', diff saved to https://phabricator.wikimedia.org/P61720 and previous config saved to /var/cache/conftool/dbconfig/20240502-135947-marostegui.json
  • 13:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 13:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61719 and previous config saved to /var/cache/conftool/dbconfig/20240502-135839-root.json
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61718 and previous config saved to /var/cache/conftool/dbconfig/20240502-135833-root.json
  • 13:58 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:58 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61717 and previous config saved to /var/cache/conftool/dbconfig/20240502-135743-root.json
  • 13:57 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 13:57 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 13:56 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:56 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow7001.magru.wmnet
  • 13:54 hnowlan: running homer 'cr*eqiad*' commit for new kubernetes workers
  • 13:53 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7003.magru.wmnet to cluster magru01 and group B3
  • 13:53 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:52 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 13:52 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti7003.magru.wmnet to cluster magru01 and group B3
  • 13:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 13:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 13:50 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:50 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 13:43 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7003.magru.wmnet to cluster magru01 and group B3
  • 13:43 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti7003.magru.wmnet to cluster magru01 and group B3
  • 13:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61716 and previous config saved to /var/cache/conftool/dbconfig/20240502-134333-root.json
  • 13:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61715 and previous config saved to /var/cache/conftool/dbconfig/20240502-134328-root.json
  • 13:42 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 13:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61714 and previous config saved to /var/cache/conftool/dbconfig/20240502-134237-root.json
  • 13:42 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 13:41 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 13:40 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 13:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1175 db1189', diff saved to https://phabricator.wikimedia.org/P61713 and previous config saved to /var/cache/conftool/dbconfig/20240502-134050-root.json
  • 13:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 13:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T361627)', diff saved to https://phabricator.wikimedia.org/P61712 and previous config saved to /var/cache/conftool/dbconfig/20240502-133420-marostegui.json
  • 13:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
  • 13:32 sukhe: running authdns-update to revert magru text geomap
  • 13:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61711 and previous config saved to /var/cache/conftool/dbconfig/20240502-132731-root.json
  • 13:24 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:24 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
  • 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P61710 and previous config saved to /var/cache/conftool/dbconfig/20240502-131912-marostegui.json
  • 13:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61709 and previous config saved to /var/cache/conftool/dbconfig/20240502-131225-root.json
  • 13:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS bookworm
  • 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P61708 and previous config saved to /var/cache/conftool/dbconfig/20240502-130404-marostegui.json
  • 13:02 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 12:57 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 12:49 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 12:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T361627)', diff saved to https://phabricator.wikimedia.org/P61707 and previous config saved to /var/cache/conftool/dbconfig/20240502-124857-marostegui.json
  • 12:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
  • 12:26 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm
  • 12:25 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2161.codfw.wmnet with OS bookworm
  • 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61704 and previous config saved to /var/cache/conftool/dbconfig/20240502-122409-marostegui.json
  • 12:22 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 12:20 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 12:19 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm
  • 12:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2161', diff saved to https://phabricator.wikimedia.org/P61703 and previous config saved to /var/cache/conftool/dbconfig/20240502-121759-root.json
  • 12:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1230.eqiad.wmnet
  • 12:15 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61702 and previous config saved to /var/cache/conftool/dbconfig/20240502-120901-marostegui.json
  • 12:02 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 12:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1399.eqiad.wmnet with OS bullseye
  • 11:57 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 11:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1435.eqiad.wmnet with OS bullseye
  • 11:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
  • 11:56 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1230.eqiad.wmnet
  • 11:55 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1405.eqiad.wmnet with OS bullseye
  • 11:55 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 11:54 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1213.eqiad.wmnet
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T361627)', diff saved to https://phabricator.wikimedia.org/P61701 and previous config saved to /var/cache/conftool/dbconfig/20240502-115353-marostegui.json
  • 11:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1409.eqiad.wmnet with OS bullseye
  • 11:53 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 11:51 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1371.eqiad.wmnet with OS bullseye
  • 11:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2210 (T361627)', diff saved to https://phabricator.wikimedia.org/P61700 and previous config saved to /var/cache/conftool/dbconfig/20240502-114448-marostegui.json
  • 11:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance
  • 11:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T361627)', diff saved to https://phabricator.wikimedia.org/P61699 and previous config saved to /var/cache/conftool/dbconfig/20240502-114425-marostegui.json
  • 11:43 elukey: depool LiftWing's codfw services from traffic to move all MW API calls to mw-api-int-ro
  • 11:43 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1399.eqiad.wmnet with reason: host reimage
  • 11:42 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1213.eqiad.wmnet
  • 11:42 elukey@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=inference,name=codfw
  • 11:41 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:41 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet on all recursors
  • 11:41 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet on all recursors
  • 11:40 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1435.eqiad.wmnet with reason: host reimage
  • 11:37 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1405.eqiad.wmnet with reason: host reimage
  • 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1210.eqiad.wmnet
  • 11:35 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1409.eqiad.wmnet with reason: host reimage
  • 11:35 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1405.eqiad.wmnet with reason: host reimage
  • 11:34 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1399.eqiad.wmnet with reason: host reimage
  • 11:34 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1435.eqiad.wmnet with reason: host reimage
  • 11:32 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1371.eqiad.wmnet with reason: host reimage
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1409.eqiad.wmnet with reason: host reimage
  • 11:29 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1371.eqiad.wmnet with reason: host reimage
  • 11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P61698 and previous config saved to /var/cache/conftool/dbconfig/20240502-112918-marostegui.json
  • 11:25 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1210.eqiad.wmnet
  • 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1185.eqiad.wmnet
  • 11:21 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1405.eqiad.wmnet with OS bullseye
  • 11:21 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1399.eqiad.wmnet with OS bullseye
  • 11:21 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1435.eqiad.wmnet with OS bullseye
  • 11:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1409.eqiad.wmnet with OS bullseye
  • 11:15 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1371.eqiad.wmnet with OS bullseye
  • 11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P61697 and previous config saved to /var/cache/conftool/dbconfig/20240502-111410-marostegui.json
  • 11:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1185.eqiad.wmnet
  • 11:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet on all recursors
  • 11:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet on all recursors
  • 11:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet. on all recursors
  • 11:07 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet. on all recursors
  • 11:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet on all recursors
  • 11:07 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet on all recursors
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:05 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1183.eqiad.wmnet
  • 10:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T361627)', diff saved to https://phabricator.wikimedia.org/P61696 and previous config saved to /var/cache/conftool/dbconfig/20240502-105903-marostegui.json
  • 10:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61695 and previous config saved to /var/cache/conftool/dbconfig/20240502-105530-root.json
  • 10:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1183.eqiad.wmnet
  • 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2206 (T361627)', diff saved to https://phabricator.wikimedia.org/P61694 and previous config saved to /var/cache/conftool/dbconfig/20240502-104658-marostegui.json
  • 10:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2206.codfw.wmnet with reason: Maintenance
  • 10:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2206.codfw.wmnet with reason: Maintenance
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61693 and previous config saved to /var/cache/conftool/dbconfig/20240502-104024-root.json
  • 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new VIP for ganeti01/magru - jmm@cumin2002"
  • 10:37 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new VIP for ganeti01/magru - jmm@cumin2002"
  • 10:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2213.codfw.wmnet
  • 10:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2199.codfw.wmnet with reason: Maintenance
  • 10:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2199.codfw.wmnet with reason: Maintenance
  • 10:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T361627)', diff saved to https://phabricator.wikimedia.org/P61692 and previous config saved to /var/cache/conftool/dbconfig/20240502-103601-marostegui.json
  • 10:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 10:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61691 and previous config saved to /var/cache/conftool/dbconfig/20240502-102518-root.json
  • 10:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2213.codfw.wmnet
  • 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P61690 and previous config saved to /var/cache/conftool/dbconfig/20240502-102053-marostegui.json
  • 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
  • 10:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2211.codfw.wmnet
  • 10:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
  • 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61689 and previous config saved to /var/cache/conftool/dbconfig/20240502-101012-root.json
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P61688 and previous config saved to /var/cache/conftool/dbconfig/20240502-100546-marostegui.json
  • 10:00 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1005.eqiad.wmnet with OS bookworm
  • 09:58 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2211.codfw.wmnet
  • 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2192.codfw.wmnet
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61687 and previous config saved to /var/cache/conftool/dbconfig/20240502-095506-root.json
  • 09:54 moritzm: installing util-linux security updates
  • 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T361627)', diff saved to https://phabricator.wikimedia.org/P61686 and previous config saved to /var/cache/conftool/dbconfig/20240502-095038-marostegui.json
  • 09:50 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2192.codfw.wmnet
  • 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2178.codfw.wmnet
  • 09:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61685 and previous config saved to /var/cache/conftool/dbconfig/20240502-094000-root.json
  • 09:38 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2382.codfw.wmnet with reason: Degraded RAID/storage controller issues
  • 09:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2179 (T361627)', diff saved to https://phabricator.wikimedia.org/P61684 and previous config saved to /var/cache/conftool/dbconfig/20240502-093827-marostegui.json
  • 09:38 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2382.codfw.wmnet with reason: Degraded RAID/storage controller issues
  • 09:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 09:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 09:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T361627)', diff saved to https://phabricator.wikimedia.org/P61683 and previous config saved to /var/cache/conftool/dbconfig/20240502-093803-marostegui.json
  • 09:35 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1005.eqiad.wmnet with reason: host reimage
  • 09:32 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1005.eqiad.wmnet with reason: host reimage
  • 09:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2152.codfw.wmnet with OS bookworm
  • 09:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2178.codfw.wmnet
  • 09:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61682 and previous config saved to /var/cache/conftool/dbconfig/20240502-092454-root.json
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P61681 and previous config saved to /var/cache/conftool/dbconfig/20240502-092256-marostegui.json
  • 09:18 hnowlan: depooling 5 appservers in advance of migrating them to k8s workers
  • 09:18 stevemunene@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
  • 09:13 stevemunene@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: sync on main
  • 09:13 stevemunene@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
  • 09:12 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1005.eqiad.wmnet with OS bookworm
  • 09:10 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cephosd1005.eqiad.wmnet with OS bookworm
  • 09:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2152.codfw.wmnet with reason: host reimage
  • 09:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2171.codfw.wmnet
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P61680 and previous config saved to /var/cache/conftool/dbconfig/20240502-090748-marostegui.json
  • 09:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2152.codfw.wmnet with reason: host reimage
  • 09:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 09:02 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 09:02 stevemunene@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: sync on main
  • 08:59 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2171.codfw.wmnet
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T361627)', diff saved to https://phabricator.wikimedia.org/P61679 and previous config saved to /var/cache/conftool/dbconfig/20240502-085241-marostegui.json
  • 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2157.codfw.wmnet
  • 08:49 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2152.codfw.wmnet with OS bookworm
  • 08:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T361627)', diff saved to https://phabricator.wikimedia.org/P61677 and previous config saved to /var/cache/conftool/dbconfig/20240502-084041-marostegui.json
  • 08:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 08:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 08:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T361627)', diff saved to https://phabricator.wikimedia.org/P61676 and previous config saved to /var/cache/conftool/dbconfig/20240502-084018-marostegui.json
  • 08:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P61675 and previous config saved to /var/cache/conftool/dbconfig/20240502-082510-marostegui.json
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P61674 and previous config saved to /var/cache/conftool/dbconfig/20240502-081002-marostegui.json
  • 08:08 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2157.codfw.wmnet
  • 08:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2123.codfw.wmnet
  • 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-public
  • 07:56 brouberol@cumin1002: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:datahubsearch
  • 07:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T361627)', diff saved to https://phabricator.wikimedia.org/P61673 and previous config saved to /var/cache/conftool/dbconfig/20240502-075455-marostegui.json
  • 07:48 brouberol@cumin1002: START - Cookbook sre.opensearch.roll-restart-reboot rolling restart_daemons on A:datahubsearch
  • 07:47 moritzm: installing Java 8 security updates
  • 07:47 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-public
  • 07:44 volans@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) netbox to netbox2002.codfw.wmnet,netbox1002.eqiad.wmnet with reason: Update Netbox dependencies for netbox - volans@cumin1002
  • 07:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T361627)', diff saved to https://phabricator.wikimedia.org/P61672 and previous config saved to /var/cache/conftool/dbconfig/20240502-074400-marostegui.json
  • 07:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T361627)', diff saved to https://phabricator.wikimedia.org/P61671 and previous config saved to /var/cache/conftool/dbconfig/20240502-074320-marostegui.json
  • 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-internal
  • 07:40 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2123.codfw.wmnet
  • 07:38 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-internal
  • 07:38 volans@cumin1002: START - Cookbook sre.deploy.python-code netbox to netbox2002.codfw.wmnet,netbox1002.eqiad.wmnet with reason: Update Netbox dependencies for netbox - volans@cumin1002
  • 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P61670 and previous config saved to /var/cache/conftool/dbconfig/20240502-072813-marostegui.json
  • 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-test
  • 07:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P61669 and previous config saved to /var/cache/conftool/dbconfig/20240502-071305-marostegui.json
  • 07:13 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-test
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T361627)', diff saved to https://phabricator.wikimedia.org/P61668 and previous config saved to /var/cache/conftool/dbconfig/20240502-065758-marostegui.json
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T361627)', diff saved to https://phabricator.wikimedia.org/P61667 and previous config saved to /var/cache/conftool/dbconfig/20240502-064533-marostegui.json
  • 06:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 06:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61666 and previous config saved to /var/cache/conftool/dbconfig/20240502-064230-root.json
  • 06:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 06:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 06:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137 (T361627)', diff saved to https://phabricator.wikimedia.org/P61665 and previous config saved to /var/cache/conftool/dbconfig/20240502-063343-marostegui.json
  • 06:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61664 and previous config saved to /var/cache/conftool/dbconfig/20240502-062725-root.json
  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137', diff saved to https://phabricator.wikimedia.org/P61663 and previous config saved to /var/cache/conftool/dbconfig/20240502-061836-marostegui.json
  • 06:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61662 and previous config saved to /var/cache/conftool/dbconfig/20240502-061218-root.json
  • 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137', diff saved to https://phabricator.wikimedia.org/P61661 and previous config saved to /var/cache/conftool/dbconfig/20240502-060328-marostegui.json
  • 05:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61660 and previous config saved to /var/cache/conftool/dbconfig/20240502-055712-root.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137 (T361627)', diff saved to https://phabricator.wikimedia.org/P61659 and previous config saved to /var/cache/conftool/dbconfig/20240502-054821-marostegui.json
  • 05:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61658 and previous config saved to /var/cache/conftool/dbconfig/20240502-054206-root.json
  • 05:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2137 (T361627)', diff saved to https://phabricator.wikimedia.org/P61657 and previous config saved to /var/cache/conftool/dbconfig/20240502-053717-marostegui.json
  • 05:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 05:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 05:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T361627)', diff saved to https://phabricator.wikimedia.org/P61656 and previous config saved to /var/cache/conftool/dbconfig/20240502-053654-marostegui.json
  • 05:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS bookworm
  • 05:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61655 and previous config saved to /var/cache/conftool/dbconfig/20240502-052700-root.json
  • 05:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P61654 and previous config saved to /var/cache/conftool/dbconfig/20240502-052146-marostegui.json
  • 05:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2162.codfw.wmnet with OS bookworm
  • 05:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61653 and previous config saved to /var/cache/conftool/dbconfig/20240502-051155-root.json
  • 05:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
  • 05:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P61652 and previous config saved to /var/cache/conftool/dbconfig/20240502-050639-marostegui.json
  • 05:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
  • 04:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
  • 04:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
  • 04:52 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS bookworm
  • 04:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T361627)', diff saved to https://phabricator.wikimedia.org/P61651 and previous config saved to /var/cache/conftool/dbconfig/20240502-045131-marostegui.json
  • 04:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1181 T363892', diff saved to https://phabricator.wikimedia.org/P61650 and previous config saved to /var/cache/conftool/dbconfig/20240502-045017-root.json
  • 04:48 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write T363892', diff saved to https://phabricator.wikimedia.org/P61649 and previous config saved to /var/cache/conftool/dbconfig/20240502-044848-marostegui.json
  • 04:48 marostegui@cumin1002: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - T363892', diff saved to https://phabricator.wikimedia.org/P61648 and previous config saved to /var/cache/conftool/dbconfig/20240502-044819-marostegui.json
  • 04:48 marostegui: Starting s7 eqiad failover from db1181 to db1236 - T363892
  • 04:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2136 (T361627)', diff saved to https://phabricator.wikimedia.org/P61647 and previous config saved to /var/cache/conftool/dbconfig/20240502-044020-marostegui.json
  • 04:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 04:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 04:35 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2162.codfw.wmnet with OS bookworm
  • 04:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: Reimage
  • 04:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: Reimage
  • 04:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2162', diff saved to https://phabricator.wikimedia.org/P61646 and previous config saved to /var/cache/conftool/dbconfig/20240502-043403-root.json
  • 04:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 T363892
  • 04:30 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1236 with weight 0 T363892', diff saved to https://phabricator.wikimedia.org/P61645 and previous config saved to /var/cache/conftool/dbconfig/20240502-043019-marostegui.json
  • 04:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s7 T363892
  • 04:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 04:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 04:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 04:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1181.eqiad.wmnet with reason: Maintenance

2024-05-01

  • 23:57 eileen: civicrm upgraded from 3ac4043c to 80ae4543
  • 21:37 eileen: config revision changed from 36b287b6 to b772c8bc
  • 20:22 jdrewniak@deploy1002: Finished scap: Backport for [Vector] Enable appearance menu and increased font-size on testwiki (T362147) (duration: 19m 29s)
  • 20:10 jdrewniak@deploy1002: jdlrobson and jdrewniak: Continuing with sync
  • 20:08 jdrewniak@deploy1002: jdlrobson and jdrewniak: Backport for [Vector] Enable appearance menu and increased font-size on testwiki (T362147) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:03 jdrewniak@deploy1002: Started scap: Backport for [Vector] Enable appearance menu and increased font-size on testwiki (T362147)
  • 19:40 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns7002.wikimedia.org with OS bookworm
  • 19:40 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 19:39 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 19:12 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
  • 19:09 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
  • 18:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 18:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 18:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T361627)', diff saved to https://phabricator.wikimedia.org/P61644 and previous config saved to /var/cache/conftool/dbconfig/20240501-185521-marostegui.json
  • 18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P61643 and previous config saved to /var/cache/conftool/dbconfig/20240501-184013-marostegui.json
  • 18:36 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS bookworm
  • 18:36 sukhe@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7002.wikimedia.org with OS bookworm
  • 18:36 sukhe@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dns7002.magru.wmnet']
  • 18:35 sukhe@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns7002.magru.wmnet']
  • 18:35 sukhe@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dns7002.magru.wmnet']
  • 18:35 sukhe@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns7002.magru.wmnet']
  • 18:28 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1005.eqiad.wmnet with OS bookworm
  • 18:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P61642 and previous config saved to /var/cache/conftool/dbconfig/20240501-182505-marostegui.json
  • 18:16 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS bookworm
  • 18:15 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns7001.wikimedia.org with OS bookworm
  • 18:15 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 18:14 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 18:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T361627)', diff saved to https://phabricator.wikimedia.org/P61641 and previous config saved to /var/cache/conftool/dbconfig/20240501-180958-marostegui.json
  • 18:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T361627)', diff saved to https://phabricator.wikimedia.org/P61640 and previous config saved to /var/cache/conftool/dbconfig/20240501-180645-marostegui.json
  • 18:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 18:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 18:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T361627)', diff saved to https://phabricator.wikimedia.org/P61639 and previous config saved to /var/cache/conftool/dbconfig/20240501-180622-marostegui.json
  • 18:03 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1004.eqiad.wmnet with OS bookworm
  • 17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P61638 and previous config saved to /var/cache/conftool/dbconfig/20240501-175114-marostegui.json
  • 17:49 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7001.wikimedia.org with reason: host reimage
  • 17:46 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7001.wikimedia.org with reason: host reimage
  • 17:38 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1004.eqiad.wmnet with reason: host reimage
  • 17:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P61637 and previous config saved to /var/cache/conftool/dbconfig/20240501-173607-marostegui.json
  • 17:35 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1004.eqiad.wmnet with reason: host reimage
  • 17:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T361627)', diff saved to https://phabricator.wikimedia.org/P61636 and previous config saved to /var/cache/conftool/dbconfig/20240501-172059-marostegui.json
  • 17:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T361627)', diff saved to https://phabricator.wikimedia.org/P61635 and previous config saved to /var/cache/conftool/dbconfig/20240501-171527-marostegui.json
  • 17:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 17:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 17:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T361627)', diff saved to https://phabricator.wikimedia.org/P61634 and previous config saved to /var/cache/conftool/dbconfig/20240501-171504-marostegui.json
  • 17:14 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1004.eqiad.wmnet with OS bookworm
  • 17:12 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host dns7001.wikimedia.org with OS bookworm
  • 17:02 sukhe: sudo cumin -b1 -s10 "A:dnsbox" "run-puppet-agent --enable 'merging CR 1026166'"
  • 16:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P61633 and previous config saved to /var/cache/conftool/dbconfig/20240501-165957-marostegui.json
  • 16:59 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1003.eqiad.wmnet with OS bookworm
  • 16:59 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org
  • 16:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P61632 and previous config saved to /var/cache/conftool/dbconfig/20240501-164450-marostegui.json
  • 16:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns6001.wikimedia.org
  • 16:43 sukhe: sudo cumin "A:dnsbox" "disable-puppet 'merging CR 1026166'"
  • 16:34 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1003.eqiad.wmnet with reason: host reimage
  • 16:31 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1003.eqiad.wmnet with reason: host reimage
  • 16:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T361627)', diff saved to https://phabricator.wikimedia.org/P61630 and previous config saved to /var/cache/conftool/dbconfig/20240501-162942-marostegui.json
  • 16:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T361627)', diff saved to https://phabricator.wikimedia.org/P61629 and previous config saved to /var/cache/conftool/dbconfig/20240501-162629-marostegui.json
  • 16:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 16:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 16:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T361627)', diff saved to https://phabricator.wikimedia.org/P61628 and previous config saved to /var/cache/conftool/dbconfig/20240501-162607-marostegui.json
  • 16:11 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1003.eqiad.wmnet with OS bookworm
  • 16:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P61627 and previous config saved to /var/cache/conftool/dbconfig/20240501-161059-marostegui.json
  • 16:10 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cephosd1003.eqiad.wmnet with OS bookworm
  • 16:01 milimetric@deploy1002: Finished deploy [airflow-dags/analytics@09b4f5f]: Testing different settings for mediawiki_history_shapshot_config (duration: 00m 28s)
  • 16:00 milimetric@deploy1002: Started deploy [airflow-dags/analytics@09b4f5f]: Testing different settings for mediawiki_history_shapshot_config
  • 15:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P61626 and previous config saved to /var/cache/conftool/dbconfig/20240501-155552-marostegui.json
  • 15:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T361627)', diff saved to https://phabricator.wikimedia.org/P61625 and previous config saved to /var/cache/conftool/dbconfig/20240501-154042-marostegui.json
  • 15:39 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1003.eqiad.wmnet with OS bookworm
  • 15:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T361627)', diff saved to https://phabricator.wikimedia.org/P61624 and previous config saved to /var/cache/conftool/dbconfig/20240501-153829-marostegui.json
  • 15:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 15:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 15:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T361627)', diff saved to https://phabricator.wikimedia.org/P61623 and previous config saved to /var/cache/conftool/dbconfig/20240501-153806-marostegui.json
  • 15:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P61622 and previous config saved to /var/cache/conftool/dbconfig/20240501-152259-marostegui.json
  • 15:22 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.43.0-wmf.3 refs T361397
  • 15:15 sukhe@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7001.wikimedia.org with OS bookworm
  • 15:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P61621 and previous config saved to /var/cache/conftool/dbconfig/20240501-150751-marostegui.json
  • 14:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T361627)', diff saved to https://phabricator.wikimedia.org/P61620 and previous config saved to /var/cache/conftool/dbconfig/20240501-145243-marostegui.json
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T361627)', diff saved to https://phabricator.wikimedia.org/P61619 and previous config saved to /var/cache/conftool/dbconfig/20240501-145131-marostegui.json
  • 14:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 14:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61618 and previous config saved to /var/cache/conftool/dbconfig/20240501-145108-marostegui.json
  • 14:43 dancy@deploy1002: Installation of scap version "4.81.0" completed for 325 hosts
  • 14:42 dancy@deploy1002: Installing scap version "4.81.0" for 325 hosts
  • 14:36 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1002.eqiad.wmnet with OS bookworm
  • 14:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P61617 and previous config saved to /var/cache/conftool/dbconfig/20240501-143601-marostegui.json
  • 14:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61616 and previous config saved to /var/cache/conftool/dbconfig/20240501-142233-root.json
  • 14:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P61615 and previous config saved to /var/cache/conftool/dbconfig/20240501-142053-marostegui.json
  • 14:12 bking@deploy1002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:11 bking@deploy1002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1002.eqiad.wmnet with reason: host reimage
  • 14:08 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1002.eqiad.wmnet with reason: host reimage
  • 14:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61614 and previous config saved to /var/cache/conftool/dbconfig/20240501-140728-root.json
  • 14:05 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7001.wikimedia.org with reason: host reimage
  • 14:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61613 and previous config saved to /var/cache/conftool/dbconfig/20240501-140545-marostegui.json
  • 14:03 bking@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61612 and previous config saved to /var/cache/conftool/dbconfig/20240501-140333-marostegui.json
  • 14:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 14:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 14:03 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7001.wikimedia.org with reason: host reimage
  • 14:03 bking@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 13:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 13:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 13:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61611 and previous config saved to /var/cache/conftool/dbconfig/20240501-135915-marostegui.json
  • 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61610 and previous config saved to /var/cache/conftool/dbconfig/20240501-135222-root.json
  • 13:47 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1002.eqiad.wmnet with OS bookworm
  • 13:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P61609 and previous config saved to /var/cache/conftool/dbconfig/20240501-134407-marostegui.json
  • 13:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61608 and previous config saved to /var/cache/conftool/dbconfig/20240501-133717-root.json
  • 13:33 Amir1: promoting HNowlan (WMF) to admin in testwiki
  • 13:29 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host dns7001.wikimedia.org with OS bookworm
  • 13:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P61607 and previous config saved to /var/cache/conftool/dbconfig/20240501-132900-marostegui.json
  • 13:25 sukhe: running authdns-update for CR 1026119: depool magru text*
  • 13:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61606 and previous config saved to /var/cache/conftool/dbconfig/20240501-132211-root.json
  • 13:15 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1001.eqiad.wmnet with OS bookworm
  • 13:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61605 and previous config saved to /var/cache/conftool/dbconfig/20240501-131351-marostegui.json
  • 13:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61604 and previous config saved to /var/cache/conftool/dbconfig/20240501-130822-marostegui.json
  • 13:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 13:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T361627)', diff saved to https://phabricator.wikimedia.org/P61603 and previous config saved to /var/cache/conftool/dbconfig/20240501-130747-marostegui.json
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61602 and previous config saved to /var/cache/conftool/dbconfig/20240501-130704-root.json
  • 12:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS bookworm
  • 12:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P61601 and previous config saved to /var/cache/conftool/dbconfig/20240501-125239-marostegui.json
  • 12:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61600 and previous config saved to /var/cache/conftool/dbconfig/20240501-125158-root.json
  • 12:48 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1001.eqiad.wmnet with reason: host reimage
  • 12:45 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1001.eqiad.wmnet with reason: host reimage
  • 12:24 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1001.eqiad.wmnet with OS bookworm
  • 12:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T361627)', diff saved to https://phabricator.wikimedia.org/P61598 and previous config saved to /var/cache/conftool/dbconfig/20240501-122224-marostegui.json
  • 12:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T361627)', diff saved to https://phabricator.wikimedia.org/P61597 and previous config saved to /var/cache/conftool/dbconfig/20240501-122012-marostegui.json
  • 12:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 12:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 12:15 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS bookworm
  • 12:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 12:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 12:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2154', diff saved to https://phabricator.wikimedia.org/P61596 and previous config saved to /var/cache/conftool/dbconfig/20240501-121347-root.json
  • 12:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61595 and previous config saved to /var/cache/conftool/dbconfig/20240501-120833-root.json
  • 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T361627)', diff saved to https://phabricator.wikimedia.org/P61594 and previous config saved to /var/cache/conftool/dbconfig/20240501-115915-marostegui.json
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61593 and previous config saved to /var/cache/conftool/dbconfig/20240501-115327-root.json
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P61592 and previous config saved to /var/cache/conftool/dbconfig/20240501-114408-marostegui.json
  • 11:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61591 and previous config saved to /var/cache/conftool/dbconfig/20240501-113821-root.json
  • 11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P61590 and previous config saved to /var/cache/conftool/dbconfig/20240501-112900-marostegui.json
  • 11:24 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs7003.magru.wmnet with OS bullseye
  • 11:24 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 11:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61589 and previous config saved to /var/cache/conftool/dbconfig/20240501-112315-root.json
  • 11:22 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 11:17 sukhe@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs7002.magru.wmnet
  • 11:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T361627)', diff saved to https://phabricator.wikimedia.org/P61588 and previous config saved to /var/cache/conftool/dbconfig/20240501-111353-marostegui.json
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2220 (T361627)', diff saved to https://phabricator.wikimedia.org/P61587 and previous config saved to /var/cache/conftool/dbconfig/20240501-110834-marostegui.json
  • 11:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 11:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T361627)', diff saved to https://phabricator.wikimedia.org/P61586 and previous config saved to /var/cache/conftool/dbconfig/20240501-110822-marostegui.json
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61585 and previous config saved to /var/cache/conftool/dbconfig/20240501-110809-root.json
  • 11:07 sukhe@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs7001.magru.wmnet
  • 11:05 sukhe@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs7002.magru.wmnet
  • 10:58 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs7003.magru.wmnet with reason: host reimage
  • 10:55 sukhe@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs7001.magru.wmnet
  • 10:55 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs7003.magru.wmnet with reason: host reimage
  • 10:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P61584 and previous config saved to /var/cache/conftool/dbconfig/20240501-105315-marostegui.json
  • 10:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61583 and previous config saved to /var/cache/conftool/dbconfig/20240501-105304-root.json
  • 10:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS bookworm
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P61582 and previous config saved to /var/cache/conftool/dbconfig/20240501-103801-marostegui.json
  • 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61581 and previous config saved to /var/cache/conftool/dbconfig/20240501-103758-root.json
  • 10:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 100%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61580 and previous config saved to /var/cache/conftool/dbconfig/20240501-103338-arnaudb.json
  • 10:30 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host lvs7003.magru.wmnet with OS bullseye
  • 10:30 sukhe@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs7003.magru.wmnet with OS bullseye
  • 10:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Down with HW issues
  • 10:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Down with HW issues
  • 10:28 sukhe@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs7003.magru.wmnet']
  • 10:27 sukhe@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs7003.magru.wmnet']
  • 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T361627)', diff saved to https://phabricator.wikimedia.org/P61579 and previous config saved to /var/cache/conftool/dbconfig/20240501-102253-marostegui.json
  • 10:22 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host lvs7003.magru.wmnet with OS bullseye
  • 10:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
  • 10:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 75%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61578 and previous config saved to /var/cache/conftool/dbconfig/20240501-101832-arnaudb.json
  • 10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2208 (T361627)', diff saved to https://phabricator.wikimedia.org/P61577 and previous config saved to /var/cache/conftool/dbconfig/20240501-101728-marostegui.json
  • 10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 10:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 10:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61576 and previous config saved to /var/cache/conftool/dbconfig/20240501-101151-root.json
  • 10:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 10:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T361627)', diff saved to https://phabricator.wikimedia.org/P61575 and previous config saved to /var/cache/conftool/dbconfig/20240501-100650-marostegui.json
  • 10:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 50%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61574 and previous config saved to /var/cache/conftool/dbconfig/20240501-100326-arnaudb.json
  • 10:00 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS bookworm
  • 09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2163', diff saved to https://phabricator.wikimedia.org/P61573 and previous config saved to /var/cache/conftool/dbconfig/20240501-095845-root.json
  • 09:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61572 and previous config saved to /var/cache/conftool/dbconfig/20240501-095646-root.json
  • 09:52 topranks: restarting routinator service on rpki1001
  • 09:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P61571 and previous config saved to /var/cache/conftool/dbconfig/20240501-095142-marostegui.json
  • 09:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 25%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61570 and previous config saved to /var/cache/conftool/dbconfig/20240501-094821-arnaudb.json
  • 09:42 marostegui@deploy1002: Finished scap: Backport for etcd.php: Add es7 (T355285 T355424) (duration: 14m 53s)
  • 09:41 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61569 and previous config saved to /var/cache/conftool/dbconfig/20240501-094140-root.json
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P61568 and previous config saved to /var/cache/conftool/dbconfig/20240501-093635-marostegui.json
  • 09:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 15%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61567 and previous config saved to /var/cache/conftool/dbconfig/20240501-093315-arnaudb.json
  • 09:30 marostegui@deploy1002: marostegui: Continuing with sync
  • 09:30 marostegui@deploy1002: marostegui: Backport for etcd.php: Add es7 (T355285 T355424) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:27 marostegui@deploy1002: Started scap: Backport for etcd.php: Add es7 (T355285 T355424)
  • 09:26 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61566 and previous config saved to /var/cache/conftool/dbconfig/20240501-092634-root.json
  • 09:22 topranks: withdrawing public prefix announcement to AS7195 to test backup in magru (T362421)
  • 09:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T361627)', diff saved to https://phabricator.wikimedia.org/P61565 and previous config saved to /var/cache/conftool/dbconfig/20240501-092125-marostegui.json
  • 09:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 10%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61564 and previous config saved to /var/cache/conftool/dbconfig/20240501-091809-arnaudb.json
  • 09:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T361627)', diff saved to https://phabricator.wikimedia.org/P61563 and previous config saved to /var/cache/conftool/dbconfig/20240501-091513-marostegui.json
  • 09:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 09:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 09:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T361627)', diff saved to https://phabricator.wikimedia.org/P61562 and previous config saved to /var/cache/conftool/dbconfig/20240501-091451-marostegui.json
  • 09:13 marostegui@cumin1002: dbctl commit (dc=all): 'Push es7 codfw config T355424', diff saved to https://phabricator.wikimedia.org/P61561 and previous config saved to /var/cache/conftool/dbconfig/20240501-091352-marostegui.json
  • 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61560 and previous config saved to /var/cache/conftool/dbconfig/20240501-091128-root.json
  • 09:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 5%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61559 and previous config saved to /var/cache/conftool/dbconfig/20240501-090303-arnaudb.json
  • 08:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P61558 and previous config saved to /var/cache/conftool/dbconfig/20240501-085943-marostegui.json
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61557 and previous config saved to /var/cache/conftool/dbconfig/20240501-085622-root.json
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61556 and previous config saved to /var/cache/conftool/dbconfig/20240501-085223-root.json
  • 08:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P61555 and previous config saved to /var/cache/conftool/dbconfig/20240501-084436-marostegui.json
  • 08:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS bookworm
  • 08:41 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61554 and previous config saved to /var/cache/conftool/dbconfig/20240501-084116-root.json
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61553 and previous config saved to /var/cache/conftool/dbconfig/20240501-083717-root.json
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61552 and previous config saved to /var/cache/conftool/dbconfig/20240501-083641-root.json
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Push es7 eqiad config T355285', diff saved to https://phabricator.wikimedia.org/P61551 and previous config saved to /var/cache/conftool/dbconfig/20240501-083120-marostegui.json
  • 08:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T361627)', diff saved to https://phabricator.wikimedia.org/P61550 and previous config saved to /var/cache/conftool/dbconfig/20240501-082928-marostegui.json
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T361627)', diff saved to https://phabricator.wikimedia.org/P61549 and previous config saved to /var/cache/conftool/dbconfig/20240501-082357-marostegui.json
  • 08:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T361627)', diff saved to https://phabricator.wikimedia.org/P61548 and previous config saved to /var/cache/conftool/dbconfig/20240501-082334-marostegui.json
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61547 and previous config saved to /var/cache/conftool/dbconfig/20240501-082211-root.json
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61546 and previous config saved to /var/cache/conftool/dbconfig/20240501-082135-root.json
  • 08:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
  • 08:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P61545 and previous config saved to /var/cache/conftool/dbconfig/20240501-080827-marostegui.json
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61544 and previous config saved to /var/cache/conftool/dbconfig/20240501-080706-root.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61543 and previous config saved to /var/cache/conftool/dbconfig/20240501-080630-root.json
  • 08:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 08:05 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61542 and previous config saved to /var/cache/conftool/dbconfig/20240501-080354-root.json
  • 07:59 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS bookworm
  • 07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2164', diff saved to https://phabricator.wikimedia.org/P61541 and previous config saved to /var/cache/conftool/dbconfig/20240501-075614-root.json
  • 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P61540 and previous config saved to /var/cache/conftool/dbconfig/20240501-075320-marostegui.json
  • 07:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61539 and previous config saved to /var/cache/conftool/dbconfig/20240501-075200-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61538 and previous config saved to /var/cache/conftool/dbconfig/20240501-075124-root.json
  • 07:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61537 and previous config saved to /var/cache/conftool/dbconfig/20240501-074848-root.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T361627)', diff saved to https://phabricator.wikimedia.org/P61536 and previous config saved to /var/cache/conftool/dbconfig/20240501-073812-marostegui.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61535 and previous config saved to /var/cache/conftool/dbconfig/20240501-073655-root.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61534 and previous config saved to /var/cache/conftool/dbconfig/20240501-073615-root.json
  • 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61533 and previous config saved to /var/cache/conftool/dbconfig/20240501-073342-root.json
  • 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T361627)', diff saved to https://phabricator.wikimedia.org/P61532 and previous config saved to /var/cache/conftool/dbconfig/20240501-073201-marostegui.json
  • 07:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T361627)', diff saved to https://phabricator.wikimedia.org/P61531 and previous config saved to /var/cache/conftool/dbconfig/20240501-073123-marostegui.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61530 and previous config saved to /var/cache/conftool/dbconfig/20240501-072149-root.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61529 and previous config saved to /var/cache/conftool/dbconfig/20240501-072110-root.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61528 and previous config saved to /var/cache/conftool/dbconfig/20240501-071836-root.json
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P61527 and previous config saved to /var/cache/conftool/dbconfig/20240501-071615-marostegui.json
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61526 and previous config saved to /var/cache/conftool/dbconfig/20240501-070603-root.json
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61525 and previous config saved to /var/cache/conftool/dbconfig/20240501-070330-root.json
  • 07:02 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1186.eqiad.wmnet onto db1234.eqiad.wmnet
  • 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P61524 and previous config saved to /var/cache/conftool/dbconfig/20240501-070108-marostegui.json
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61523 and previous config saved to /var/cache/conftool/dbconfig/20240501-065845-root.json
  • 06:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61522 and previous config saved to /var/cache/conftool/dbconfig/20240501-064824-root.json
  • 06:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T361627)', diff saved to https://phabricator.wikimedia.org/P61521 and previous config saved to /var/cache/conftool/dbconfig/20240501-064600-marostegui.json
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61520 and previous config saved to /var/cache/conftool/dbconfig/20240501-064339-root.json
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T361627)', diff saved to https://phabricator.wikimedia.org/P61519 and previous config saved to /var/cache/conftool/dbconfig/20240501-063942-marostegui.json
  • 06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 06:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T361627)', diff saved to https://phabricator.wikimedia.org/P61518 and previous config saved to /var/cache/conftool/dbconfig/20240501-063919-marostegui.json
  • 06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS bookworm
  • 06:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61517 and previous config saved to /var/cache/conftool/dbconfig/20240501-063318-root.json
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61516 and previous config saved to /var/cache/conftool/dbconfig/20240501-062833-root.json
  • 06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P61515 and previous config saved to /var/cache/conftool/dbconfig/20240501-062407-marostegui.json
  • 06:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
  • 06:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61514 and previous config saved to /var/cache/conftool/dbconfig/20240501-061327-root.json
  • 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P61513 and previous config saved to /var/cache/conftool/dbconfig/20240501-060900-marostegui.json
  • 05:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61512 and previous config saved to /var/cache/conftool/dbconfig/20240501-055822-root.json
  • 05:58 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS bookworm
  • 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2166', diff saved to https://phabricator.wikimedia.org/P61511 and previous config saved to /var/cache/conftool/dbconfig/20240501-055657-root.json
  • 05:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T361627)', diff saved to https://phabricator.wikimedia.org/P61510 and previous config saved to /var/cache/conftool/dbconfig/20240501-055353-marostegui.json
  • 05:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T361627)', diff saved to https://phabricator.wikimedia.org/P61509 and previous config saved to /var/cache/conftool/dbconfig/20240501-054720-marostegui.json
  • 05:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 05:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 05:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T361627)', diff saved to https://phabricator.wikimedia.org/P61508 and previous config saved to /var/cache/conftool/dbconfig/20240501-054657-marostegui.json
  • 05:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61507 and previous config saved to /var/cache/conftool/dbconfig/20240501-054316-root.json
  • 05:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es[1035,1039-1040].eqiad.wmnet with reason: Setting up T355285 T355424
  • 05:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on es[1035,1039-1040].eqiad.wmnet with reason: Setting up T355285 T355424
  • 05:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 6 hosts with reason: Setting up T355285 T355424
  • 05:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 6 hosts with reason: Setting up T355285 T355424
  • 05:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P61506 and previous config saved to /var/cache/conftool/dbconfig/20240501-053149-marostegui.json
  • 05:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1234.eqiad.wmnet with OS bookworm
  • 05:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1236.eqiad.wmnet with OS bookworm
  • 05:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61505 and previous config saved to /var/cache/conftool/dbconfig/20240501-052810-root.json
  • 05:23 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db1186.eqiad.wmnet onto db1234.eqiad.wmnet
  • 05:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1186 to clone db1234 T363890', diff saved to https://phabricator.wikimedia.org/P61504 and previous config saved to /var/cache/conftool/dbconfig/20240501-051848-marostegui.json
  • 05:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P61503 and previous config saved to /var/cache/conftool/dbconfig/20240501-051642-marostegui.json
  • 05:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
  • 05:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
  • 05:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
  • 05:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Down with HW issues
  • 05:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Down with HW issues
  • 05:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
  • 05:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T361627)', diff saved to https://phabricator.wikimedia.org/P61502 and previous config saved to /var/cache/conftool/dbconfig/20240501-050135-marostegui.json
  • 04:57 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1236.eqiad.wmnet with OS bookworm
  • 04:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1236', diff saved to https://phabricator.wikimedia.org/P61501 and previous config saved to /var/cache/conftool/dbconfig/20240501-045624-marostegui.json
  • 04:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T361627)', diff saved to https://phabricator.wikimedia.org/P61500 and previous config saved to /var/cache/conftool/dbconfig/20240501-045517-marostegui.json
  • 04:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 04:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 04:54 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1234.eqiad.wmnet with OS bookworm
  • 04:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 02:31 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs7002.magru.wmnet with OS bullseye
  • 02:31 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 02:29 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 02:07 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs7002.magru.wmnet with reason: host reimage
  • 02:04 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs7002.magru.wmnet with reason: host reimage
  • 01:37 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host lvs7002.magru.wmnet with OS bullseye
  • 01:26 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs7001.magru.wmnet with OS bullseye
  • 01:26 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 01:25 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 01:02 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs7001.magru.wmnet with reason: host reimage
  • 00:58 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs7001.magru.wmnet with reason: host reimage
  • 00:33 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host lvs7001.magru.wmnet with OS bullseye
  • 00:23 xcollazo@deploy1002: Finished deploy [airflow-dags/analytics@b10376a]: (no justification provided) (duration: 00m 31s)
  • 00:22 xcollazo@deploy1002: Started deploy [airflow-dags/analytics@b10376a]: (no justification provided)
  • 00:05 eileen: civicrm upgraded from 393e1deb to 3ac4043

Archives

See Server Admin Log/Archives.