Server Admin Log

2024-05-04

  • 13:41 jayme: doubled the number of eventgate-main replicas in eqiad to 16
  • 07:39 taavi@cumin1002: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 07:33 taavi@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 03:07 denisse: Restarting `status curator_actions_cluster_wide.service` to log with DEBUG level on logstash2026 - T364190
  • 03:06 denisse: Enable log level DEBUG for curator on logstash2026 - T364190
  • 01:33 bblack@cumin1002: conftool action : set/weight=100; selector: name=dns7.*
  • 01:24 bblack: lvs7001 - restart pybal
  • 01:23 bblack: lvs7003 - restart pybal

2024-05-03

  • 21:38 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on wdqs2023.codfw.wmnet with reason: T362920
  • 21:38 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 6 days, 0:00:00 on wdqs2023.codfw.wmnet with reason: T362920
  • 21:27 ryankemper: T362920 [wdqs] Depooled `wdqs2023` in preparation to switch it to a graph split host
  • 19:02 sukhe: cleaning up stale confd template files for magru related reimaging
  • 18:44 brett@cumin2002: conftool action : set/pooled=yes; selector: name=ncredir7002.magru.wmnet,service=nginx
  • 18:43 brett@cumin2002: conftool action : set/pooled=yes; selector: name=ncredir7001.magru.wmnet,service=nginx
  • 18:38 brett@cumin2002: conftool action : set/pooled=no; selector: name=ncredir7001.magru.wmnet,service=nginx
  • 18:38 brett@cumin2002: conftool action : set/pooled=no; selector: name=ncredir7002.magru.wmnet,service=nginx
  • 18:29 brett@cumin2002: conftool action : set/pooled=yes; selector: name=ncredir7002.magru.wmnet,service=nginx
  • 18:29 brett@cumin2002: conftool action : set/weight=1; selector: name=ncredir7002.magru.wmnet,service=nginx
  • 18:29 brett@cumin2002: conftool action : set/pooled=yes; selector: name=ncredir7001.magru.wmnet,service=nginx
  • 18:28 brett@cumin2002: conftool action : set/weight=1; selector: name=ncredir7001.magru.wmnet,service=nginx
  • 17:45 dcausse: repooling wdqs1012
  • 17:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 17:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 17:14 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir7002.magru.wmnet
  • 17:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir7002.magru.wmnet with OS bookworm
  • 17:13 denisse: Run `sudo mdadm --add /dev/md1 /dev/sdg` on `centrallog1002` - T363660
  • 17:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 17:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 17:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T361627)', diff saved to https://phabricator.wikimedia.org/P61862 and previous config saved to /var/cache/conftool/dbconfig/20240503-170054-marostegui.json
  • 16:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir7002.magru.wmnet with reason: host reimage
  • 16:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P61860 and previous config saved to /var/cache/conftool/dbconfig/20240503-164546-marostegui.json
  • 16:44 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir7002.magru.wmnet with reason: host reimage
  • 16:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P61859 and previous config saved to /var/cache/conftool/dbconfig/20240503-163039-marostegui.json
  • 16:18 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir7002.magru.wmnet with OS bookworm
  • 16:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T361627)', diff saved to https://phabricator.wikimedia.org/P61858 and previous config saved to /var/cache/conftool/dbconfig/20240503-161531-marostegui.json
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T361627)', diff saved to https://phabricator.wikimedia.org/P61857 and previous config saved to /var/cache/conftool/dbconfig/20240503-155432-marostegui.json
  • 15:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 15:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T361627)', diff saved to https://phabricator.wikimedia.org/P61856 and previous config saved to /var/cache/conftool/dbconfig/20240503-155409-marostegui.json
  • 15:42 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir7002.magru.wmnet - brett@cumin2002"
  • 15:41 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir7002.magru.wmnet - brett@cumin2002"
  • 15:40 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir7002.magru.wmnet on all recursors
  • 15:40 brett@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir7002.magru.wmnet on all recursors
  • 15:40 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:40 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir7002.magru.wmnet - brett@cumin2002"
  • 15:39 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir7002.magru.wmnet - brett@cumin2002"
  • 15:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P61855 and previous config saved to /var/cache/conftool/dbconfig/20240503-153901-marostegui.json
  • 15:34 brett@cumin2002: START - Cookbook sre.dns.netbox
  • 15:34 brett@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir7002.magru.wmnet
  • 15:26 dcausse: depooled wdqs1012 (lagged)
  • 15:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P61854 and previous config saved to /var/cache/conftool/dbconfig/20240503-152354-marostegui.json
  • 15:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T361627)', diff saved to https://phabricator.wikimedia.org/P61853 and previous config saved to /var/cache/conftool/dbconfig/20240503-150846-marostegui.json
  • 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add install7001 - jmm@cumin2002"
  • 14:44 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@5d3a06d] (releasing): update plugins to address vulnerabilities (duration: 00m 39s)
  • 14:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T361627)', diff saved to https://phabricator.wikimedia.org/P61852 and previous config saved to /var/cache/conftool/dbconfig/20240503-144419-marostegui.json
  • 14:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 14:44 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@5d3a06d] (releasing): update plugins to address vulnerabilities
  • 14:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61851 and previous config saved to /var/cache/conftool/dbconfig/20240503-144356-marostegui.json
  • 14:39 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@5d3a06d] (releasing): test plugin update in secondary host (duration: 00m 22s)
  • 14:39 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@5d3a06d] (releasing): test plugin update in secondary host
  • 14:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P61850 and previous config saved to /var/cache/conftool/dbconfig/20240503-142848-marostegui.json
  • 14:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add install7001 - jmm@cumin2002"
  • 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install7001.wikimedia.org
  • 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install7001.wikimedia.org with OS bookworm
  • 14:16 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:15 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 14:14 sukhe: sudo homer asw*magru* commit "add durum and doh hosts in magru"
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P61849 and previous config saved to /var/cache/conftool/dbconfig/20240503-141341-marostegui.json
  • 14:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install7001.wikimedia.org with reason: host reimage
  • 14:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install7001.wikimedia.org with reason: host reimage
  • 14:07 herron: alert1001:~# systemctl restart prometheus-alertmanager.service
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61848 and previous config saved to /var/cache/conftool/dbconfig/20240503-135834-marostegui.json
  • 13:43 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install7001.wikimedia.org with OS bookworm
  • 13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167 (T361627)', diff saved to https://phabricator.wikimedia.org/P61847 and previous config saved to /var/cache/conftool/dbconfig/20240503-133601-marostegui.json
  • 13:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T361627)', diff saved to https://phabricator.wikimedia.org/P61846 and previous config saved to /var/cache/conftool/dbconfig/20240503-133538-marostegui.json
  • 13:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install7001.wikimedia.org - jmm@cumin2002"
  • 13:29 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install7001.wikimedia.org - jmm@cumin2002"
  • 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install7001.wikimedia.org on all recursors
  • 13:28 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install7001.wikimedia.org on all recursors
  • 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install7001.wikimedia.org - jmm@cumin2002"
  • 13:26 elukey: restart karma on alert1001 to verify if probe down alerts shown are stale
  • 13:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install7001.wikimedia.org - jmm@cumin2002"
  • 13:23 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:22 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 13:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P61845 and previous config saved to /var/cache/conftool/dbconfig/20240503-132030-marostegui.json
  • 13:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P61844 and previous config saved to /var/cache/conftool/dbconfig/20240503-130523-marostegui.json
  • 13:04 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:03 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 12:51 cmooney@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 12:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T361627)', diff saved to https://phabricator.wikimedia.org/P61843 and previous config saved to /var/cache/conftool/dbconfig/20240503-125015-marostegui.json
  • 12:47 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 12:26 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61841 and previous config saved to /var/cache/conftool/dbconfig/20240503-122659-root.json
  • 12:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T361627)', diff saved to https://phabricator.wikimedia.org/P61840 and previous config saved to /var/cache/conftool/dbconfig/20240503-122510-marostegui.json
  • 12:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 12:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T361627)', diff saved to https://phabricator.wikimedia.org/P61839 and previous config saved to /var/cache/conftool/dbconfig/20240503-122446-marostegui.json
  • 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61838 and previous config saved to /var/cache/conftool/dbconfig/20240503-121153-root.json
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P61837 and previous config saved to /var/cache/conftool/dbconfig/20240503-120938-marostegui.json
  • 12:06 topranks: removing entries for lsw1-a1-codfw switch and private1-a1-codfw vlan from puppet T364097
  • 12:02 sukhe@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh7002.wikimedia.org
  • 12:02 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh7002.wikimedia.org with OS bookworm
  • 12:01 moritzm: uploaded wmf-sre-laptop 0.5.10 to apt.wikimedia.org
  • 11:57 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:57 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove lsw1-a1-codfw phyiscal link dns - cmooney@cumin1002"
  • 11:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61835 and previous config saved to /var/cache/conftool/dbconfig/20240503-115647-root.json
  • 11:55 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove lsw1-a1-codfw phyiscal link dns - cmooney@cumin1002"
  • 11:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P61834 and previous config saved to /var/cache/conftool/dbconfig/20240503-115431-marostegui.json
  • 11:53 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 11:45 sukhe@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum7002.magru.wmnet
  • 11:45 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum7002.magru.wmnet with OS bookworm
  • 11:44 topranks: Removing connections from ssw1-a1-codfw and ssw1-a8-codfw to lsw1-a1-codfw T364097
  • 11:41 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh7002.wikimedia.org with reason: host reimage
  • 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61833 and previous config saved to /var/cache/conftool/dbconfig/20240503-114141-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T361627)', diff saved to https://phabricator.wikimedia.org/P61832 and previous config saved to /var/cache/conftool/dbconfig/20240503-113924-marostegui.json
  • 11:38 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh7002.wikimedia.org with reason: host reimage
  • 11:27 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum7002.magru.wmnet with reason: host reimage
  • 11:26 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61831 and previous config saved to /var/cache/conftool/dbconfig/20240503-112635-root.json
  • 11:23 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum7002.magru.wmnet with reason: host reimage
  • 11:19 taavi@cumin1002: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 11:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM durum7001.magru.wmnet
  • 11:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS bookworm
  • 11:16 taavi@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 11:16 taavi@cumin1002: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=93)
  • 11:15 taavi@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T361627)', diff saved to https://phabricator.wikimedia.org/P61830 and previous config saved to /var/cache/conftool/dbconfig/20240503-111415-marostegui.json
  • 11:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T361627)', diff saved to https://phabricator.wikimedia.org/P61829 and previous config saved to /var/cache/conftool/dbconfig/20240503-111337-marostegui.json
  • 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM durum7001.magru.wmnet
  • 11:11 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host doh7002.wikimedia.org with OS bookworm
  • 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1203 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61828 and previous config saved to /var/cache/conftool/dbconfig/20240503-111129-root.json
  • 11:11 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh7002.wikimedia.org - sukhe@cumin1002"
  • 11:10 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh7002.wikimedia.org - sukhe@cumin1002"
  • 11:09 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh7002.wikimedia.org on all recursors
  • 11:09 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache doh7002.wikimedia.org on all recursors
  • 11:09 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:09 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh7002.wikimedia.org - sukhe@cumin1002"
  • 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM doh7001.wikimedia.org
  • 11:08 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh7002.wikimedia.org - sukhe@cumin1002"
  • 11:06 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 11:06 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host doh7002.wikimedia.org
  • 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM doh7001.wikimedia.org
  • 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ncredir7001.magru.wmnet
  • 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ncredir7001.magru.wmnet
  • 10:58 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host durum7002.magru.wmnet with OS bookworm
  • 10:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240503-105824-marostegui.json
  • 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM netflow7001.magru.wmnet
  • 10:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
  • 10:54 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum7002.magru.wmnet - sukhe@cumin1002"
  • 10:53 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum7002.magru.wmnet - sukhe@cumin1002"
  • 10:53 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum7002.magru.wmnet on all recursors
  • 10:53 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache durum7002.magru.wmnet on all recursors
  • 10:53 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:53 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum7002.magru.wmnet - sukhe@cumin1002"
  • 10:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
  • 10:52 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum7002.magru.wmnet - sukhe@cumin1002"
  • 10:51 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM netflow7001.magru.wmnet
  • 10:50 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 10:50 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host durum7002.magru.wmnet
  • 10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P61827 and previous config saved to /var/cache/conftool/dbconfig/20240503-104317-marostegui.json
  • 10:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS bookworm
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1203', diff saved to https://phabricator.wikimedia.org/P61826 and previous config saved to /var/cache/conftool/dbconfig/20240503-103814-root.json
  • 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add bast7001 - jmm@cumin2002 - T364016"
  • 10:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add bast7001 - jmm@cumin2002 - T364016"
  • 10:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T361627)', diff saved to https://phabricator.wikimedia.org/P61825 and previous config saved to /var/cache/conftool/dbconfig/20240503-102809-marostegui.json
  • 10:27 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lsw1-a1-codfw,lsw1-a1-codfw IPv6,lsw1-a1-codfw.mgmt with reason: device being decommed and renamed, downtiming as a precaution first
  • 10:27 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on lsw1-a1-codfw,lsw1-a1-codfw IPv6,lsw1-a1-codfw.mgmt with reason: device being decommed and renamed, downtiming as a precaution first
  • 10:15 moritzm: installing Java 17 security updates on idp-test
  • 10:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T361627)', diff saved to https://phabricator.wikimedia.org/P61823 and previous config saved to /var/cache/conftool/dbconfig/20240503-100335-marostegui.json
  • 10:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 10:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 10:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T361627)', diff saved to https://phabricator.wikimedia.org/P61822 and previous config saved to /var/cache/conftool/dbconfig/20240503-100313-marostegui.json
  • 09:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P61821 and previous config saved to /var/cache/conftool/dbconfig/20240503-094805-marostegui.json
  • 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P61820 and previous config saved to /var/cache/conftool/dbconfig/20240503-093257-marostegui.json
  • 09:26 pfischer@deploy1002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T361627)', diff saved to https://phabricator.wikimedia.org/P61818 and previous config saved to /var/cache/conftool/dbconfig/20240503-091750-marostegui.json
  • 09:11 pfischer@deploy1002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T361627)', diff saved to https://phabricator.wikimedia.org/P61817 and previous config saved to /var/cache/conftool/dbconfig/20240503-085234-marostegui.json
  • 08:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host bast7001.wikimedia.org
  • 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast7001.wikimedia.org with OS bookworm
  • 08:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T361627)', diff saved to https://phabricator.wikimedia.org/P61816 and previous config saved to /var/cache/conftool/dbconfig/20240503-085211-marostegui.json
  • 08:48 XioNoX: restart turnilo
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P61815 and previous config saved to /var/cache/conftool/dbconfig/20240503-083703-marostegui.json
  • 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast7001.wikimedia.org with reason: host reimage
  • 08:33 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast7001.wikimedia.org with reason: host reimage
  • 08:30 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P61814 and previous config saved to /var/cache/conftool/dbconfig/20240503-082156-marostegui.json
  • 08:20 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 08:11 moritzm: installing emacs security updates
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T361627)', diff saved to https://phabricator.wikimedia.org/P61813 and previous config saved to /var/cache/conftool/dbconfig/20240503-080649-marostegui.json
  • 08:05 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host bast7001.wikimedia.org with OS bookworm
  • 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast7001.wikimedia.org - jmm@cumin2002"
  • 08:00 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast7001.wikimedia.org - jmm@cumin2002"
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) bast7001.wikimedia.org on all recursors
  • 07:59 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache bast7001.wikimedia.org on all recursors
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast7001.wikimedia.org - jmm@cumin2002"
  • 07:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast7001.wikimedia.org - jmm@cumin2002"
  • 07:53 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 07:53 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host bast7001.wikimedia.org
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T361627)', diff saved to https://phabricator.wikimedia.org/P61812 and previous config saved to /var/cache/conftool/dbconfig/20240503-074135-marostegui.json
  • 07:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 07:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T361627)', diff saved to https://phabricator.wikimedia.org/P61811 and previous config saved to /var/cache/conftool/dbconfig/20240503-074112-marostegui.json
  • 07:33 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7004.magru.wmnet to cluster magru02 and group B4
  • 07:32 zabe: zabe@mwmaint1002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=metawiki --logwiki=metawiki 'Arnadh2011' 'User435211' # T363654
  • 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti7004.magru.wmnet to cluster magru02 and group B4
  • 07:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P61810 and previous config saved to /var/cache/conftool/dbconfig/20240503-072604-marostegui.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61809 and previous config saved to /var/cache/conftool/dbconfig/20240503-071853-root.json
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P61808 and previous config saved to /var/cache/conftool/dbconfig/20240503-071057-marostegui.json
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61807 and previous config saved to /var/cache/conftool/dbconfig/20240503-070347-root.json
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T361627)', diff saved to https://phabricator.wikimedia.org/P61806 and previous config saved to /var/cache/conftool/dbconfig/20240503-065547-marostegui.json
  • 06:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61805 and previous config saved to /var/cache/conftool/dbconfig/20240503-064842-root.json
  • 06:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61804 and previous config saved to /var/cache/conftool/dbconfig/20240503-063336-root.json
  • 06:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T361627)', diff saved to https://phabricator.wikimedia.org/P61803 and previous config saved to /var/cache/conftool/dbconfig/20240503-063048-marostegui.json
  • 06:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T361627)', diff saved to https://phabricator.wikimedia.org/P61802 and previous config saved to /var/cache/conftool/dbconfig/20240503-063025-marostegui.json
  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61801 and previous config saved to /var/cache/conftool/dbconfig/20240503-061830-root.json
  • 06:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P61800 and previous config saved to /var/cache/conftool/dbconfig/20240503-061517-marostegui.json
  • 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61799 and previous config saved to /var/cache/conftool/dbconfig/20240503-060324-root.json
  • 06:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P61798 and previous config saved to /var/cache/conftool/dbconfig/20240503-060010-marostegui.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1214 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61797 and previous config saved to /var/cache/conftool/dbconfig/20240503-054818-root.json
  • 05:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS bookworm
  • 05:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T361627)', diff saved to https://phabricator.wikimedia.org/P61796 and previous config saved to /var/cache/conftool/dbconfig/20240503-054502-marostegui.json
  • 05:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
  • 05:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T361627)', diff saved to https://phabricator.wikimedia.org/P61795 and previous config saved to /var/cache/conftool/dbconfig/20240503-052430-marostegui.json
  • 05:24 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 05:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
  • 05:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 05:11 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS bookworm
  • 05:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1214', diff saved to https://phabricator.wikimedia.org/P61794 and previous config saved to /var/cache/conftool/dbconfig/20240503-050947-root.json
  • 04:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 04:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 01:04 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-eqiad: Apply updated JDK 8 - eevans@cumin1002
  • 01:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 01:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 01:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T361627)', diff saved to https://phabricator.wikimedia.org/P61793 and previous config saved to /var/cache/conftool/dbconfig/20240503-010330-marostegui.json
  • 00:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P61792 and previous config saved to /var/cache/conftool/dbconfig/20240503-004821-marostegui.json
  • 00:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P61791 and previous config saved to /var/cache/conftool/dbconfig/20240503-003313-marostegui.json
  • 00:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T361627)', diff saved to https://phabricator.wikimedia.org/P61790 and previous config saved to /var/cache/conftool/dbconfig/20240503-001805-marostegui.json
  • 00:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1249 (T361627)', diff saved to https://phabricator.wikimedia.org/P61789 and previous config saved to /var/cache/conftool/dbconfig/20240503-000614-marostegui.json
  • 00:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 00:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 00:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T361627)', diff saved to https://phabricator.wikimedia.org/P61788 and previous config saved to /var/cache/conftool/dbconfig/20240503-000602-marostegui.json

2024-05-02

  • 23:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P61787 and previous config saved to /var/cache/conftool/dbconfig/20240502-235053-marostegui.json
  • 23:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P61786 and previous config saved to /var/cache/conftool/dbconfig/20240502-233545-marostegui.json
  • 23:33 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-eqiad: Apply updated JDK 8 - eevans@cumin1002
  • 23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T361627)', diff saved to https://phabricator.wikimedia.org/P61785 and previous config saved to /var/cache/conftool/dbconfig/20240502-232037-marostegui.json
  • 22:44 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-codfw: Apply updated JDK 8 - eevans@cumin1002
  • 22:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1248 (T361627)', diff saved to https://phabricator.wikimedia.org/P61784 and previous config saved to /var/cache/conftool/dbconfig/20240502-224227-marostegui.json
  • 22:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 22:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 22:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T361627)', diff saved to https://phabricator.wikimedia.org/P61783 and previous config saved to /var/cache/conftool/dbconfig/20240502-224204-marostegui.json
  • 22:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P61782 and previous config saved to /var/cache/conftool/dbconfig/20240502-222656-marostegui.json
  • 22:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P61781 and previous config saved to /var/cache/conftool/dbconfig/20240502-221149-marostegui.json
  • 21:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T361627)', diff saved to https://phabricator.wikimedia.org/P61780 and previous config saved to /var/cache/conftool/dbconfig/20240502-215641-marostegui.json
  • 21:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir7001.magru.wmnet with OS bookworm
  • 21:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1247 (T361627)', diff saved to https://phabricator.wikimedia.org/P61779 and previous config saved to /var/cache/conftool/dbconfig/20240502-214435-marostegui.json
  • 21:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 21:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 21:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 21:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 21:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T361627)', diff saved to https://phabricator.wikimedia.org/P61778 and previous config saved to /var/cache/conftool/dbconfig/20240502-213631-marostegui.json
  • 21:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir7001.magru.wmnet with reason: host reimage
  • 21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P61777 and previous config saved to /var/cache/conftool/dbconfig/20240502-212123-marostegui.json
  • 21:19 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir7001.magru.wmnet with reason: host reimage
  • 21:12 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Apply updated JDK 8 - eevans@cumin1002
  • 21:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P61776 and previous config saved to /var/cache/conftool/dbconfig/20240502-210613-marostegui.json
  • 20:53 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Apply updated JDK 8 - eevans@cumin1002
  • 20:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T361627)', diff saved to https://phabricator.wikimedia.org/P61775 and previous config saved to /var/cache/conftool/dbconfig/20240502-205105-marostegui.json
  • 20:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1244 (T361627)', diff saved to https://phabricator.wikimedia.org/P61774 and previous config saved to /var/cache/conftool/dbconfig/20240502-204208-marostegui.json
  • 20:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 20:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 20:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T361627)', diff saved to https://phabricator.wikimedia.org/P61773 and previous config saved to /var/cache/conftool/dbconfig/20240502-204146-marostegui.json
  • 20:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir7001.magru.wmnet with OS bookworm
  • 20:32 jdrewniak@deploy1002: Sync cancelled.
  • 20:30 jdrewniak@deploy1002: jdrewniak: Backport for Revert "Deploy Vector appearance menu and increased font-size to plwiki" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P61772 and previous config saved to /var/cache/conftool/dbconfig/20240502-202638-marostegui.json
  • 20:25 jdrewniak@deploy1002: Started scap: Backport for Revert "Deploy Vector appearance menu and increased font-size to plwiki"
  • 20:21 jdrewniak@deploy1002: Sync cancelled.
  • 20:14 jdrewniak@deploy1002: bwang and jdrewniak: Backport for Update wgVectorClientPrefs to wgVectorAppearance (T362808), Deploy Vector appearance menu and increased font-size to plwiki (T362147) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P61771 and previous config saved to /var/cache/conftool/dbconfig/20240502-201131-marostegui.json
  • 20:09 jdrewniak@deploy1002: Started scap: Backport for Update wgVectorClientPrefs to wgVectorAppearance (T362808), Deploy Vector appearance menu and increased font-size to plwiki (T362147)
  • 20:04 cdanis@deploy1002: Finished scap: Backport for probenet: add magru measurement endpoint (T362902) (duration: 18m 19s)
  • 19:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T361627)', diff saved to https://phabricator.wikimedia.org/P61770 and previous config saved to /var/cache/conftool/dbconfig/20240502-195623-marostegui.json
  • 19:50 cdanis@deploy1002: cdanis: Continuing with sync
  • 19:50 cdanis@deploy1002: cdanis: Backport for probenet: add magru measurement endpoint (T362902) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:49 brett@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host ncredir7001.magru.wmnet
  • 19:49 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ncredir7001.magru.wmnet with OS bookworm
  • 19:45 cdanis@deploy1002: Started scap: Backport for probenet: add magru measurement endpoint (T362902)
  • 19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1243 (T361627)', diff saved to https://phabricator.wikimedia.org/P61769 and previous config saved to /var/cache/conftool/dbconfig/20240502-194513-marostegui.json
  • 19:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 19:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 19:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T361627)', diff saved to https://phabricator.wikimedia.org/P61768 and previous config saved to /var/cache/conftool/dbconfig/20240502-194450-marostegui.json
  • 19:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1181 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P61767 and previous config saved to /var/cache/conftool/dbconfig/20240502-194127-ladsgroup.json
  • 19:36 sukhe@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh7001.wikimedia.org
  • 19:36 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh7001.wikimedia.org with OS bookworm
  • 19:33 amastilovic@deploy1002: Finished deploy [airflow-dags/analytics@4edc35c]: (no justification provided) (duration: 00m 38s)
  • 19:32 amastilovic@deploy1002: Started deploy [airflow-dags/analytics@4edc35c]: (no justification provided)
  • 19:31 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 19:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P61766 and previous config saved to /var/cache/conftool/dbconfig/20240502-192942-marostegui.json
  • 19:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1181 (re)pooling @ 75%: Maint over', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20240502-192621-ladsgroup.json
  • 19:21 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P61765 and previous config saved to /var/cache/conftool/dbconfig/20240502-191434-marostegui.json
  • 19:11 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh7001.wikimedia.org with reason: host reimage
  • 19:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1181 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P61764 and previous config saved to /var/cache/conftool/dbconfig/20240502-191115-ladsgroup.json
  • 19:08 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh7001.wikimedia.org with reason: host reimage
  • 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T361627)', diff saved to https://phabricator.wikimedia.org/P61763 and previous config saved to /var/cache/conftool/dbconfig/20240502-185926-marostegui.json
  • 18:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1181 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P61762 and previous config saved to /var/cache/conftool/dbconfig/20240502-185609-ladsgroup.json
  • 18:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1242 (T361627)', diff saved to https://phabricator.wikimedia.org/P61761 and previous config saved to /var/cache/conftool/dbconfig/20240502-184710-marostegui.json
  • 18:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 18:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 18:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T361627)', diff saved to https://phabricator.wikimedia.org/P61760 and previous config saved to /var/cache/conftool/dbconfig/20240502-184658-marostegui.json
  • 18:41 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host doh7001.wikimedia.org with OS bookworm
  • 18:40 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:35 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Apply updated JDK 8 - eevans@cumin1002
  • 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P61759 and previous config saved to /var/cache/conftool/dbconfig/20240502-183151-marostegui.json
  • 18:24 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:23 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh7001.wikimedia.org on all recursors
  • 18:23 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache doh7001.wikimedia.org on all recursors
  • 18:23 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:23 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:22 sukhe: sudo cumin -b1 -s900 "A:dnsbox" "systemctl restart ntp.service"
  • 18:22 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:20 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir7001.magru.wmnet with OS bookworm
  • 18:19 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir7001.magru.wmnet - brett@cumin2002"
  • 18:18 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir7001.magru.wmnet - brett@cumin2002"
  • 18:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir7001.magru.wmnet on all recursors
  • 18:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir7001.magru.wmnet on all recursors
  • 18:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:18 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir7001.magru.wmnet - brett@cumin2002"
  • 18:17 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir7001.magru.wmnet - brett@cumin2002"
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P61758 and previous config saved to /var/cache/conftool/dbconfig/20240502-181643-marostegui.json
  • 18:11 sukhe: magru: setting weights on cp servers and pooling
  • 18:10 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 18:10 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host doh7001.wikimedia.org
  • 18:09 sukhe@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh7001.wikimedia.org
  • 18:09 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh7001.wikimedia.org on all recursors
  • 18:09 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache doh7001.wikimedia.org on all recursors
  • 18:09 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:09 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:08 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh7001.wikimedia.org - sukhe@cumin1002"
  • 18:05 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Apply updated JDK 8 - eevans@cumin1002
  • 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T361627)', diff saved to https://phabricator.wikimedia.org/P61756 and previous config saved to /var/cache/conftool/dbconfig/20240502-180136-marostegui.json
  • 17:58 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 17:55 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 17:55 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh7001.wikimedia.org on all recursors
  • 17:55 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache doh7001.wikimedia.org on all recursors
  • 17:55 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:53 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 17:53 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 17:52 sukhe@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 17:50 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 17:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1241 (T361627)', diff saved to https://phabricator.wikimedia.org/P61755 and previous config saved to /var/cache/conftool/dbconfig/20240502-174920-marostegui.json
  • 17:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 17:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T361627)', diff saved to https://phabricator.wikimedia.org/P61754 and previous config saved to /var/cache/conftool/dbconfig/20240502-174856-marostegui.json
  • 17:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P61753 and previous config saved to /var/cache/conftool/dbconfig/20240502-173349-marostegui.json
  • 17:24 brett@cumin2002: START - Cookbook sre.dns.netbox
  • 17:24 brett@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir7001.magru.wmnet
  • 17:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P61752 and previous config saved to /var/cache/conftool/dbconfig/20240502-171840-marostegui.json
  • 17:15 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 17:15 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 17:05 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 17:05 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host doh7001.wikimedia.org
  • 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T361627)', diff saved to https://phabricator.wikimedia.org/P61751 and previous config saved to /var/cache/conftool/dbconfig/20240502-170332-marostegui.json
  • 16:53 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:52 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1221 (T361627)', diff saved to https://phabricator.wikimedia.org/P61750 and previous config saved to /var/cache/conftool/dbconfig/20240502-165211-marostegui.json
  • 16:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T361627)', diff saved to https://phabricator.wikimedia.org/P61749 and previous config saved to /var/cache/conftool/dbconfig/20240502-165129-marostegui.json
  • 16:40 sukhe@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum7001.magru.wmnet
  • 16:40 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum7001.magru.wmnet with OS bookworm
  • 16:39 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:38 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P61748 and previous config saved to /var/cache/conftool/dbconfig/20240502-163622-marostegui.json
  • 16:21 amastilovic@deploy1002: Finished deploy [airflow-dags/analytics@7513bfa]: (no justification provided) (duration: 00m 44s)
  • 16:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P61747 and previous config saved to /var/cache/conftool/dbconfig/20240502-162114-marostegui.json
  • 16:20 amastilovic@deploy1002: Started deploy [airflow-dags/analytics@7513bfa]: (no justification provided)
  • 16:16 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum7001.magru.wmnet with reason: host reimage
  • 16:15 sukhe: running authdns-update once again to confirm state of dns700[12]
  • 16:14 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:14 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: force update dns7x - sukhe@cumin1002"
  • 16:13 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum7001.magru.wmnet with reason: host reimage
  • 16:12 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: force update dns7x - sukhe@cumin1002"
  • 16:12 sfaci@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:12 sfaci@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:11 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 16:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T361627)', diff saved to https://phabricator.wikimedia.org/P61746 and previous config saved to /var/cache/conftool/dbconfig/20240502-160606-marostegui.json
  • 16:05 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:03 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 15:56 sukhe: running authdns-update
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1199 (T361627)', diff saved to https://phabricator.wikimedia.org/P61744 and previous config saved to /var/cache/conftool/dbconfig/20240502-155359-marostegui.json
  • 15:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 15:53 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Apply updated JDK 8 - eevans@cumin1002
  • 15:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 15:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T361627)', diff saved to https://phabricator.wikimedia.org/P61743 and previous config saved to /var/cache/conftool/dbconfig/20240502-155336-marostegui.json
  • 15:51 moritzm: installing postgresql-15 security updates
  • 15:51 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns7002.wikimedia.org,service=(authdns-update|recdns|ntp)
  • 15:51 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns7001.wikimedia.org,service=(authdns-update|recdns|ntp)
  • 15:44 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host durum7001.magru.wmnet with OS bookworm
  • 15:43 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum7001.magru.wmnet - sukhe@cumin1002"
  • 15:43 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum7001.magru.wmnet - sukhe@cumin1002"
  • 15:42 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum7001.magru.wmnet on all recursors
  • 15:42 sukhe@cumin1002: START - Cookbook sre.dns.wipe-cache durum7001.magru.wmnet on all recursors
  • 15:42 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:42 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum7001.magru.wmnet - sukhe@cumin1002"
  • 15:41 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum7001.magru.wmnet - sukhe@cumin1002"
  • 15:39 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 15:39 sukhe@cumin1002: START - Cookbook sre.ganeti.makevm for new host durum7001.magru.wmnet
  • 15:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P61741 and previous config saved to /var/cache/conftool/dbconfig/20240502-153828-marostegui.json
  • 15:34 elukey@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore1*: Move to PKI Truststore - elukey@cumin1002
  • 15:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netflow7001.magru.wmnet
  • 15:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netflow7001.magru.wmnet with OS bookworm
  • 15:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P61740 and previous config saved to /var/cache/conftool/dbconfig/20240502-152319-marostegui.json
  • 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new VIP for ganeti/magru02 - jmm@cumin2002"
  • 15:15 elukey@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore1*: Move to PKI Truststore - elukey@cumin1002
  • 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61739 and previous config saved to /var/cache/conftool/dbconfig/20240502-151407-root.json
  • 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61738 and previous config saved to /var/cache/conftool/dbconfig/20240502-151403-root.json
  • 15:13 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 15:12 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 15:12 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 15:12 elukey@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore200[5,6]*: Move to PKI Truststore - elukey@cumin1002
  • 15:12 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 15:12 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 15:11 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:10 hnowlan: Move mw-on-k8s traffic percentage from 80% to 85%
  • 15:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T361627)', diff saved to https://phabricator.wikimedia.org/P61737 and previous config saved to /var/cache/conftool/dbconfig/20240502-150812-marostegui.json
  • 15:03 elukey@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=inference,name=codfw
  • 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
  • 15:00 elukey@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore200[5,6]*: Move to PKI Truststore - elukey@cumin1002
  • 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61736 and previous config saved to /var/cache/conftool/dbconfig/20240502-145901-root.json
  • 14:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61735 and previous config saved to /var/cache/conftool/dbconfig/20240502-145856-root.json
  • 14:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new VIP for ganeti/magru02 - jmm@cumin2002"
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1190 (T361627)', diff saved to https://phabricator.wikimedia.org/P61734 and previous config saved to /var/cache/conftool/dbconfig/20240502-145632-marostegui.json
  • 14:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 14:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T361627)', diff saved to https://phabricator.wikimedia.org/P61733 and previous config saved to /var/cache/conftool/dbconfig/20240502-145609-marostegui.json
  • 14:56 elukey@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore2004*: Move to PKI Truststore - elukey@cumin1002
  • 14:55 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 14:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
  • 14:50 elukey@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore2004*: Move to PKI Truststore - elukey@cumin1002
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61732 and previous config saved to /var/cache/conftool/dbconfig/20240502-144356-root.json
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61731 and previous config saved to /var/cache/conftool/dbconfig/20240502-144350-root.json
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61730 and previous config saved to /var/cache/conftool/dbconfig/20240502-144300-root.json
  • 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow7001.magru.wmnet with reason: host reimage
  • 14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P61729 and previous config saved to /var/cache/conftool/dbconfig/20240502-144101-marostegui.json
  • 14:38 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow7001.magru.wmnet with reason: host reimage
  • 14:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61728 and previous config saved to /var/cache/conftool/dbconfig/20240502-142850-root.json
  • 14:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61727 and previous config saved to /var/cache/conftool/dbconfig/20240502-142844-root.json
  • 14:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61726 and previous config saved to /var/cache/conftool/dbconfig/20240502-142754-root.json
  • 14:26 hnowlan@deploy1002: Finished scap: (no justification provided) (duration: 03m 16s)
  • 14:26 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:26 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:26 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:26 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P61725 and previous config saved to /var/cache/conftool/dbconfig/20240502-142554-marostegui.json
  • 14:25 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:25 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:23 hnowlan@deploy1002: Started scap: (no justification provided)
  • 14:22 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.3 refs T361397
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61724 and previous config saved to /var/cache/conftool/dbconfig/20240502-141344-root.json
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61723 and previous config saved to /var/cache/conftool/dbconfig/20240502-141339-root.json
  • 14:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61722 and previous config saved to /var/cache/conftool/dbconfig/20240502-141248-root.json
  • 14:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host netflow7001.magru.wmnet with OS bookworm
  • 14:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T361627)', diff saved to https://phabricator.wikimedia.org/P61721 and previous config saved to /var/cache/conftool/dbconfig/20240502-141046-marostegui.json
  • 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow7001.magru.wmnet - jmm@cumin2002"
  • 14:07 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow7001.magru.wmnet - jmm@cumin2002"
  • 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow7001.magru.wmnet on all recursors
  • 14:07 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow7001.magru.wmnet on all recursors
  • 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow7001.magru.wmnet - jmm@cumin2002"
  • 14:04 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1371.eqiad.wmnet|mw1399.eqiad.wmnet|mw1405.eqiad.wmnet|mw1409.eqiad.wmnet|mw1435.eqiad.wmnet),cluster=kubernetes,service=kubesvc
  • 14:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow7001.magru.wmnet - jmm@cumin2002"
  • 13:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1160 (T361627)', diff saved to https://phabricator.wikimedia.org/P61720 and previous config saved to /var/cache/conftool/dbconfig/20240502-135947-marostegui.json
  • 13:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 13:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61719 and previous config saved to /var/cache/conftool/dbconfig/20240502-135839-root.json
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61718 and previous config saved to /var/cache/conftool/dbconfig/20240502-135833-root.json
  • 13:58 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:58 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61717 and previous config saved to /var/cache/conftool/dbconfig/20240502-135743-root.json
  • 13:57 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 13:57 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 13:56 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:56 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow7001.magru.wmnet
  • 13:54 hnowlan: running homer 'cr*eqiad*' commit for new kubernetes workers
  • 13:53 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7003.magru.wmnet to cluster magru01 and group B3
  • 13:53 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:52 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 13:52 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti7003.magru.wmnet to cluster magru01 and group B3
  • 13:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 13:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 13:50 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:50 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 13:43 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7003.magru.wmnet to cluster magru01 and group B3
  • 13:43 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti7003.magru.wmnet to cluster magru01 and group B3
  • 13:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1175 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61716 and previous config saved to /var/cache/conftool/dbconfig/20240502-134333-root.json
  • 13:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1189 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61715 and previous config saved to /var/cache/conftool/dbconfig/20240502-134328-root.json
  • 13:42 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 13:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61714 and previous config saved to /var/cache/conftool/dbconfig/20240502-134237-root.json
  • 13:42 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 13:41 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 13:40 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 13:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1175 db1189', diff saved to https://phabricator.wikimedia.org/P61713 and previous config saved to /var/cache/conftool/dbconfig/20240502-134050-root.json
  • 13:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 13:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T361627)', diff saved to https://phabricator.wikimedia.org/P61712 and previous config saved to /var/cache/conftool/dbconfig/20240502-133420-marostegui.json
  • 13:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
  • 13:32 sukhe: running authdns-update to revert magru text geomap
  • 13:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61711 and previous config saved to /var/cache/conftool/dbconfig/20240502-132731-root.json
  • 13:24 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:24 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
  • 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P61710 and previous config saved to /var/cache/conftool/dbconfig/20240502-131912-marostegui.json
  • 13:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61709 and previous config saved to /var/cache/conftool/dbconfig/20240502-131225-root.json
  • 13:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS bookworm
  • 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P61708 and previous config saved to /var/cache/conftool/dbconfig/20240502-130404-marostegui.json
  • 13:02 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 12:57 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 12:49 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 12:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T361627)', diff saved to https://phabricator.wikimedia.org/P61707 and previous config saved to /var/cache/conftool/dbconfig/20240502-124857-marostegui.json
  • 12:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
  • 12:26 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm
  • 12:25 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2161.codfw.wmnet with OS bookworm
  • 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61704 and previous config saved to /var/cache/conftool/dbconfig/20240502-122409-marostegui.json
  • 12:22 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 12:20 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 12:19 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm
  • 12:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2161', diff saved to https://phabricator.wikimedia.org/P61703 and previous config saved to /var/cache/conftool/dbconfig/20240502-121759-root.json
  • 12:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1230.eqiad.wmnet
  • 12:15 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61702 and previous config saved to /var/cache/conftool/dbconfig/20240502-120901-marostegui.json
  • 12:02 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 12:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1399.eqiad.wmnet with OS bullseye
  • 11:57 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 11:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1435.eqiad.wmnet with OS bullseye
  • 11:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
  • 11:56 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1230.eqiad.wmnet
  • 11:55 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1405.eqiad.wmnet with OS bullseye
  • 11:55 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 11:54 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1213.eqiad.wmnet
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T361627)', diff saved to https://phabricator.wikimedia.org/P61701 and previous config saved to /var/cache/conftool/dbconfig/20240502-115353-marostegui.json
  • 11:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1409.eqiad.wmnet with OS bullseye
  • 11:53 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 11:51 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1371.eqiad.wmnet with OS bullseye
  • 11:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2210 (T361627)', diff saved to https://phabricator.wikimedia.org/P61700 and previous config saved to /var/cache/conftool/dbconfig/20240502-114448-marostegui.json
  • 11:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance
  • 11:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T361627)', diff saved to https://phabricator.wikimedia.org/P61699 and previous config saved to /var/cache/conftool/dbconfig/20240502-114425-marostegui.json
  • 11:43 elukey: depool LiftWing's codfw services from traffic to move all MW API calls to mw-api-int-ro
  • 11:43 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1399.eqiad.wmnet with reason: host reimage
  • 11:42 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1213.eqiad.wmnet
  • 11:42 elukey@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=inference,name=codfw
  • 11:41 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:41 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet on all recursors
  • 11:41 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet on all recursors
  • 11:40 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1435.eqiad.wmnet with reason: host reimage
  • 11:37 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1405.eqiad.wmnet with reason: host reimage
  • 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1210.eqiad.wmnet
  • 11:35 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1409.eqiad.wmnet with reason: host reimage
  • 11:35 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1405.eqiad.wmnet with reason: host reimage
  • 11:34 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1399.eqiad.wmnet with reason: host reimage
  • 11:34 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1435.eqiad.wmnet with reason: host reimage
  • 11:32 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1371.eqiad.wmnet with reason: host reimage
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1409.eqiad.wmnet with reason: host reimage
  • 11:29 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1371.eqiad.wmnet with reason: host reimage
  • 11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P61698 and previous config saved to /var/cache/conftool/dbconfig/20240502-112918-marostegui.json
  • 11:25 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1210.eqiad.wmnet
  • 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1185.eqiad.wmnet
  • 11:21 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1405.eqiad.wmnet with OS bullseye
  • 11:21 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1399.eqiad.wmnet with OS bullseye
  • 11:21 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1435.eqiad.wmnet with OS bullseye
  • 11:17 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1409.eqiad.wmnet with OS bullseye
  • 11:15 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1371.eqiad.wmnet with OS bullseye
  • 11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P61697 and previous config saved to /var/cache/conftool/dbconfig/20240502-111410-marostegui.json
  • 11:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1185.eqiad.wmnet
  • 11:08 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet on all recursors
  • 11:08 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet on all recursors
  • 11:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet. on all recursors
  • 11:07 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet. on all recursors
  • 11:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet on all recursors
  • 11:07 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet on all recursors
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:05 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1183.eqiad.wmnet
  • 10:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T361627)', diff saved to https://phabricator.wikimedia.org/P61696 and previous config saved to /var/cache/conftool/dbconfig/20240502-105903-marostegui.json
  • 10:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61695 and previous config saved to /var/cache/conftool/dbconfig/20240502-105530-root.json
  • 10:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1183.eqiad.wmnet
  • 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2206 (T361627)', diff saved to https://phabricator.wikimedia.org/P61694 and previous config saved to /var/cache/conftool/dbconfig/20240502-104658-marostegui.json
  • 10:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2206.codfw.wmnet with reason: Maintenance
  • 10:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2206.codfw.wmnet with reason: Maintenance
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61693 and previous config saved to /var/cache/conftool/dbconfig/20240502-104024-root.json
  • 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new VIP for ganeti01/magru - jmm@cumin2002"
  • 10:37 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new VIP for ganeti01/magru - jmm@cumin2002"
  • 10:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2213.codfw.wmnet
  • 10:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2199.codfw.wmnet with reason: Maintenance
  • 10:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2199.codfw.wmnet with reason: Maintenance
  • 10:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T361627)', diff saved to https://phabricator.wikimedia.org/P61692 and previous config saved to /var/cache/conftool/dbconfig/20240502-103601-marostegui.json
  • 10:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 10:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61691 and previous config saved to /var/cache/conftool/dbconfig/20240502-102518-root.json
  • 10:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2213.codfw.wmnet
  • 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P61690 and previous config saved to /var/cache/conftool/dbconfig/20240502-102053-marostegui.json
  • 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
  • 10:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2211.codfw.wmnet
  • 10:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
  • 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61689 and previous config saved to /var/cache/conftool/dbconfig/20240502-101012-root.json
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P61688 and previous config saved to /var/cache/conftool/dbconfig/20240502-100546-marostegui.json
  • 10:00 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1005.eqiad.wmnet with OS bookworm
  • 09:58 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2211.codfw.wmnet
  • 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2192.codfw.wmnet
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61687 and previous config saved to /var/cache/conftool/dbconfig/20240502-095506-root.json
  • 09:54 moritzm: installing util-linux security updates
  • 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T361627)', diff saved to https://phabricator.wikimedia.org/P61686 and previous config saved to /var/cache/conftool/dbconfig/20240502-095038-marostegui.json
  • 09:50 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2192.codfw.wmnet
  • 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2178.codfw.wmnet
  • 09:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61685 and previous config saved to /var/cache/conftool/dbconfig/20240502-094000-root.json
  • 09:38 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2382.codfw.wmnet with reason: Degraded RAID/storage controller issues
  • 09:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2179 (T361627)', diff saved to https://phabricator.wikimedia.org/P61684 and previous config saved to /var/cache/conftool/dbconfig/20240502-093827-marostegui.json
  • 09:38 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2382.codfw.wmnet with reason: Degraded RAID/storage controller issues
  • 09:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 09:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 09:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T361627)', diff saved to https://phabricator.wikimedia.org/P61683 and previous config saved to /var/cache/conftool/dbconfig/20240502-093803-marostegui.json
  • 09:35 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1005.eqiad.wmnet with reason: host reimage
  • 09:32 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1005.eqiad.wmnet with reason: host reimage
  • 09:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2152.codfw.wmnet with OS bookworm
  • 09:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2178.codfw.wmnet
  • 09:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61682 and previous config saved to /var/cache/conftool/dbconfig/20240502-092454-root.json
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P61681 and previous config saved to /var/cache/conftool/dbconfig/20240502-092256-marostegui.json
  • 09:18 hnowlan: depooling 5 appservers in advance of migrating them to k8s workers
  • 09:18 stevemunene@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
  • 09:13 stevemunene@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: sync on main
  • 09:13 stevemunene@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
  • 09:12 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1005.eqiad.wmnet with OS bookworm
  • 09:10 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cephosd1005.eqiad.wmnet with OS bookworm
  • 09:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2152.codfw.wmnet with reason: host reimage
  • 09:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2171.codfw.wmnet
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P61680 and previous config saved to /var/cache/conftool/dbconfig/20240502-090748-marostegui.json
  • 09:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2152.codfw.wmnet with reason: host reimage
  • 09:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 09:02 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 09:02 stevemunene@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: sync on main
  • 08:59 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2171.codfw.wmnet
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T361627)', diff saved to https://phabricator.wikimedia.org/P61679 and previous config saved to /var/cache/conftool/dbconfig/20240502-085241-marostegui.json
  • 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2157.codfw.wmnet
  • 08:49 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2152.codfw.wmnet with OS bookworm
  • 08:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T361627)', diff saved to https://phabricator.wikimedia.org/P61677 and previous config saved to /var/cache/conftool/dbconfig/20240502-084041-marostegui.json
  • 08:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 08:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 08:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T361627)', diff saved to https://phabricator.wikimedia.org/P61676 and previous config saved to /var/cache/conftool/dbconfig/20240502-084018-marostegui.json
  • 08:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P61675 and previous config saved to /var/cache/conftool/dbconfig/20240502-082510-marostegui.json
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P61674 and previous config saved to /var/cache/conftool/dbconfig/20240502-081002-marostegui.json
  • 08:08 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2157.codfw.wmnet
  • 08:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2123.codfw.wmnet
  • 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-public
  • 07:56 brouberol@cumin1002: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:datahubsearch
  • 07:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T361627)', diff saved to https://phabricator.wikimedia.org/P61673 and previous config saved to /var/cache/conftool/dbconfig/20240502-075455-marostegui.json
  • 07:48 brouberol@cumin1002: START - Cookbook sre.opensearch.roll-restart-reboot rolling restart_daemons on A:datahubsearch
  • 07:47 moritzm: installing Java 8 security updates
  • 07:47 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-public
  • 07:44 volans@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) netbox to netbox2002.codfw.wmnet,netbox1002.eqiad.wmnet with reason: Update Netbox dependencies for netbox - volans@cumin1002
  • 07:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T361627)', diff saved to https://phabricator.wikimedia.org/P61672 and previous config saved to /var/cache/conftool/dbconfig/20240502-074400-marostegui.json
  • 07:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T361627)', diff saved to https://phabricator.wikimedia.org/P61671 and previous config saved to /var/cache/conftool/dbconfig/20240502-074320-marostegui.json
  • 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-internal
  • 07:40 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2123.codfw.wmnet
  • 07:38 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-internal
  • 07:38 volans@cumin1002: START - Cookbook sre.deploy.python-code netbox to netbox2002.codfw.wmnet,netbox1002.eqiad.wmnet with reason: Update Netbox dependencies for netbox - volans@cumin1002
  • 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P61670 and previous config saved to /var/cache/conftool/dbconfig/20240502-072813-marostegui.json
  • 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-test
  • 07:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P61669 and previous config saved to /var/cache/conftool/dbconfig/20240502-071305-marostegui.json
  • 07:13 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-test
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T361627)', diff saved to https://phabricator.wikimedia.org/P61668 and previous config saved to /var/cache/conftool/dbconfig/20240502-065758-marostegui.json
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T361627)', diff saved to https://phabricator.wikimedia.org/P61667 and previous config saved to /var/cache/conftool/dbconfig/20240502-064533-marostegui.json
  • 06:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 06:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61666 and previous config saved to /var/cache/conftool/dbconfig/20240502-064230-root.json
  • 06:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 06:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 06:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137 (T361627)', diff saved to https://phabricator.wikimedia.org/P61665 and previous config saved to /var/cache/conftool/dbconfig/20240502-063343-marostegui.json
  • 06:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61664 and previous config saved to /var/cache/conftool/dbconfig/20240502-062725-root.json
  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137', diff saved to https://phabricator.wikimedia.org/P61663 and previous config saved to /var/cache/conftool/dbconfig/20240502-061836-marostegui.json
  • 06:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61662 and previous config saved to /var/cache/conftool/dbconfig/20240502-061218-root.json
  • 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137', diff saved to https://phabricator.wikimedia.org/P61661 and previous config saved to /var/cache/conftool/dbconfig/20240502-060328-marostegui.json
  • 05:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61660 and previous config saved to /var/cache/conftool/dbconfig/20240502-055712-root.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137 (T361627)', diff saved to https://phabricator.wikimedia.org/P61659 and previous config saved to /var/cache/conftool/dbconfig/20240502-054821-marostegui.json
  • 05:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61658 and previous config saved to /var/cache/conftool/dbconfig/20240502-054206-root.json
  • 05:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2137 (T361627)', diff saved to https://phabricator.wikimedia.org/P61657 and previous config saved to /var/cache/conftool/dbconfig/20240502-053717-marostegui.json
  • 05:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 05:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 05:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T361627)', diff saved to https://phabricator.wikimedia.org/P61656 and previous config saved to /var/cache/conftool/dbconfig/20240502-053654-marostegui.json
  • 05:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS bookworm
  • 05:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61655 and previous config saved to /var/cache/conftool/dbconfig/20240502-052700-root.json
  • 05:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P61654 and previous config saved to /var/cache/conftool/dbconfig/20240502-052146-marostegui.json
  • 05:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2162.codfw.wmnet with OS bookworm
  • 05:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61653 and previous config saved to /var/cache/conftool/dbconfig/20240502-051155-root.json
  • 05:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
  • 05:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P61652 and previous config saved to /var/cache/conftool/dbconfig/20240502-050639-marostegui.json
  • 05:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
  • 04:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
  • 04:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
  • 04:52 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS bookworm
  • 04:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T361627)', diff saved to https://phabricator.wikimedia.org/P61651 and previous config saved to /var/cache/conftool/dbconfig/20240502-045131-marostegui.json
  • 04:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1181 T363892', diff saved to https://phabricator.wikimedia.org/P61650 and previous config saved to /var/cache/conftool/dbconfig/20240502-045017-root.json
  • 04:48 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write T363892', diff saved to https://phabricator.wikimedia.org/P61649 and previous config saved to /var/cache/conftool/dbconfig/20240502-044848-marostegui.json
  • 04:48 marostegui@cumin1002: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - T363892', diff saved to https://phabricator.wikimedia.org/P61648 and previous config saved to /var/cache/conftool/dbconfig/20240502-044819-marostegui.json
  • 04:48 marostegui: Starting s7 eqiad failover from db1181 to db1236 - T363892
  • 04:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2136 (T361627)', diff saved to https://phabricator.wikimedia.org/P61647 and previous config saved to /var/cache/conftool/dbconfig/20240502-044020-marostegui.json
  • 04:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 04:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 04:35 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2162.codfw.wmnet with OS bookworm
  • 04:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: Reimage
  • 04:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: Reimage
  • 04:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2162', diff saved to https://phabricator.wikimedia.org/P61646 and previous config saved to /var/cache/conftool/dbconfig/20240502-043403-root.json
  • 04:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 T363892
  • 04:30 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1236 with weight 0 T363892', diff saved to https://phabricator.wikimedia.org/P61645 and previous config saved to /var/cache/conftool/dbconfig/20240502-043019-marostegui.json
  • 04:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s7 T363892
  • 04:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 04:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 04:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 04:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1181.eqiad.wmnet with reason: Maintenance

2024-05-01

  • 23:57 eileen: civicrm upgraded from 3ac4043c to 80ae4543
  • 21:37 eileen: config revision changed from 36b287b6 to b772c8bc
  • 20:22 jdrewniak@deploy1002: Finished scap: Backport for [Vector] Enable appearance menu and increased font-size on testwiki (T362147) (duration: 19m 29s)
  • 20:10 jdrewniak@deploy1002: jdlrobson and jdrewniak: Continuing with sync
  • 20:08 jdrewniak@deploy1002: jdlrobson and jdrewniak: Backport for [Vector] Enable appearance menu and increased font-size on testwiki (T362147) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:03 jdrewniak@deploy1002: Started scap: Backport for [Vector] Enable appearance menu and increased font-size on testwiki (T362147)
  • 19:40 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns7002.wikimedia.org with OS bookworm
  • 19:40 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 19:39 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 19:12 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
  • 19:09 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
  • 18:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 18:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 18:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T361627)', diff saved to https://phabricator.wikimedia.org/P61644 and previous config saved to /var/cache/conftool/dbconfig/20240501-185521-marostegui.json
  • 18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P61643 and previous config saved to /var/cache/conftool/dbconfig/20240501-184013-marostegui.json
  • 18:36 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS bookworm
  • 18:36 sukhe@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7002.wikimedia.org with OS bookworm
  • 18:36 sukhe@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dns7002.magru.wmnet']
  • 18:35 sukhe@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns7002.magru.wmnet']
  • 18:35 sukhe@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dns7002.magru.wmnet']
  • 18:35 sukhe@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns7002.magru.wmnet']
  • 18:28 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1005.eqiad.wmnet with OS bookworm
  • 18:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P61642 and previous config saved to /var/cache/conftool/dbconfig/20240501-182505-marostegui.json
  • 18:16 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS bookworm
  • 18:15 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns7001.wikimedia.org with OS bookworm
  • 18:15 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 18:14 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 18:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T361627)', diff saved to https://phabricator.wikimedia.org/P61641 and previous config saved to /var/cache/conftool/dbconfig/20240501-180958-marostegui.json
  • 18:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T361627)', diff saved to https://phabricator.wikimedia.org/P61640 and previous config saved to /var/cache/conftool/dbconfig/20240501-180645-marostegui.json
  • 18:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 18:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 18:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T361627)', diff saved to https://phabricator.wikimedia.org/P61639 and previous config saved to /var/cache/conftool/dbconfig/20240501-180622-marostegui.json
  • 18:03 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1004.eqiad.wmnet with OS bookworm
  • 17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P61638 and previous config saved to /var/cache/conftool/dbconfig/20240501-175114-marostegui.json
  • 17:49 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7001.wikimedia.org with reason: host reimage
  • 17:46 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7001.wikimedia.org with reason: host reimage
  • 17:38 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1004.eqiad.wmnet with reason: host reimage
  • 17:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P61637 and previous config saved to /var/cache/conftool/dbconfig/20240501-173607-marostegui.json
  • 17:35 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1004.eqiad.wmnet with reason: host reimage
  • 17:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T361627)', diff saved to https://phabricator.wikimedia.org/P61636 and previous config saved to /var/cache/conftool/dbconfig/20240501-172059-marostegui.json
  • 17:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T361627)', diff saved to https://phabricator.wikimedia.org/P61635 and previous config saved to /var/cache/conftool/dbconfig/20240501-171527-marostegui.json
  • 17:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 17:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 17:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T361627)', diff saved to https://phabricator.wikimedia.org/P61634 and previous config saved to /var/cache/conftool/dbconfig/20240501-171504-marostegui.json
  • 17:14 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1004.eqiad.wmnet with OS bookworm
  • 17:12 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host dns7001.wikimedia.org with OS bookworm
  • 17:02 sukhe: sudo cumin -b1 -s10 "A:dnsbox" "run-puppet-agent --enable 'merging CR 1026166'"
  • 16:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P61633 and previous config saved to /var/cache/conftool/dbconfig/20240501-165957-marostegui.json
  • 16:59 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1003.eqiad.wmnet with OS bookworm
  • 16:59 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org
  • 16:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P61632 and previous config saved to /var/cache/conftool/dbconfig/20240501-164450-marostegui.json
  • 16:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns6001.wikimedia.org
  • 16:43 sukhe: sudo cumin "A:dnsbox" "disable-puppet 'merging CR 1026166'"
  • 16:34 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1003.eqiad.wmnet with reason: host reimage
  • 16:31 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1003.eqiad.wmnet with reason: host reimage
  • 16:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T361627)', diff saved to https://phabricator.wikimedia.org/P61630 and previous config saved to /var/cache/conftool/dbconfig/20240501-162942-marostegui.json
  • 16:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T361627)', diff saved to https://phabricator.wikimedia.org/P61629 and previous config saved to /var/cache/conftool/dbconfig/20240501-162629-marostegui.json
  • 16:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 16:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 16:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T361627)', diff saved to https://phabricator.wikimedia.org/P61628 and previous config saved to /var/cache/conftool/dbconfig/20240501-162607-marostegui.json
  • 16:11 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1003.eqiad.wmnet with OS bookworm
  • 16:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P61627 and previous config saved to /var/cache/conftool/dbconfig/20240501-161059-marostegui.json
  • 16:10 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cephosd1003.eqiad.wmnet with OS bookworm
  • 16:01 milimetric@deploy1002: Finished deploy [airflow-dags/analytics@09b4f5f]: Testing different settings for mediawiki_history_shapshot_config (duration: 00m 28s)
  • 16:00 milimetric@deploy1002: Started deploy [airflow-dags/analytics@09b4f5f]: Testing different settings for mediawiki_history_shapshot_config
  • 15:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P61626 and previous config saved to /var/cache/conftool/dbconfig/20240501-155552-marostegui.json
  • 15:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T361627)', diff saved to https://phabricator.wikimedia.org/P61625 and previous config saved to /var/cache/conftool/dbconfig/20240501-154042-marostegui.json
  • 15:39 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1003.eqiad.wmnet with OS bookworm
  • 15:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T361627)', diff saved to https://phabricator.wikimedia.org/P61624 and previous config saved to /var/cache/conftool/dbconfig/20240501-153829-marostegui.json
  • 15:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 15:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 15:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T361627)', diff saved to https://phabricator.wikimedia.org/P61623 and previous config saved to /var/cache/conftool/dbconfig/20240501-153806-marostegui.json
  • 15:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P61622 and previous config saved to /var/cache/conftool/dbconfig/20240501-152259-marostegui.json
  • 15:22 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.43.0-wmf.3 refs T361397
  • 15:15 sukhe@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7001.wikimedia.org with OS bookworm
  • 15:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P61621 and previous config saved to /var/cache/conftool/dbconfig/20240501-150751-marostegui.json
  • 14:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T361627)', diff saved to https://phabricator.wikimedia.org/P61620 and previous config saved to /var/cache/conftool/dbconfig/20240501-145243-marostegui.json
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T361627)', diff saved to https://phabricator.wikimedia.org/P61619 and previous config saved to /var/cache/conftool/dbconfig/20240501-145131-marostegui.json
  • 14:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 14:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61618 and previous config saved to /var/cache/conftool/dbconfig/20240501-145108-marostegui.json
  • 14:43 dancy@deploy1002: Installation of scap version "4.81.0" completed for 325 hosts
  • 14:42 dancy@deploy1002: Installing scap version "4.81.0" for 325 hosts
  • 14:36 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1002.eqiad.wmnet with OS bookworm
  • 14:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P61617 and previous config saved to /var/cache/conftool/dbconfig/20240501-143601-marostegui.json
  • 14:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61616 and previous config saved to /var/cache/conftool/dbconfig/20240501-142233-root.json
  • 14:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P61615 and previous config saved to /var/cache/conftool/dbconfig/20240501-142053-marostegui.json
  • 14:12 bking@deploy1002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:11 bking@deploy1002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1002.eqiad.wmnet with reason: host reimage
  • 14:08 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1002.eqiad.wmnet with reason: host reimage
  • 14:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61614 and previous config saved to /var/cache/conftool/dbconfig/20240501-140728-root.json
  • 14:05 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7001.wikimedia.org with reason: host reimage
  • 14:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61613 and previous config saved to /var/cache/conftool/dbconfig/20240501-140545-marostegui.json
  • 14:03 bking@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T361627)', diff saved to https://phabricator.wikimedia.org/P61612 and previous config saved to /var/cache/conftool/dbconfig/20240501-140333-marostegui.json
  • 14:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 14:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 14:03 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7001.wikimedia.org with reason: host reimage
  • 14:03 bking@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 13:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 13:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 13:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61611 and previous config saved to /var/cache/conftool/dbconfig/20240501-135915-marostegui.json
  • 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61610 and previous config saved to /var/cache/conftool/dbconfig/20240501-135222-root.json
  • 13:47 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1002.eqiad.wmnet with OS bookworm
  • 13:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P61609 and previous config saved to /var/cache/conftool/dbconfig/20240501-134407-marostegui.json
  • 13:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61608 and previous config saved to /var/cache/conftool/dbconfig/20240501-133717-root.json
  • 13:33 Amir1: promoting HNowlan (WMF) to admin in testwiki
  • 13:29 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host dns7001.wikimedia.org with OS bookworm
  • 13:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P61607 and previous config saved to /var/cache/conftool/dbconfig/20240501-132900-marostegui.json
  • 13:25 sukhe: running authdns-update for CR 1026119: depool magru text*
  • 13:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61606 and previous config saved to /var/cache/conftool/dbconfig/20240501-132211-root.json
  • 13:15 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1001.eqiad.wmnet with OS bookworm
  • 13:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61605 and previous config saved to /var/cache/conftool/dbconfig/20240501-131351-marostegui.json
  • 13:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T361627)', diff saved to https://phabricator.wikimedia.org/P61604 and previous config saved to /var/cache/conftool/dbconfig/20240501-130822-marostegui.json
  • 13:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 13:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T361627)', diff saved to https://phabricator.wikimedia.org/P61603 and previous config saved to /var/cache/conftool/dbconfig/20240501-130747-marostegui.json
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61602 and previous config saved to /var/cache/conftool/dbconfig/20240501-130704-root.json
  • 12:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS bookworm
  • 12:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P61601 and previous config saved to /var/cache/conftool/dbconfig/20240501-125239-marostegui.json
  • 12:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61600 and previous config saved to /var/cache/conftool/dbconfig/20240501-125158-root.json
  • 12:48 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1001.eqiad.wmnet with reason: host reimage
  • 12:45 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1001.eqiad.wmnet with reason: host reimage
  • 12:24 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host cephosd1001.eqiad.wmnet with OS bookworm
  • 12:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T361627)', diff saved to https://phabricator.wikimedia.org/P61598 and previous config saved to /var/cache/conftool/dbconfig/20240501-122224-marostegui.json
  • 12:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T361627)', diff saved to https://phabricator.wikimedia.org/P61597 and previous config saved to /var/cache/conftool/dbconfig/20240501-122012-marostegui.json
  • 12:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 12:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 12:15 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS bookworm
  • 12:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 12:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 12:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2154', diff saved to https://phabricator.wikimedia.org/P61596 and previous config saved to /var/cache/conftool/dbconfig/20240501-121347-root.json
  • 12:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61595 and previous config saved to /var/cache/conftool/dbconfig/20240501-120833-root.json
  • 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T361627)', diff saved to https://phabricator.wikimedia.org/P61594 and previous config saved to /var/cache/conftool/dbconfig/20240501-115915-marostegui.json
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61593 and previous config saved to /var/cache/conftool/dbconfig/20240501-115327-root.json
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P61592 and previous config saved to /var/cache/conftool/dbconfig/20240501-114408-marostegui.json
  • 11:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61591 and previous config saved to /var/cache/conftool/dbconfig/20240501-113821-root.json
  • 11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P61590 and previous config saved to /var/cache/conftool/dbconfig/20240501-112900-marostegui.json
  • 11:24 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs7003.magru.wmnet with OS bullseye
  • 11:24 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 11:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61589 and previous config saved to /var/cache/conftool/dbconfig/20240501-112315-root.json
  • 11:22 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 11:17 sukhe@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs7002.magru.wmnet
  • 11:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T361627)', diff saved to https://phabricator.wikimedia.org/P61588 and previous config saved to /var/cache/conftool/dbconfig/20240501-111353-marostegui.json
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2220 (T361627)', diff saved to https://phabricator.wikimedia.org/P61587 and previous config saved to /var/cache/conftool/dbconfig/20240501-110834-marostegui.json
  • 11:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 11:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T361627)', diff saved to https://phabricator.wikimedia.org/P61586 and previous config saved to /var/cache/conftool/dbconfig/20240501-110822-marostegui.json
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61585 and previous config saved to /var/cache/conftool/dbconfig/20240501-110809-root.json
  • 11:07 sukhe@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs7001.magru.wmnet
  • 11:05 sukhe@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs7002.magru.wmnet
  • 10:58 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs7003.magru.wmnet with reason: host reimage
  • 10:55 sukhe@cumin1002: START - Cookbook sre.hosts.reboot-single for host lvs7001.magru.wmnet
  • 10:55 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs7003.magru.wmnet with reason: host reimage
  • 10:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P61584 and previous config saved to /var/cache/conftool/dbconfig/20240501-105315-marostegui.json
  • 10:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61583 and previous config saved to /var/cache/conftool/dbconfig/20240501-105304-root.json
  • 10:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS bookworm
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P61582 and previous config saved to /var/cache/conftool/dbconfig/20240501-103801-marostegui.json
  • 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2163 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61581 and previous config saved to /var/cache/conftool/dbconfig/20240501-103758-root.json
  • 10:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 100%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61580 and previous config saved to /var/cache/conftool/dbconfig/20240501-103338-arnaudb.json
  • 10:30 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host lvs7003.magru.wmnet with OS bullseye
  • 10:30 sukhe@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs7003.magru.wmnet with OS bullseye
  • 10:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Down with HW issues
  • 10:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Down with HW issues
  • 10:28 sukhe@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs7003.magru.wmnet']
  • 10:27 sukhe@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs7003.magru.wmnet']
  • 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T361627)', diff saved to https://phabricator.wikimedia.org/P61579 and previous config saved to /var/cache/conftool/dbconfig/20240501-102253-marostegui.json
  • 10:22 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host lvs7003.magru.wmnet with OS bullseye
  • 10:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
  • 10:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 75%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61578 and previous config saved to /var/cache/conftool/dbconfig/20240501-101832-arnaudb.json
  • 10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2208 (T361627)', diff saved to https://phabricator.wikimedia.org/P61577 and previous config saved to /var/cache/conftool/dbconfig/20240501-101728-marostegui.json
  • 10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 10:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 10:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61576 and previous config saved to /var/cache/conftool/dbconfig/20240501-101151-root.json
  • 10:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 10:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T361627)', diff saved to https://phabricator.wikimedia.org/P61575 and previous config saved to /var/cache/conftool/dbconfig/20240501-100650-marostegui.json
  • 10:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 50%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61574 and previous config saved to /var/cache/conftool/dbconfig/20240501-100326-arnaudb.json
  • 10:00 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS bookworm
  • 09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2163', diff saved to https://phabricator.wikimedia.org/P61573 and previous config saved to /var/cache/conftool/dbconfig/20240501-095845-root.json
  • 09:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61572 and previous config saved to /var/cache/conftool/dbconfig/20240501-095646-root.json
  • 09:52 topranks: restarting routinator service on rpki1001
  • 09:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P61571 and previous config saved to /var/cache/conftool/dbconfig/20240501-095142-marostegui.json
  • 09:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 25%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61570 and previous config saved to /var/cache/conftool/dbconfig/20240501-094821-arnaudb.json
  • 09:42 marostegui@deploy1002: Finished scap: Backport for etcd.php: Add es7 (T355285 T355424) (duration: 14m 53s)
  • 09:41 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61569 and previous config saved to /var/cache/conftool/dbconfig/20240501-094140-root.json
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P61568 and previous config saved to /var/cache/conftool/dbconfig/20240501-093635-marostegui.json
  • 09:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 15%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61567 and previous config saved to /var/cache/conftool/dbconfig/20240501-093315-arnaudb.json
  • 09:30 marostegui@deploy1002: marostegui: Continuing with sync
  • 09:30 marostegui@deploy1002: marostegui: Backport for etcd.php: Add es7 (T355285 T355424) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:27 marostegui@deploy1002: Started scap: Backport for etcd.php: Add es7 (T355285 T355424)
  • 09:26 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61566 and previous config saved to /var/cache/conftool/dbconfig/20240501-092634-root.json
  • 09:22 topranks: withdrawing public prefix announcement to AS7195 to test backup in magru (T362421)
  • 09:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T361627)', diff saved to https://phabricator.wikimedia.org/P61565 and previous config saved to /var/cache/conftool/dbconfig/20240501-092125-marostegui.json
  • 09:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 10%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61564 and previous config saved to /var/cache/conftool/dbconfig/20240501-091809-arnaudb.json
  • 09:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T361627)', diff saved to https://phabricator.wikimedia.org/P61563 and previous config saved to /var/cache/conftool/dbconfig/20240501-091513-marostegui.json
  • 09:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 09:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 09:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T361627)', diff saved to https://phabricator.wikimedia.org/P61562 and previous config saved to /var/cache/conftool/dbconfig/20240501-091451-marostegui.json
  • 09:13 marostegui@cumin1002: dbctl commit (dc=all): 'Push es7 codfw config T355424', diff saved to https://phabricator.wikimedia.org/P61561 and previous config saved to /var/cache/conftool/dbconfig/20240501-091352-marostegui.json
  • 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61560 and previous config saved to /var/cache/conftool/dbconfig/20240501-091128-root.json
  • 09:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db1157 (re)pooling @ 5%: post schema change repool', diff saved to https://phabricator.wikimedia.org/P61559 and previous config saved to /var/cache/conftool/dbconfig/20240501-090303-arnaudb.json
  • 08:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P61558 and previous config saved to /var/cache/conftool/dbconfig/20240501-085943-marostegui.json
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61557 and previous config saved to /var/cache/conftool/dbconfig/20240501-085622-root.json
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61556 and previous config saved to /var/cache/conftool/dbconfig/20240501-085223-root.json
  • 08:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P61555 and previous config saved to /var/cache/conftool/dbconfig/20240501-084436-marostegui.json
  • 08:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS bookworm
  • 08:41 marostegui@cumin1002: dbctl commit (dc=all): 'db2164 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61554 and previous config saved to /var/cache/conftool/dbconfig/20240501-084116-root.json
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61553 and previous config saved to /var/cache/conftool/dbconfig/20240501-083717-root.json
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61552 and previous config saved to /var/cache/conftool/dbconfig/20240501-083641-root.json
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Push es7 eqiad config T355285', diff saved to https://phabricator.wikimedia.org/P61551 and previous config saved to /var/cache/conftool/dbconfig/20240501-083120-marostegui.json
  • 08:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T361627)', diff saved to https://phabricator.wikimedia.org/P61550 and previous config saved to /var/cache/conftool/dbconfig/20240501-082928-marostegui.json
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T361627)', diff saved to https://phabricator.wikimedia.org/P61549 and previous config saved to /var/cache/conftool/dbconfig/20240501-082357-marostegui.json
  • 08:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T361627)', diff saved to https://phabricator.wikimedia.org/P61548 and previous config saved to /var/cache/conftool/dbconfig/20240501-082334-marostegui.json
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61547 and previous config saved to /var/cache/conftool/dbconfig/20240501-082211-root.json
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61546 and previous config saved to /var/cache/conftool/dbconfig/20240501-082135-root.json
  • 08:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
  • 08:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P61545 and previous config saved to /var/cache/conftool/dbconfig/20240501-080827-marostegui.json
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61544 and previous config saved to /var/cache/conftool/dbconfig/20240501-080706-root.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61543 and previous config saved to /var/cache/conftool/dbconfig/20240501-080630-root.json
  • 08:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 08:05 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61542 and previous config saved to /var/cache/conftool/dbconfig/20240501-080354-root.json
  • 07:59 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS bookworm
  • 07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2164', diff saved to https://phabricator.wikimedia.org/P61541 and previous config saved to /var/cache/conftool/dbconfig/20240501-075614-root.json
  • 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P61540 and previous config saved to /var/cache/conftool/dbconfig/20240501-075320-marostegui.json
  • 07:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61539 and previous config saved to /var/cache/conftool/dbconfig/20240501-075200-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61538 and previous config saved to /var/cache/conftool/dbconfig/20240501-075124-root.json
  • 07:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61537 and previous config saved to /var/cache/conftool/dbconfig/20240501-074848-root.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T361627)', diff saved to https://phabricator.wikimedia.org/P61536 and previous config saved to /var/cache/conftool/dbconfig/20240501-073812-marostegui.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61535 and previous config saved to /var/cache/conftool/dbconfig/20240501-073655-root.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61534 and previous config saved to /var/cache/conftool/dbconfig/20240501-073615-root.json
  • 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61533 and previous config saved to /var/cache/conftool/dbconfig/20240501-073342-root.json
  • 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T361627)', diff saved to https://phabricator.wikimedia.org/P61532 and previous config saved to /var/cache/conftool/dbconfig/20240501-073201-marostegui.json
  • 07:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T361627)', diff saved to https://phabricator.wikimedia.org/P61531 and previous config saved to /var/cache/conftool/dbconfig/20240501-073123-marostegui.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1186 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61530 and previous config saved to /var/cache/conftool/dbconfig/20240501-072149-root.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61529 and previous config saved to /var/cache/conftool/dbconfig/20240501-072110-root.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61528 and previous config saved to /var/cache/conftool/dbconfig/20240501-071836-root.json
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P61527 and previous config saved to /var/cache/conftool/dbconfig/20240501-071615-marostegui.json
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61526 and previous config saved to /var/cache/conftool/dbconfig/20240501-070603-root.json
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61525 and previous config saved to /var/cache/conftool/dbconfig/20240501-070330-root.json
  • 07:02 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1186.eqiad.wmnet onto db1234.eqiad.wmnet
  • 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P61524 and previous config saved to /var/cache/conftool/dbconfig/20240501-070108-marostegui.json
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P61523 and previous config saved to /var/cache/conftool/dbconfig/20240501-065845-root.json
  • 06:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61522 and previous config saved to /var/cache/conftool/dbconfig/20240501-064824-root.json
  • 06:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T361627)', diff saved to https://phabricator.wikimedia.org/P61521 and previous config saved to /var/cache/conftool/dbconfig/20240501-064600-marostegui.json
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P61520 and previous config saved to /var/cache/conftool/dbconfig/20240501-064339-root.json
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T361627)', diff saved to https://phabricator.wikimedia.org/P61519 and previous config saved to /var/cache/conftool/dbconfig/20240501-063942-marostegui.json
  • 06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 06:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T361627)', diff saved to https://phabricator.wikimedia.org/P61518 and previous config saved to /var/cache/conftool/dbconfig/20240501-063919-marostegui.json
  • 06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS bookworm
  • 06:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61517 and previous config saved to /var/cache/conftool/dbconfig/20240501-063318-root.json
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61516 and previous config saved to /var/cache/conftool/dbconfig/20240501-062833-root.json
  • 06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P61515 and previous config saved to /var/cache/conftool/dbconfig/20240501-062407-marostegui.json
  • 06:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
  • 06:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61514 and previous config saved to /var/cache/conftool/dbconfig/20240501-061327-root.json
  • 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P61513 and previous config saved to /var/cache/conftool/dbconfig/20240501-060900-marostegui.json
  • 05:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61512 and previous config saved to /var/cache/conftool/dbconfig/20240501-055822-root.json
  • 05:58 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS bookworm
  • 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2166', diff saved to https://phabricator.wikimedia.org/P61511 and previous config saved to /var/cache/conftool/dbconfig/20240501-055657-root.json
  • 05:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T361627)', diff saved to https://phabricator.wikimedia.org/P61510 and previous config saved to /var/cache/conftool/dbconfig/20240501-055353-marostegui.json
  • 05:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T361627)', diff saved to https://phabricator.wikimedia.org/P61509 and previous config saved to /var/cache/conftool/dbconfig/20240501-054720-marostegui.json
  • 05:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 05:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 05:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T361627)', diff saved to https://phabricator.wikimedia.org/P61508 and previous config saved to /var/cache/conftool/dbconfig/20240501-054657-marostegui.json
  • 05:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61507 and previous config saved to /var/cache/conftool/dbconfig/20240501-054316-root.json
  • 05:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es[1035,1039-1040].eqiad.wmnet with reason: Setting up T355285 T355424
  • 05:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on es[1035,1039-1040].eqiad.wmnet with reason: Setting up T355285 T355424
  • 05:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 6 hosts with reason: Setting up T355285 T355424
  • 05:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 6 hosts with reason: Setting up T355285 T355424
  • 05:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P61506 and previous config saved to /var/cache/conftool/dbconfig/20240501-053149-marostegui.json
  • 05:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1234.eqiad.wmnet with OS bookworm
  • 05:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1236.eqiad.wmnet with OS bookworm
  • 05:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61505 and previous config saved to /var/cache/conftool/dbconfig/20240501-052810-root.json
  • 05:23 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db1186.eqiad.wmnet onto db1234.eqiad.wmnet
  • 05:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1186 to clone db1234 T363890', diff saved to https://phabricator.wikimedia.org/P61504 and previous config saved to /var/cache/conftool/dbconfig/20240501-051848-marostegui.json
  • 05:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P61503 and previous config saved to /var/cache/conftool/dbconfig/20240501-051642-marostegui.json
  • 05:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
  • 05:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
  • 05:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
  • 05:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Down with HW issues
  • 05:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Down with HW issues
  • 05:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
  • 05:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T361627)', diff saved to https://phabricator.wikimedia.org/P61502 and previous config saved to /var/cache/conftool/dbconfig/20240501-050135-marostegui.json
  • 04:57 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1236.eqiad.wmnet with OS bookworm
  • 04:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1236', diff saved to https://phabricator.wikimedia.org/P61501 and previous config saved to /var/cache/conftool/dbconfig/20240501-045624-marostegui.json
  • 04:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T361627)', diff saved to https://phabricator.wikimedia.org/P61500 and previous config saved to /var/cache/conftool/dbconfig/20240501-045517-marostegui.json
  • 04:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 04:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 04:54 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1234.eqiad.wmnet with OS bookworm
  • 04:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 02:31 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs7002.magru.wmnet with OS bullseye
  • 02:31 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 02:29 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 02:07 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs7002.magru.wmnet with reason: host reimage
  • 02:04 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs7002.magru.wmnet with reason: host reimage
  • 01:37 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host lvs7002.magru.wmnet with OS bullseye
  • 01:26 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs7001.magru.wmnet with OS bullseye
  • 01:26 sukhe@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 01:25 sukhe@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin1002"
  • 01:02 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs7001.magru.wmnet with reason: host reimage
  • 00:58 sukhe@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs7001.magru.wmnet with reason: host reimage
  • 00:33 sukhe@cumin1002: START - Cookbook sre.hosts.reimage for host lvs7001.magru.wmnet with OS bullseye
  • 00:23 xcollazo@deploy1002: Finished deploy [airflow-dags/analytics@b10376a]: (no justification provided) (duration: 00m 31s)
  • 00:22 xcollazo@deploy1002: Started deploy [airflow-dags/analytics@b10376a]: (no justification provided)
  • 00:05 eileen: civicrm upgraded from 393e1deb to 3ac4043

Archives

See Server Admin Log/Archives.