Server Admin Log
2025-11-07
- 17:59 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on moss-fe1002.eqiad.wmnet with reason: C/D Migration
- 17:57 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on kafka-logging1003.eqiad.wmnet with reason: C/D Migration
- 17:57 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cp1114.eqiad.wmnet with reason: C/D Migration
- 17:55 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-be1067.eqiad.wmnet with reason: C/D Migration
- 17:53 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1054.eqiad.wmnet with reason: C/D Migration
- 17:53 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1053.eqiad.wmnet with reason: C/D Migration
- 17:52 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy7002.magru.wmnet with OS trixie
- 17:51 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cp1113.eqiad.wmnet with reason: C/D Migration
- 17:49 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy7001.magru.wmnet with OS trixie
- 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P85108 and previous config saved to /var/cache/conftool/dbconfig/20251107-174946-marostegui.json
- 17:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on thanos-fe1007.eqiad.wmnet with reason: C/D Migration
- 17:48 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy3002.esams.wmnet with OS trixie
- 17:47 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1052.eqiad.wmnet with reason: C/D Migration
- 17:45 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1051.eqiad.wmnet with reason: C/D Migration
- 17:44 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1230.eqiad.wmnet with reason: C/D Migration
- 17:43 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1229.eqiad.wmnet with reason: C/D Migration
- 17:41 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cirrussearch1103.eqiad.wmnet with reason: C/D Migration
- 17:40 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1134.eqiad.wmnet with reason: C/D Migration
- 17:39 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1138.eqiad.wmnet with reason: C/D Migration
- 17:37 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy7002.magru.wmnet with reason: host reimage
- 17:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P85107 and previous config saved to /var/cache/conftool/dbconfig/20251107-173439-marostegui.json
- 17:31 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy7001.magru.wmnet with reason: host reimage
- 17:30 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1137.eqiad.wmnet with reason: C/D Migration
- 17:29 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1136.eqiad.wmnet with reason: C/D Migration
- 17:29 cdanis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy7002.magru.wmnet with reason: host reimage
- 17:28 cdanis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy7001.magru.wmnet with reason: host reimage
- 17:28 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy3002.esams.wmnet with reason: host reimage
- 17:26 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1135.eqiad.wmnet with reason: C/D Migration
- 17:24 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on kubestage1004.eqiad.wmnet with reason: C/D Migration
- 17:23 cdanis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy3002.esams.wmnet with reason: host reimage
- 17:22 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1247.eqiad.wmnet with reason: C/D Migration
- 17:21 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudlb2002-dev.codfw.wmnet with reason: host reimage
- 17:21 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudlb2002-dev.codfw.wmnet with reason: host reimage
- 17:21 robh: eqiad d2 network migrations done for today, moving on to d3
- 17:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T407997)', diff saved to https://phabricator.wikimedia.org/P85106 and previous config saved to /var/cache/conftool/dbconfig/20251107-171931-marostegui.json
- 17:19 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wdqs1017.eqiad.wmnet with reason: C/D Migration
- 17:18 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on relforge1009.eqiad.wmnet with reason: C/D Migration
- 17:16 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-fe1013.eqiad.wmnet with reason: C/D Migration
- 17:15 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1228.eqiad.wmnet with reason: C/D Migration
- 17:14 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1227.eqiad.wmnet with reason: C/D Migration
- 17:13 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-presto1020.eqiad.wmnet with reason: C/D Migration
- 17:10 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on druid1013.eqiad.wmnet with reason: C/D Migration
- 17:09 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-fe1020.eqiad.wmnet with reason: C/D Migration
- 17:08 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1209.eqiad.wmnet with reason: C/D Migration
- 17:06 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wdqs1022.eqiad.wmnet with reason: C/D Migration
- 17:06 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on titan1002.eqiad.wmnet with reason: C/D Migration
- 17:05 robh: eqiad d2 migrations in progress
- 17:04 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on maps1014.eqiad.wmnet with reason: C/D Migration
- 17:03 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudlb2002-dev.codfw.wmnet with OS trixie
- 17:03 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-be1091.eqiad.wmnet with reason: C/D Migration
- 17:02 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy3001.esams.wmnet with OS trixie
- 17:00 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1175.eqiad.wmnet with reason: C/D Migration
- 17:00 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
- 17:00 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 17:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2216 (T407997)', diff saved to https://phabricator.wikimedia.org/P85105 and previous config saved to /var/cache/conftool/dbconfig/20251107-170042-marostegui.json
- 17:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2216.codfw.wmnet with reason: Maintenance
- 17:00 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
- 17:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T407997)', diff saved to https://phabricator.wikimedia.org/P85104 and previous config saved to /var/cache/conftool/dbconfig/20251107-170018-marostegui.json
- 16:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:57 cdanis@cumin1003: START - Cookbook sre.hosts.reimage for host tcp-proxy7002.magru.wmnet with OS trixie
- 16:56 cdanis@cumin1003: START - Cookbook sre.hosts.reimage for host tcp-proxy7001.magru.wmnet with OS trixie
- 16:56 cdanis@cumin1003: START - Cookbook sre.hosts.reimage for host tcp-proxy3002.esams.wmnet with OS trixie
- 16:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cp1112.eqiad.wmnet with reason: C/D Migration
- 16:47 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy3001.esams.wmnet with reason: host reimage
- 16:46 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on bast1003.wikimedia.org with reason: C/D Migration
- 16:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P85103 and previous config saved to /var/cache/conftool/dbconfig/20251107-164510-marostegui.json
- 16:44 cdanis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy3001.esams.wmnet with reason: host reimage
- 16:43 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1182.eqiad.wmnet with reason: C/D Migration
- 16:42 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1172.eqiad.wmnet with reason: C/D Migration
- 16:41 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1253.eqiad.wmnet with reason: C/D Migration
- 16:39 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1263.eqiad.wmnet with reason: C/D Migration
- 16:39 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on es1051.eqiad.wmnet with reason: C/D Migration
- 16:37 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on rdb1012.eqiad.wmnet with reason: C/D Migration
- 16:36 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS bookworm
- 16:34 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS bookworm
- 16:33 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on aqs1014.eqiad.wmnet with reason: C/D Migration
- 16:32 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on restbase1033.eqiad.wmnet with reason: C/D Migration
- 16:32 robh: eqiad row C migrations complete for today, moving on to row D, D1 to start
- 16:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P85102 and previous config saved to /var/cache/conftool/dbconfig/20251107-163003-marostegui.json
- 16:28 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mwlog1002.eqiad.wmnet with reason: C/D Migration
- 16:27 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1171.eqiad.wmnet with reason: C/D Migration
- 16:26 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1170.eqiad.wmnet with reason: C/D Migration
- 16:25 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1220.eqiad.wmnet with reason: C/D Migration
- 16:24 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1219.eqiad.wmnet with reason: C/D Migration
- 16:23 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on dbproxy1024.eqiad.wmnet with reason: C/D Migration
- 16:21 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on pki1002.eqiad.wmnet with reason: C/D Migration
- 16:20 cdanis@cumin1003: START - Cookbook sre.hosts.reimage for host tcp-proxy3001.esams.wmnet with OS trixie
- 16:20 robh: eqiad c/d migration now working rack c6
- 16:19 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1231.eqiad.wmnet with reason: C/D Migration
- 16:18 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1245.eqiad.wmnet with reason: C/D Migration
- 16:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1244.eqiad.wmnet with reason: C/D Migration
- 16:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T407997)', diff saved to https://phabricator.wikimedia.org/P85101 and previous config saved to /var/cache/conftool/dbconfig/20251107-161455-marostegui.json
- 16:10 robh: eqiad c3 network migrations complete for today, moving on to the next rack
- 16:09 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy3001.esams.wmnet with OS trixie
- 16:08 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:08 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on krb1002.eqiad.wmnet with reason: C/D Migration
- 16:06 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on dbproxy1029.eqiad.wmnet with reason: C/D Migration
- 16:05 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 16:04 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:03 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1150.eqiad.wmnet with reason: C/D Migration
- 16:03 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1252.eqiad.wmnet with reason: C/D Migration
- 16:02 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 16:01 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:59 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-conf1006.eqiad.wmnet with reason: C/D Migration
- 15:59 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 15:58 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1230.eqiad.wmnet with reason: C/D Migration
- 15:58 dzahn@dns1004: END - running authdns-update
- 15:57 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1167.eqiad.wmnet with reason: C/D Migration
- 15:57 dzahn@dns1004: START - running authdns-update
- 15:57 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1166.eqiad.wmnet with reason: C/D Migration
- 15:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2212 (T407997)', diff saved to https://phabricator.wikimedia.org/P85099 and previous config saved to /var/cache/conftool/dbconfig/20251107-155605-marostegui.json
- 15:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2212.codfw.wmnet with reason: Maintenance
- 15:54 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1180.eqiad.wmnet with reason: C/D Migration
- 15:53 robh: eqiad C3 switch migrations in progress
- 15:52 robh: eqiad C2 switch migrations in progress
- 15:52 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy3001.esams.wmnet with reason: host reimage
- 15:52 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on es1057.eqiad.wmnet with reason: C/D Migration
- 15:51 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1243.eqiad.wmnet with reason: C/D Migration
- 15:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1242.eqiad.wmnet with reason: C/D Migration
- 15:48 cdanis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy3001.esams.wmnet with reason: host reimage
- 15:47 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc-gp1005.eqiad.wmnet with reason: C/D Migration
- 15:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS trixie
- 15:44 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1132.eqiad.wmnet with reason: C/D Migration
- 15:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2202.codfw.wmnet with reason: Maintenance
- 15:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T407997)', diff saved to https://phabricator.wikimedia.org/P85098 and previous config saved to /var/cache/conftool/dbconfig/20251107-153957-marostegui.json
- 15:38 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1264.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:33 dpogorzelski@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=ml-serve1012.eqiad.wmnet,dc=eqiad,cluster=ml_serve,service=kubesvc
- 15:28 taavi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:28 taavi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add x1/x4 wiki replicas cloudlb addresses - taavi@cumin1003"
- 15:28 taavi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add x1/x4 wiki replicas cloudlb addresses - taavi@cumin1003"
- 15:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P85097 and previous config saved to /var/cache/conftool/dbconfig/20251107-152449-marostegui.json
- 15:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 15:23 taavi@cumin1003: START - Cookbook sre.dns.netbox
- 15:19 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 15:12 cdanis@cumin1003: START - Cookbook sre.hosts.reimage for host tcp-proxy3001.esams.wmnet with OS trixie
- 15:10 cdanis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host tcp-proxy3001.esams.wmnet with OS trixie
- 15:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P85096 and previous config saved to /var/cache/conftool/dbconfig/20251107-150941-marostegui.json
- 15:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host db1264.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:07 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 15:05 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 15:04 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:04 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 15:02 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 15:02 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1264.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 14:59 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
- 14:59 cdanis@cumin1003: START - Cookbook sre.hosts.reimage for host tcp-proxy3001.esams.wmnet with OS trixie
- 14:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host db1264.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T407997)', diff saved to https://phabricator.wikimedia.org/P85095 and previous config saved to /var/cache/conftool/dbconfig/20251107-145434-marostegui.json
- 14:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 14:49 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 14:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 14:44 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 14:42 andrew@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcontrol1008-dev.eqiad.wmnet']
- 14:42 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcontrol1008-dev.eqiad.wmnet']
- 14:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 14:39 andrew@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcontrol1008-dev.eqiad.wmnet']
- 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2188 (T407997)', diff saved to https://phabricator.wikimedia.org/P85093 and previous config saved to /var/cache/conftool/dbconfig/20251107-143657-marostegui.json
- 14:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2188.codfw.wmnet with reason: Maintenance
- 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T407997)', diff saved to https://phabricator.wikimedia.org/P85092 and previous config saved to /var/cache/conftool/dbconfig/20251107-143633-marostegui.json
- 14:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P85091 and previous config saved to /var/cache/conftool/dbconfig/20251107-142125-marostegui.json
- 14:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P85090 and previous config saved to /var/cache/conftool/dbconfig/20251107-140619-marostegui.json
- 13:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T407997)', diff saved to https://phabricator.wikimedia.org/P85089 and previous config saved to /var/cache/conftool/dbconfig/20251107-135111-marostegui.json
- 13:47 dpogorzelski@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=ml-serve1011.eqiad.wmnet,dc=eqiad,cluster=ml_serve,service=kubesvc
- 13:46 marostegui: Deploy schema change on x1 codfw master with replication T409539
- 13:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2176 (T407997)', diff saved to https://phabricator.wikimedia.org/P85088 and previous config saved to /var/cache/conftool/dbconfig/20251107-133002-marostegui.json
- 13:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 13:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T407997)', diff saved to https://phabricator.wikimedia.org/P85087 and previous config saved to /var/cache/conftool/dbconfig/20251107-132938-marostegui.json
- 13:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P85086 and previous config saved to /var/cache/conftool/dbconfig/20251107-131431-marostegui.json
- 13:12 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps-test2006.codfw.wmnet
- 13:05 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 13:01 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 12:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P85085 and previous config saved to /var/cache/conftool/dbconfig/20251107-125923-marostegui.json
- 12:58 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 12:45 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps-test2006.codfw.wmnet
- 12:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T407997)', diff saved to https://phabricator.wikimedia.org/P85084 and previous config saved to /var/cache/conftool/dbconfig/20251107-124415-marostegui.json
- 12:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2174 (T407997)', diff saved to https://phabricator.wikimedia.org/P85083 and previous config saved to /var/cache/conftool/dbconfig/20251107-122347-marostegui.json
- 12:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 12:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T407997)', diff saved to https://phabricator.wikimedia.org/P85082 and previous config saved to /var/cache/conftool/dbconfig/20251107-122324-marostegui.json
- 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps-test2005.codfw.wmnet
- 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 12:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P85081 and previous config saved to /var/cache/conftool/dbconfig/20251107-120816-marostegui.json
- 11:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P85079 and previous config saved to /var/cache/conftool/dbconfig/20251107-115309-marostegui.json
- 11:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T407997)', diff saved to https://phabricator.wikimedia.org/P85078 and previous config saved to /var/cache/conftool/dbconfig/20251107-113801-marostegui.json
- 11:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 11:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2173 (T407997)', diff saved to https://phabricator.wikimedia.org/P85077 and previous config saved to /var/cache/conftool/dbconfig/20251107-111737-marostegui.json
- 11:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 11:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T407997)', diff saved to https://phabricator.wikimedia.org/P85076 and previous config saved to /var/cache/conftool/dbconfig/20251107-111712-marostegui.json
- 11:15 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 11:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps-test2005.codfw.wmnet
- 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 11:05 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
- 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps-test2004.codfw.wmnet
- 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 11:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P85075 and previous config saved to /var/cache/conftool/dbconfig/20251107-110204-marostegui.json
- 10:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:53 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P85074 and previous config saved to /var/cache/conftool/dbconfig/20251107-104657-marostegui.json
- 10:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:45 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps-test2004.codfw.wmnet
- 10:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:43 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2093.codfw.wmnet with OS bullseye
- 10:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps-test2003.codfw.wmnet
- 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:35 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2091.codfw.wmnet with OS bullseye
- 10:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T407997)', diff saved to https://phabricator.wikimedia.org/P85073 and previous config saved to /var/cache/conftool/dbconfig/20251107-103149-marostegui.json
- 10:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:29 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2090.codfw.wmnet with OS bullseye
- 10:29 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 10:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:22 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps-test2003.codfw.wmnet
- 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps-test2002.codfw.wmnet
- 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:18 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2170 (T407997)', diff saved to https://phabricator.wikimedia.org/P85072 and previous config saved to /var/cache/conftool/dbconfig/20251107-101126-marostegui.json
- 10:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T407997)', diff saved to https://phabricator.wikimedia.org/P85071 and previous config saved to /var/cache/conftool/dbconfig/20251107-101102-marostegui.json
- 10:09 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 10:07 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps-test2002.codfw.wmnet
- 10:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 10:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2164 gradually with 4 steps - Migration of db2164.codfw.wmnet completed
- 09:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P85069 and previous config saved to /var/cache/conftool/dbconfig/20251107-095555-marostegui.json
- 09:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P85067 and previous config saved to /var/cache/conftool/dbconfig/20251107-094047-marostegui.json
- 09:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T407997)', diff saved to https://phabricator.wikimedia.org/P85065 and previous config saved to /var/cache/conftool/dbconfig/20251107-092539-marostegui.json
- 09:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:17 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2164 gradually with 4 steps - Migration of db2164.codfw.wmnet completed
- 09:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2153 (T407997)', diff saved to https://phabricator.wikimedia.org/P85063 and previous config saved to /var/cache/conftool/dbconfig/20251107-090521-marostegui.json
- 09:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 09:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T407997)', diff saved to https://phabricator.wikimedia.org/P85062 and previous config saved to /var/cache/conftool/dbconfig/20251107-090457-marostegui.json
- 08:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 08:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host maps-test2001.codfw.wmnet with OS bookworm
- 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P85061 and previous config saved to /var/cache/conftool/dbconfig/20251107-084949-marostegui.json
- 08:44 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts dbprov1003.eqiad.wmnet
- 08:44 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:44 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 08:40 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 08:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P85060 and previous config saved to /var/cache/conftool/dbconfig/20251107-083442-marostegui.json
- 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on maps-test2001.codfw.wmnet with reason: host reimage
- 08:32 jynus@cumin1003: START - Cookbook sre.dns.netbox
- 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 08:28 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts dbprov1003.eqiad.wmnet
- 08:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on maps-test2001.codfw.wmnet with reason: host reimage
- 08:27 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2164 - Upgrading db2164.codfw.wmnet
- 08:27 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2164 - Upgrading db2164.codfw.wmnet
- 08:27 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 08:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T407997)', diff saved to https://phabricator.wikimedia.org/P85058 and previous config saved to /var/cache/conftool/dbconfig/20251107-081934-marostegui.json
- 08:06 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host maps-test2001.codfw.wmnet with OS bookworm
- 08:00 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbprov2003.codfw.wmnet
- 08:00 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:00 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 07:59 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 07:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2146 (T407997)', diff saved to https://phabricator.wikimedia.org/P85057 and previous config saved to /var/cache/conftool/dbconfig/20251107-075857-marostegui.json
- 07:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 07:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T407997)', diff saved to https://phabricator.wikimedia.org/P85056 and previous config saved to /var/cache/conftool/dbconfig/20251107-075833-marostegui.json
- 07:50 jynus@cumin1003: START - Cookbook sre.dns.netbox
- 07:45 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts dbprov2003.codfw.wmnet
- 07:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P85055 and previous config saved to /var/cache/conftool/dbconfig/20251107-074326-marostegui.json
- 07:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P85054 and previous config saved to /var/cache/conftool/dbconfig/20251107-072818-marostegui.json
- 07:27 moritzm: fix failed logrotation on install1005
- 07:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T407997)', diff saved to https://phabricator.wikimedia.org/P85053 and previous config saved to /var/cache/conftool/dbconfig/20251107-071310-marostegui.json
- 06:52 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Rate-limit by wmfuniq fix, conftool 6 - oblivian@cumin1003"
- 06:52 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Rate-limit by wmfuniq fix, conftool 6 - oblivian@cumin1003
- 06:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2145 (T407997)', diff saved to https://phabricator.wikimedia.org/P85052 and previous config saved to /var/cache/conftool/dbconfig/20251107-065226-marostegui.json
- 06:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 06:51 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Rate-limit by wmfuniq fix, conftool 6 - oblivian@cumin1003
- 06:51 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Rate-limit by wmfuniq fix, conftool 6 - oblivian@cumin1003"
- 06:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 03:06 ryankemper@cumin1002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 03:05 tstarling@deploy2002: Finished scap sync-world: Backport for Add English translations to namespaces that lack them (T407127), Set robot noindex policy for draft namespaces that lacked it (T407127) (duration: 09m 58s)
- 02:58 tstarling@deploy2002: tstarling: Continuing with sync
- 02:57 tstarling@deploy2002: tstarling: Backport for Add English translations to namespaces that lack them (T407127), Set robot noindex policy for draft namespaces that lacked it (T407127) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 02:55 tstarling@deploy2002: Started scap sync-world: Backport for Add English translations to namespaces that lack them (T407127), Set robot noindex policy for draft namespaces that lacked it (T407127)
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 34s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-06
- 23:56 zabe@deploy2002: Finished scap sync-world: Backport for Update for new WikimediaMaintenance script locations (duration: 07m 15s)
- 23:51 zabe@deploy2002: zabe: Continuing with sync
- 23:51 zabe@deploy2002: zabe: Backport for Update for new WikimediaMaintenance script locations synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:48 zabe@deploy2002: Started scap sync-world: Backport for Update for new WikimediaMaintenance script locations
- 23:44 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
- 23:43 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
- 23:13 cjming: end of UTC late backport window
- 23:11 cjming@deploy2002: Finished scap sync-world: Backport for Use wikimedia.org as the "server" for the wiki-agnostic RESTbase specs, Use prefixed 'sub' field in OAuth 2 access tokens on beta cluster (T399199), Re-run xLab MW Module Loaded experiment v2 (T401705) (duration: 08m 34s)
- 23:06 cjming@deploy2002: cjming, tgr, aaron: Continuing with sync
- 23:04 cjming@deploy2002: cjming, tgr, aaron: Backport for Use wikimedia.org as the "server" for the wiki-agnostic RESTbase specs, Use prefixed 'sub' field in OAuth 2 access tokens on beta cluster (T399199), Re-run xLab MW Module Loaded experiment v2 (T401705) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:02 cjming@deploy2002: Started scap sync-world: Backport for Use wikimedia.org as the "server" for the wiki-agnostic RESTbase specs, Use prefixed 'sub' field in OAuth 2 access tokens on beta cluster (T399199), Re-run xLab MW Module Loaded experiment v2 (T401705)
- 22:49 catrope@deploy2002: Finished scap sync-world: Backport for AccountRecovery: Use canonical URL in confirmation email, Enable Special:AccountRecovery everywhere (T399742) (duration: 10m 24s)
- 22:46 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 22:42 catrope@deploy2002: catrope: Continuing with sync
- 22:40 catrope@deploy2002: catrope: Backport for AccountRecovery: Use canonical URL in confirmation email, Enable Special:AccountRecovery everywhere (T399742) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:40 ryankemper@cumin1002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 22:38 catrope@deploy2002: Started scap sync-world: Backport for AccountRecovery: Use canonical URL in confirmation email, Enable Special:AccountRecovery everywhere (T399742)
- 22:36 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 22:34 kemayo@deploy2002: Finished scap sync-world: Backport for Edit check: allow any check to be an a/b test including default ones (T406134), Enable editcheck addReference a/b test on enwiki (T406134) (duration: 13m 52s)
- 22:27 kemayo@deploy2002: kemayo: Continuing with sync
- 22:24 kemayo@deploy2002: kemayo: Backport for Edit check: allow any check to be an a/b test including default ones (T406134), Enable editcheck addReference a/b test on enwiki (T406134) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:20 kemayo@deploy2002: Started scap sync-world: Backport for Edit check: allow any check to be an a/b test including default ones (T406134), Enable editcheck addReference a/b test on enwiki (T406134)
- 21:59 ladsgroup@deploy2002: Finished scap sync-world: Backport for Revert "BacklinkCache: Switch order between pr_cascade and links queries", Revert "RestrictionStore: Switch order between pr_cascade and links queries" (duration: 55m 26s)
- 21:55 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:55 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:38 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 21:29 ladsgroup@deploy2002: ladsgroup: Backport for Revert "BacklinkCache: Switch order between pr_cascade and links queries", Revert "RestrictionStore: Switch order between pr_cascade and links queries" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:13 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:13 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added VIP for tcpproxy service in eqiad - dzahn@cumin2002"
- 21:13 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added VIP for tcpproxy service in eqiad - dzahn@cumin2002"
- 21:08 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 21:07 dzahn@dns1004: END - running authdns-update
- 21:04 ladsgroup@deploy2002: Started scap sync-world: Backport for Revert "BacklinkCache: Switch order between pr_cascade and links queries", Revert "RestrictionStore: Switch order between pr_cascade and links queries"
- 21:03 dzahn@dns1004: START - running authdns-update
- 20:57 eileen: civicrm upgraded from 75455a21 to 0f69c4eb
- 20:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T407997)', diff saved to https://phabricator.wikimedia.org/P85050 and previous config saved to /var/cache/conftool/dbconfig/20251106-204120-marostegui.json
- 20:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P85049 and previous config saved to /var/cache/conftool/dbconfig/20251106-202612-marostegui.json
- 20:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P85048 and previous config saved to /var/cache/conftool/dbconfig/20251106-201105-marostegui.json
- 19:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T407997)', diff saved to https://phabricator.wikimedia.org/P85047 and previous config saved to /var/cache/conftool/dbconfig/20251106-195557-marostegui.json
- 19:55 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2092.codfw.wmnet with OS bullseye
- 19:55 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 19:53 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 19:44 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcontrol1008-dev.eqiad.wmnet']
- 19:43 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 19:39 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2093.codfw.wmnet with reason: host reimage
- 19:39 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on lsw1-d6-eqiad,lsw1-d6-eqiad IPv6,lsw1-d6-eqiad.mgmt with reason: told switch to reboot and it's stuck in UEFI shell
- 19:37 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2092.codfw.wmnet with reason: host reimage
- 19:37 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1251 (T407997)', diff saved to https://phabricator.wikimedia.org/P85046 and previous config saved to /var/cache/conftool/dbconfig/20251106-193705-marostegui.json
- 19:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1251.eqiad.wmnet with reason: Maintenance
- 19:34 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2093.codfw.wmnet with reason: host reimage
- 19:34 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2092.codfw.wmnet with reason: host reimage
- 19:33 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2091.codfw.wmnet with reason: host reimage
- 19:31 swfrench-wmf: rolling run-puppet-agent on A:cp hosts for haproxy config change
- 19:29 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2091.codfw.wmnet with reason: host reimage
- 19:27 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2090.codfw.wmnet with reason: host reimage
- 19:21 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2090.codfw.wmnet with reason: host reimage
- 19:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 19:19 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 19:18 swfrench-wmf: disable-puppet on A:cp hosts for haproxy config change
- 19:15 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.1 refs T408271
- 19:06 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wdqs1013.eqiad.wmnet with reason: C/D Migration
- 19:05 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wcqs1003.eqiad.wmnet with reason: C/D Migration
- 19:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 19:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T407997)', diff saved to https://phabricator.wikimedia.org/P85045 and previous config saved to /var/cache/conftool/dbconfig/20251106-190506-marostegui.json
- 19:02 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on puppetserver1001.eqiad.wmnet with reason: C/D Migration
- 18:57 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-test-worker1002.eqiad.wmnet with reason: C/D Migration
- 18:55 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on sessionstore1005.eqiad.wmnet with reason: C/D Migration
- 18:53 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on es1045.eqiad.wmnet with reason: C/D Migration
- 18:52 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2093.codfw.wmnet with OS bullseye
- 18:52 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2092.codfw.wmnet with OS bullseye
- 18:52 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2091.codfw.wmnet with OS bullseye
- 18:51 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2090.codfw.wmnet with OS bullseye
- 18:51 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1262.eqiad.wmnet with reason: C/D Migration
- 18:51 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-be2093']
- 18:51 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-be2092']
- 18:51 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-be2091']
- 18:50 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be2093']
- 18:50 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-be2090']
- 18:50 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be2092']
- 18:50 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be2091']
- 18:50 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be2090']
- 18:50 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1218.eqiad.wmnet with reason: C/D Migration
- 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P85044 and previous config saved to /var/cache/conftool/dbconfig/20251106-184958-marostegui.json
- 18:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1217.eqiad.wmnet with reason: C/D Migration
- 18:46 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1169.eqiad.wmnet with reason: C/D Migration
- 18:44 robh: C5 eqiad c/d server switch migrations in progress
- 18:44 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1168.eqiad.wmnet with reason: C/D Migration
- 18:43 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2093.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:43 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2090.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:43 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2092.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:42 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2091.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:41 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on aqs1018.eqiad.wmnet with reason: C/D Migration
- 18:38 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on krb1002.eqiad.wmnet with reason: C/D Migration
- 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P85043 and previous config saved to /var/cache/conftool/dbconfig/20251106-183452-marostegui.json
- 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-wikifunctions: apply
- 18:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-wikifunctions: apply
- 18:28 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1048.eqiad.wmnet with reason: C/D Migration
- 18:27 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1047.eqiad.wmnet with reason: C/D Migration
- 18:27 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2093.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:26 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2092.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:26 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1046.eqiad.wmnet with reason: C/D Migration
- 18:25 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2091.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:25 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2090.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:24 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1045.eqiad.wmnet with reason: C/D Migration
- 18:23 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: apply
- 18:22 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: apply
- 18:20 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on kafka-logging1002.eqiad.wmnet with reason: C/D Migration
- 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T407997)', diff saved to https://phabricator.wikimedia.org/P85042 and previous config saved to /var/cache/conftool/dbconfig/20251106-181944-marostegui.json
- 18:18 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on maps1013.eqiad.wmnet with reason: C/D Migration
- 18:18 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:18 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:18 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:17 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on druid1012.eqiad.wmnet with reason: C/D Migration
- 18:16 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:15 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:15 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:15 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:15 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1234.eqiad.wmnet with reason: C/D Migration
- 18:14 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1221.eqiad.wmnet with reason: C/D Migration
- 18:12 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1220.eqiad.wmnet with reason: C/D Migration
- 18:11 swfrench@deploy2002: Stopping before sync operations
- 18:10 swfrench@deploy2002: Started scap sync-world: No-deployment scap run to switch mw-wikifunctions to PHP 8.3 - T405955
- 18:10 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 18:10 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 18:10 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:10 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:09 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 18:09 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:09 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1132.eqiad.wmnet with reason: C/D Migration
- 18:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:09 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 18:09 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:06 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:06 robh: Rack C2 C/D switch migrations in progress
- 18:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:05 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-be1092.eqiad.wmnet with reason: C/D Migration
- 18:04 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-be1086.eqiad.wmnet with reason: C/D Migration
- 18:02 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 18:02 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 18:01 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-be1066.eqiad.wmnet with reason: C/D Migration
- 18:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1235 (T407997)', diff saved to https://phabricator.wikimedia.org/P85041 and previous config saved to /var/cache/conftool/dbconfig/20251106-180052-marostegui.json
- 18:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
- 18:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T407997)', diff saved to https://phabricator.wikimedia.org/P85040 and previous config saved to /var/cache/conftool/dbconfig/20251106-180028-marostegui.json
- 17:53 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on apus-fe1003.eqiad.wmnet with reason: C/D Migration
- 17:51 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 17:51 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2154 gradually with 4 steps - Migration of db2154.codfw.wmnet completed
- 17:50 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts people2003.codfw.wmnet
- 17:47 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2091.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:46 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2092.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:46 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2093.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:46 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2090.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P85038 and previous config saved to /var/cache/conftool/dbconfig/20251106-174521-marostegui.json
- 17:42 dzahn@cumin2002: START - Cookbook sre.hosts.decommission for hosts people2003.codfw.wmnet
- 17:39 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on people1004.eqiad.wmnet with reason: decom
- 17:39 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on people2003.codfw.wmnet with reason: decom
- 17:38 mutante: shutting down people1004 and people2003 - had already shut them down on Oct 29 but someone or something booted them again T408713
- 17:37 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cp1110.eqiad.wmnet with reason: C/D Migration
- 17:33 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on mc1050.eqiad.wmnet with reason: C/D Migration
- 17:31 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on mc1049.eqiad.wmnet with reason: C/D Migration
- 17:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P85036 and previous config saved to /var/cache/conftool/dbconfig/20251106-173013-marostegui.json
- 17:28 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on wdqs1014.eqiad.wmnet with reason: C/D Migration
- 17:25 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on relforge1008.eqiad.wmnet with reason: C/D Migration
- 17:23 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2094.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:21 robh: multiple moves from C/D per T405942
- 17:19 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1119.eqiad.wmnet with reason: C/D Migration
- 17:19 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1118.eqiad.wmnet with reason: C/D Migration
- 17:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1081.eqiad.wmnet with reason: C/D Migration
- 17:16 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1080.eqiad.wmnet with reason: C/D Migration
- 17:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T407997)', diff saved to https://phabricator.wikimedia.org/P85034 and previous config saved to /var/cache/conftool/dbconfig/20251106-171505-marostegui.json
- 17:14 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on ml-cache1002.eqiad.wmnet with reason: C/D Migration
- 17:13 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2094.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:12 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2093.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:12 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1223.eqiad.wmnet with reason: C/D Migration
- 17:12 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2092.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:11 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2091.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:11 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2090.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:10 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1222.eqiad.wmnet with reason: C/D Migration
- 17:09 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2094
- 17:08 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2094
- 17:08 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2093
- 17:08 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2093
- 17:08 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2092
- 17:08 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2092
- 17:08 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2091
- 17:08 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2091
- 17:08 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2090
- 17:07 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2090
- 17:07 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:07 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2090-4 to codfw - jhancock@cumin1003"
- 17:07 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2090-4 to codfw - jhancock@cumin1003"
- 17:06 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2154 gradually with 4 steps - Migration of db2154.codfw.wmnet completed
- 17:03 jhancock@cumin1003: START - Cookbook sre.dns.netbox
- 17:01 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-presto1019.eqiad.wmnet with reason: C/D Migration
- 16:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1234 (T407997)', diff saved to https://phabricator.wikimedia.org/P85032 and previous config saved to /var/cache/conftool/dbconfig/20251106-165631-marostegui.json
- 16:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1234.eqiad.wmnet with reason: Maintenance
- 16:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T407997)', diff saved to https://phabricator.wikimedia.org/P85031 and previous config saved to /var/cache/conftool/dbconfig/20251106-165607-marostegui.json
- 16:55 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on ms-fe1011.eqiad.wmnet with reason: C/D Migration
- 16:52 jynus: drop backup grants from m* section primaries T403166
- 16:52 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on prometheus1008.eqiad.wmnet with reason: C/D Migration
- 16:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1226.eqiad.wmnet with reason: C/D Migration
- 16:47 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1225.eqiad.wmnet with reason: C/D Migration
- 16:45 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1224.eqiad.wmnet with reason: C/D Migration
- 16:43 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1180.eqiad.wmnet with reason: C/D Migration
- 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P85030 and previous config saved to /var/cache/conftool/dbconfig/20251106-164100-marostegui.json
- 16:40 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1151.eqiad.wmnet with reason: C/D Migration
- 16:39 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2154 - Upgrading db2154.codfw.wmnet
- 16:38 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2154 - Upgrading db2154.codfw.wmnet
- 16:38 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 16:37 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1133.eqiad.wmnet with reason: C/D Migration
- 16:35 urbanecm@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
- 16:34 urbanecm@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
- 16:32 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on ms-fe1019.eqiad.wmnet with reason: C/D Migration
- 16:28 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on ms-be1082.eqiad.wmnet with reason: C/D Migration
- 16:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P85028 and previous config saved to /var/cache/conftool/dbconfig/20251106-162552-marostegui.json
- 16:23 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on dbprov1003.eqiad.wmnet with reason: C/D Migration
- 16:21 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1088.eqiad.wmnet with reason: C/D Migration
- 16:18 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1087.eqiad.wmnet with reason: C/D Migration
- 16:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1083.eqiad.wmnet with reason: C/D Migration
- 16:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1083.eqiad.wmnet with reason: C/D Migration
- 16:16 moritzm: installing sysstat security updates
- 16:16 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists boardvotetest and boardvote2007_test; (T297297)
- 16:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T407997)', diff saved to https://phabricator.wikimedia.org/P85027 and previous config saved to /var/cache/conftool/dbconfig/20251106-161045-marostegui.json
- 16:09 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1082.eqiad.wmnet with reason: C/D Migration
- 16:04 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cp1109.eqiad.wmnet with reason: C/D Migration
- 16:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 16:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2163 gradually with 4 steps - Migration of db2163.codfw.wmnet completed
- 16:02 ejegg: payments-wiki upgraded from c2a4b377 to 1d4b0d2a
- 15:55 dancy@deploy2002: Installation of scap version "4.225.0" completed for 2 hosts
- 15:53 dancy@deploy2002: Installing scap version "4.225.0" for 2 host(s)
- 15:52 robh: cp1108 moving as part of migration
- 15:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1232 (T407997)', diff saved to https://phabricator.wikimedia.org/P85025 and previous config saved to /var/cache/conftool/dbconfig/20251106-155207-marostegui.json
- 15:52 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cp1108.eqiad.wmnet with reason: C/D Migration
- 15:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1232.eqiad.wmnet with reason: Maintenance
- 15:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T407997)', diff saved to https://phabricator.wikimedia.org/P85024 and previous config saved to /var/cache/conftool/dbconfig/20251106-155143-marostegui.json
- 15:49 jynus: drop grants for dbprov1003 & dbprov2003 T403166
- 15:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P85022 and previous config saved to /var/cache/conftool/dbconfig/20251106-153636-marostegui.json
- 15:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P85019 and previous config saved to /var/cache/conftool/dbconfig/20251106-152129-marostegui.json
- 15:18 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2163 gradually with 4 steps - Migration of db2163.codfw.wmnet completed
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c1a-eqiad
- 15:10 jmm@cumin2002: START - Cookbook sre.network.tls for network device fasw2-c1a-eqiad
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-eqsin
- 15:10 jmm@cumin2002: START - Cookbook sre.network.tls for network device cr2-eqsin
- 15:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T407997)', diff saved to https://phabricator.wikimedia.org/P85017 and previous config saved to /var/cache/conftool/dbconfig/20251106-150622-marostegui.json
- 15:03 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-e1-eqiad
- 15:03 jmm@cumin2002: START - Cookbook sre.network.tls for network device ssw1-e1-eqiad
- 15:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-f1-eqiad
- 15:02 jmm@cumin2002: START - Cookbook sre.network.tls for network device ssw1-f1-eqiad
- 15:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a2-codfw
- 15:02 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-a2-codfw
- 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a3-codfw
- 15:00 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-a3-codfw
- 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a4-codfw
- 15:00 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-a4-codfw
- 14:59 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a6-codfw
- 14:59 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-a6-codfw
- 14:59 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a5-codfw
- 14:59 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-a5-codfw
- 14:57 tappof: bump space for prometheus k8s-dse in eqiad
- 14:55 Lucas_WMDE: UTC afternoon backport+config window done
- 14:55 Lucas_WMDE: Deployed security patch for T409423
- 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change dns for row c gateway interfaces eqiad CRs - cmooney@cumin1003"
- 14:54 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change dns for row c gateway interfaces eqiad CRs - cmooney@cumin1003"
- 14:53 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a8-codfw
- 14:53 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-a8-codfw
- 14:53 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a7-codfw
- 14:53 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-a7-codfw
- 14:50 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2163 - Upgrading db2163.codfw.wmnet
- 14:50 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2163 - Upgrading db2163.codfw.wmnet
- 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 14:47 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 14:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1219 (T407997)', diff saved to https://phabricator.wikimedia.org/P85015 and previous config saved to /var/cache/conftool/dbconfig/20251106-144714-marostegui.json
- 14:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 14:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T407997)', diff saved to https://phabricator.wikimedia.org/P85014 and previous config saved to /var/cache/conftool/dbconfig/20251106-144650-marostegui.json
- 14:46 logmsgbot: lucaswerkmeister-wmde Deployed security patch for T409423
- 14:40 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b2-codfw
- 14:40 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b2-codfw
- 14:40 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b5-codfw
- 14:39 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b5-codfw
- 14:39 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b3-codfw
- 14:39 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b3-codfw
- 14:37 moritzm: installing bind security updates (client-side tools/libs only)
- 14:36 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b4-codfw
- 14:36 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b4-codfw
- 14:35 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b6-codfw
- 14:35 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b6-codfw
- 14:35 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b8-codfw
- 14:35 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b8-codfw
- 14:34 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b7-codfw
- 14:34 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b7-codfw
- 14:34 lucaswerkmeister-wmde@deploy2002: mwscript-k8s job started: namespaceDupes tcywiki --fix # T328207
- 14:34 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-a1-codfw
- 14:34 elukey@cumin1003: START - Cookbook sre.network.tls for network device ssw1-a1-codfw
- 14:33 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [tcywiki] Add Portal and Draft namespaces and its talk (T409329) (duration: 08m 12s)
- 14:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 14:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2152 gradually with 4 steps - Migration of db2152.codfw.wmnet completed
- 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P85012 and previous config saved to /var/cache/conftool/dbconfig/20251106-143142-marostegui.json
- 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-a8-codfw
- 14:31 elukey@cumin1003: START - Cookbook sre.network.tls for network device ssw1-a8-codfw
- 14:29 lucaswerkmeister-wmde@deploy2002: superpes, lucaswerkmeister-wmde: Continuing with sync
- 14:27 topranks: move private1-c-eqiad sub-interface from ae3 to et-1/0/5 on cr1-eqiad (T405579)
- 14:27 lucaswerkmeister-wmde@deploy2002: superpes, lucaswerkmeister-wmde: Backport for [tcywiki] Add Portal and Draft namespaces and its talk (T409329) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:25 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [tcywiki] Add Portal and Draft namespaces and its talk (T409329)
- 14:23 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 14:23 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Update types for WatchArticleHook/UnwatchArticleHook, LQT Import: Fix quadratic time explosion in finding next offset (T405080) (duration: 07m 26s)
- 14:23 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 14:20 topranks: move private1-c-eqiad sub-interface from ae3 to et-1/0/5 on cr2-eqiad (T405579)
- 14:19 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 14:19 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 14:19 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, tchanders: Continuing with sync
- 14:18 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, tchanders: Backport for Update types for WatchArticleHook/UnwatchArticleHook, LQT Import: Fix quadratic time explosion in finding next offset (T405080) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P85010 and previous config saved to /var/cache/conftool/dbconfig/20251106-141635-marostegui.json
- 14:16 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Update types for WatchArticleHook/UnwatchArticleHook, LQT Import: Fix quadratic time explosion in finding next offset (T405080)
- 14:13 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for EventStreamConfig: Remove mediawiki.reference_previews stream (T242127), EventStreamConfig: Remove mediawiki.wikistories_* streams (T408178) (duration: 09m 12s)
- 14:09 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:09 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, phuedx: Continuing with sync
- 14:09 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:07 topranks: move public1-c-eqiad sub-interface from ae3 to et-1/0/5 on cr1-eqiad (T405579)
- 14:07 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, phuedx: Backport for EventStreamConfig: Remove mediawiki.reference_previews stream (T242127), EventStreamConfig: Remove mediawiki.wikistories_* streams (T408178) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:04 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for EventStreamConfig: Remove mediawiki.reference_previews stream (T242127), EventStreamConfig: Remove mediawiki.wikistories_* streams (T408178)
- 14:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T407997)', diff saved to https://phabricator.wikimedia.org/P85008 and previous config saved to /var/cache/conftool/dbconfig/20251106-140127-marostegui.json
- 14:00 topranks: move public1-c-eqiad sub-interface from ae3 to et-1/0/5 on cr2-eqiad (T405579)
- 13:57 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2152 gradually with 4 steps - Migration of db2152.codfw.wmnet completed
- 13:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1218 (T407997)', diff saved to https://phabricator.wikimedia.org/P85005 and previous config saved to /var/cache/conftool/dbconfig/20251106-134013-marostegui.json
- 13:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 13:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T407997)', diff saved to https://phabricator.wikimedia.org/P85004 and previous config saved to /var/cache/conftool/dbconfig/20251106-133949-marostegui.json
- 13:36 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2152 - Upgrading db2152.codfw.wmnet
- 13:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P85002 and previous config saved to /var/cache/conftool/dbconfig/20251106-132442-marostegui.json
- 13:20 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 13:20 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 13:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 13:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 13:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P85001 and previous config saved to /var/cache/conftool/dbconfig/20251106-130934-marostegui.json
- 13:09 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2152 - Upgrading db2152.codfw.wmnet
- 13:08 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 12:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T407997)', diff saved to https://phabricator.wikimedia.org/P85000 and previous config saved to /var/cache/conftool/dbconfig/20251106-125427-marostegui.json
- 12:36 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1206 (T407997)', diff saved to https://phabricator.wikimedia.org/P84999 and previous config saved to /var/cache/conftool/dbconfig/20251106-123507-marostegui.json
- 12:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 12:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T407997)', diff saved to https://phabricator.wikimedia.org/P84998 and previous config saved to /var/cache/conftool/dbconfig/20251106-123444-marostegui.json
- 12:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P84996 and previous config saved to /var/cache/conftool/dbconfig/20251106-121937-marostegui.json
- 12:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P84995 and previous config saved to /var/cache/conftool/dbconfig/20251106-120429-marostegui.json
- 11:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T407997)', diff saved to https://phabricator.wikimedia.org/P84994 and previous config saved to /var/cache/conftool/dbconfig/20251106-114921-marostegui.json
- 11:39 cmooney@cumin1003: START - Cookbook sre.hosts.dhcp for host sretest1005.eqiad.wmnet
- 11:38 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 11:37 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 11:37 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 11:36 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 11:36 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host sretest1006.eqiad.wmnet
- 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1196 (T407997)', diff saved to https://phabricator.wikimedia.org/P84993 and previous config saved to /var/cache/conftool/dbconfig/20251106-112910-marostegui.json
- 11:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 11:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 11:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T407997)', diff saved to https://phabricator.wikimedia.org/P84992 and previous config saved to /var/cache/conftool/dbconfig/20251106-112827-marostegui.json
- 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P84990 and previous config saved to /var/cache/conftool/dbconfig/20251106-111319-marostegui.json
- 10:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P84989 and previous config saved to /var/cache/conftool/dbconfig/20251106-105812-marostegui.json
- 10:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T407997)', diff saved to https://phabricator.wikimedia.org/P84988 and previous config saved to /var/cache/conftool/dbconfig/20251106-104304-marostegui.json
- 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1195 (T407997)', diff saved to https://phabricator.wikimedia.org/P84986 and previous config saved to /var/cache/conftool/dbconfig/20251106-101954-marostegui.json
- 10:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1195.eqiad.wmnet with reason: Maintenance
- 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T407997)', diff saved to https://phabricator.wikimedia.org/P84985 and previous config saved to /var/cache/conftool/dbconfig/20251106-101929-marostegui.json
- 10:15 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.clone (exit_code=97) of db1176.eqiad.wmnet onto db2230.codfw.wmnet
- 10:05 brouberol@dns1004: END - running authdns-update
- 10:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P84984 and previous config saved to /var/cache/conftool/dbconfig/20251106-100421-marostegui.json
- 10:04 brouberol@dns1004: START - running authdns-update
- 09:59 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db1176.eqiad.wmnet onto db2230.codfw.wmnet
- 09:57 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db1176.eqiad.wmnet onto db2230.codfw.wmnet
- 09:56 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db1176.eqiad.wmnet onto db2230.codfw.wmnet
- 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps1009.eqiad.wmnet
- 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:53 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 09:53 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 09:52 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 09:51 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 09:51 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 09:50 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 09:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:50 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 09:50 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 09:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P84983 and previous config saved to /var/cache/conftool/dbconfig/20251106-094914-marostegui.json
- 09:40 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps1009.eqiad.wmnet
- 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps1010.eqiad.wmnet
- 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:37 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T407997)', diff saved to https://phabricator.wikimedia.org/P84982 and previous config saved to /var/cache/conftool/dbconfig/20251106-093406-marostegui.json
- 09:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:32 mszwarc@deploy2002: Finished scap sync-world: Backport for Revert "Use OutputPageBeforeHTML instead of BeforePageDisplay to add modules" (T409367) (duration: 08m 52s)
- 09:28 mszwarc@deploy2002: mszwarc: Continuing with sync
- 09:27 mszwarc@deploy2002: mszwarc: Backport for Revert "Use OutputPageBeforeHTML instead of BeforePageDisplay to add modules" (T409367) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:27 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps1010.eqiad.wmnet
- 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps1008.eqiad.wmnet
- 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1008.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:24 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1008.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:24 brouberol@dns1004: END - running authdns-update
- 09:24 mszwarc@deploy2002: Started scap sync-world: Backport for Revert "Use OutputPageBeforeHTML instead of BeforePageDisplay to add modules" (T409367)
- 09:23 brouberol@dns1004: START - running authdns-update
- 09:20 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:20 brouberol@dns1004: END - running authdns-update
- 09:20 elukey: upgrade python3-conftool and spicerack on cumin1003 and cumin2002 hosts
- 09:19 brouberol@dns1004: START - running authdns-update
- 09:19 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 09:18 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1186 (T407997)', diff saved to https://phabricator.wikimedia.org/P84981 and previous config saved to /var/cache/conftool/dbconfig/20251106-091401-marostegui.json
- 09:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 09:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T407997)', diff saved to https://phabricator.wikimedia.org/P84980 and previous config saved to /var/cache/conftool/dbconfig/20251106-091337-marostegui.json
- 09:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:12 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps1008.eqiad.wmnet
- 09:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps1007.eqiad.wmnet
- 09:11 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1007.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1007.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 09:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 09:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P84979 and previous config saved to /var/cache/conftool/dbconfig/20251106-085830-marostegui.json
- 08:56 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps1007.eqiad.wmnet
- 08:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps1006.eqiad.wmnet
- 08:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1006.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1006.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 08:51 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 08:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:44 dcausse: UTC morning backport window done
- 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P84978 and previous config saved to /var/cache/conftool/dbconfig/20251106-084322-marostegui.json
- 08:42 dcausse@deploy2002: Finished scap sync-world: Backport for "hide logged in users" is no longer working with "non-JavaScript interface" (T409157), cirrus: enable default_sort on en, fr and he wikipedias (T404858), cirrus: enable alt index with default_sort on a set of wikis (T404858) (duration: 12m 49s)
- 08:41 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps1006.eqiad.wmnet
- 08:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps1005.eqiad.wmnet
- 08:41 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:40 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:37 dcausse@deploy2002: dcausse, tstarling: Continuing with sync
- 08:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:32 dcausse@deploy2002: dcausse, tstarling: Backport for "hide logged in users" is no longer working with "non-JavaScript interface" (T409157), cirrus: enable default_sort on en, fr and he wikipedias (T404858), cirrus: enable alt index with default_sort on a set of wikis (T404858) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes c
- 08:30 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps1005.eqiad.wmnet
- 08:29 dcausse@deploy2002: Started scap sync-world: Backport for "hide logged in users" is no longer working with "non-JavaScript interface" (T409157), cirrus: enable default_sort on en, fr and he wikipedias (T404858), cirrus: enable alt index with default_sort on a set of wikis (T404858)
- 08:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T407997)', diff saved to https://phabricator.wikimedia.org/P84977 and previous config saved to /var/cache/conftool/dbconfig/20251106-082814-marostegui.json
- 08:23 brouberol@dns1004: END - running authdns-update
- 08:22 brouberol@dns1004: START - running authdns-update
- 08:22 brouberol@dns1004: START - running authdns-update
- 08:21 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts maps2009.codfw.wmnet
- 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1193 gradually with 4 steps - Repooling after upgrade
- 08:13 kharlan@deploy2002: Finished scap sync-world: Backport for Allow temporary accounts to create in fawiki/enwiki Draft namespace (T409366) (duration: 10m 07s)
- 08:09 kharlan@deploy2002: kharlan, novemlinguae: Continuing with sync
- 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1169 (T407997)', diff saved to https://phabricator.wikimedia.org/P84975 and previous config saved to /var/cache/conftool/dbconfig/20251106-080746-marostegui.json
- 08:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T407997)', diff saved to https://phabricator.wikimedia.org/P84974 and previous config saved to /var/cache/conftool/dbconfig/20251106-080723-marostegui.json
- 08:06 kharlan@deploy2002: kharlan, novemlinguae: Backport for Allow temporary accounts to create in fawiki/enwiki Draft namespace (T409366) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:05 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps2009.codfw.wmnet
- 08:03 kharlan@deploy2002: Started scap sync-world: Backport for Allow temporary accounts to create in fawiki/enwiki Draft namespace (T409366)
- 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P84972 and previous config saved to /var/cache/conftool/dbconfig/20251106-075215-marostegui.json
- 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P84970 and previous config saved to /var/cache/conftool/dbconfig/20251106-073707-marostegui.json
- 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1193 gradually with 4 steps - Repooling after upgrade
- 07:28 musikanimal@deploy2002: Finished scap sync-world: Backport for Hide the WikiEditor search button (duration: 107m 34s)
- 07:23 musikanimal@deploy2002: musikanimal: Continuing with sync
- 07:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1193 - Depool db1193 for migration to mariadb 10.11
- 07:22 marostegui@cumin1003: START - Cookbook sre.mysql.depool db1193 - Depool db1193 for migration to mariadb 10.11
- 07:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1193.eqiad.wmnet with reason: Maintenance
- 07:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T407997)', diff saved to https://phabricator.wikimedia.org/P84968 and previous config saved to /var/cache/conftool/dbconfig/20251106-072200-marostegui.json
- 07:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1193 T409299', diff saved to https://phabricator.wikimedia.org/P84967 and previous config saved to /var/cache/conftool/dbconfig/20251106-071949-marostegui.json
- 07:19 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1209 to s8 primary T409299', diff saved to https://phabricator.wikimedia.org/P84966 and previous config saved to /var/cache/conftool/dbconfig/20251106-071911-marostegui.json
- 07:18 marostegui: Starting s8 eqiad failover from db1193 to db1209 - T409299
- 07:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s8 T409299
- 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1209 with weight 0 T409299', diff saved to https://phabricator.wikimedia.org/P84965 and previous config saved to /var/cache/conftool/dbconfig/20251106-071506-marostegui.json
- 07:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1163 (T407997)', diff saved to https://phabricator.wikimedia.org/P84964 and previous config saved to /var/cache/conftool/dbconfig/20251106-070128-marostegui.json
- 07:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 06:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2205.codfw.wmnet with reason: Maintenance
- 05:42 musikanimal@deploy2002: musikanimal: Backport for Hide the WikiEditor search button synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 05:40 musikanimal@deploy2002: Started scap sync-world: Backport for Hide the WikiEditor search button
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 32s)
- 01:01 wfan: civicrm upgraded from f1f68f1c to 75455a21
- 01:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:49 ryankemper@cumin1002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 00:37 eileen: civicrm upgraded from 0f49dd1d to f1f68f1c
- 00:29 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1262.eqiad.wmnet with reason: HW issues, T409374
- 00:27 cdobbins@cumin2002: dbctl commit (dc=all): 'Depool db1262', diff saved to https://phabricator.wikimedia.org/P84962 and previous config saved to /var/cache/conftool/dbconfig/20251106-002737-cdobbins.json
2025-11-05
- 23:35 larssandergreen: civicrm upgraded from 40198c3f to 0f49dd1d
- 23:26 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-misc: apply
- 23:25 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-misc: apply
- 23:19 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-misc: apply
- 23:19 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-misc: apply
- 23:17 swfrench@deploy2002: Stopping before sync operations
- 23:17 swfrench@deploy2002: Started scap sync-world: No-deployment scap run to switch mw-misc to PHP 8.3 - T405955
- 23:00 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 23:00 ryankemper@cumin1002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 23:00 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 22:58 ryankemper@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=cirrussearch2089.codfw.wmnet
- 22:48 ryankemper@cumin1002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart (test new spicerack version) - ryankemper@cumin1002 - T390860
- 22:22 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart (test new spicerack version) - ryankemper@cumin1002 - T390860
- 22:19 ryankemper@cumin1002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_eqiad
- 22:19 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_eqiad
- 22:19 ryankemper@cumin1002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cirrussearch1068.eqiad.wmnet for test new spicerack elasticsearch library - ryankemper@cumin1002 - T390860
- 22:19 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.ban Banning hosts: cirrussearch1068.eqiad.wmnet for test new spicerack elasticsearch library - ryankemper@cumin1002 - T390860
- 22:18 ryankemper@cumin1002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cirrussearch1068.eqiad.wmnet for test new spicerack elasticsearch library - ryankemper@cumin1002 - T390860
- 22:18 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.ban Banning hosts: cirrussearch1068.eqiad.wmnet for test new spicerack elasticsearch library - ryankemper@cumin1002 - T390860
- 22:17 ryankemper@cumin1002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: foobar1001.eqiad.wmnet for test new spicerack elasticsearch library - ryankemper@cumin1002 - T390860
- 22:17 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.ban Banning hosts: foobar1001.eqiad.wmnet for test new spicerack elasticsearch library - ryankemper@cumin1002 - T390860
- 22:14 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy7001.magru.wmnet with OS trixie
- 22:13 ryankemper: [WDQS] Restarting blazegraph across all codfw `wdqs-main` hosts, hoping it resolves the lag issues although it's likely that it won't
- 22:12 ryankemper: T366248 `sudo rm -rfv /srv/dumps/xmldatadumps/public/other/cirrus_search_index/cirrus-search-index/` on `clouddumps100[1,2].wikimedia.org`
- 21:58 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy7001.magru.wmnet with reason: host reimage
- 21:53 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy7001.magru.wmnet with reason: host reimage
- 21:51 catrope@deploy2002: Finished scap sync-world: Backport for Configure HTTP proxy for EmailAuth AccountRecovery (T399742) (duration: 08m 01s)
- 21:47 catrope@deploy2002: catrope: Continuing with sync
- 21:46 catrope@deploy2002: catrope: Backport for Configure HTTP proxy for EmailAuth AccountRecovery (T399742) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:43 catrope@deploy2002: Started scap sync-world: Backport for Configure HTTP proxy for EmailAuth AccountRecovery (T399742)
- 21:37 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2239.codfw.wmnet with reason: Maintenance
- 21:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T407997)', diff saved to https://phabricator.wikimedia.org/P84960 and previous config saved to /var/cache/conftool/dbconfig/20251105-213734-marostegui.json
- 21:30 catrope@deploy2002: Finished scap sync-world: Backport for Set up Special:AccountRecovery and enable on testwiki (T399742) (duration: 08m 21s)
- 21:25 catrope@deploy2002: catrope: Continuing with sync
- 21:25 larssandergreen: civicrm upgraded from 8efb2be1 to 40198c3f
- 21:24 catrope@deploy2002: catrope: Backport for Set up Special:AccountRecovery and enable on testwiki (T399742) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:23 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy7001.magru.wmnet with OS trixie
- 21:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P84958 and previous config saved to /var/cache/conftool/dbconfig/20251105-212226-marostegui.json
- 21:21 catrope@deploy2002: Started scap sync-world: Backport for Set up Special:AccountRecovery and enable on testwiki (T399742)
- 21:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P84957 and previous config saved to /var/cache/conftool/dbconfig/20251105-210718-marostegui.json
- 20:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T407997)', diff saved to https://phabricator.wikimedia.org/P84955 and previous config saved to /var/cache/conftool/dbconfig/20251105-205211-marostegui.json
- 20:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2227 (T407997)', diff saved to https://phabricator.wikimedia.org/P84953 and previous config saved to /var/cache/conftool/dbconfig/20251105-203438-marostegui.json
- 20:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2227.codfw.wmnet with reason: Maintenance
- 20:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T407997)', diff saved to https://phabricator.wikimedia.org/P84952 and previous config saved to /var/cache/conftool/dbconfig/20251105-203413-marostegui.json
- 20:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P84951 and previous config saved to /var/cache/conftool/dbconfig/20251105-201905-marostegui.json
- 20:09 dancy@deploy2002: Installation of scap version "4.224.0" completed for 2 hosts
- 20:07 dancy@deploy2002: Installing scap version "4.224.0" for 2 host(s)
- 20:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P84950 and previous config saved to /var/cache/conftool/dbconfig/20251105-200357-marostegui.json
- 19:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T407997)', diff saved to https://phabricator.wikimedia.org/P84949 and previous config saved to /var/cache/conftool/dbconfig/20251105-194850-marostegui.json
- 19:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2209 (T407997)', diff saved to https://phabricator.wikimedia.org/P84948 and previous config saved to /var/cache/conftool/dbconfig/20251105-193126-marostegui.json
- 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2209.codfw.wmnet with reason: Maintenance
- 19:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T407997)', diff saved to https://phabricator.wikimedia.org/P84947 and previous config saved to /var/cache/conftool/dbconfig/20251105-193102-marostegui.json
- 19:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P84946 and previous config saved to /var/cache/conftool/dbconfig/20251105-191553-marostegui.json
- 19:13 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.1 refs T408271
- 19:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P84945 and previous config saved to /var/cache/conftool/dbconfig/20251105-190046-marostegui.json
- 18:55 larssandergreen: tools upgraded from 8e3ed11c to 773e8d11
- 18:52 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 18:51 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 18:46 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 18:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T407997)', diff saved to https://phabricator.wikimedia.org/P84944 and previous config saved to /var/cache/conftool/dbconfig/20251105-184538-marostegui.json
- 18:45 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 18:42 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 18:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 18:40 swfrench@deploy2002: Stopping before sync operations
- 18:39 swfrench@deploy2002: Started scap sync-world: No-deployment scap run to switch mw-parsoid to PHP 8.3 - T405955
- 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2194 (T407997)', diff saved to https://phabricator.wikimedia.org/P84943 and previous config saved to /var/cache/conftool/dbconfig/20251105-182805-marostegui.json
- 18:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2194.codfw.wmnet with reason: Maintenance
- 18:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T407997)', diff saved to https://phabricator.wikimedia.org/P84942 and previous config saved to /var/cache/conftool/dbconfig/20251105-182741-marostegui.json
- 18:25 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:25 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:25 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:25 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:20 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P84941 and previous config saved to /var/cache/conftool/dbconfig/20251105-181233-marostegui.json
- 18:08 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:07 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:07 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 17:58 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 17:58 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2161 gradually with 4 steps - Migration of db2161.codfw.wmnet completed
- 17:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P84939 and previous config saved to /var/cache/conftool/dbconfig/20251105-175726-marostegui.json
- 17:53 ejegg: donorwiki upgraded from 8fe00530 to c2a4b377
- 17:53 ejegg: payments-wiki upgraded from 8fe00530 to c2a4b377
- 17:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T407997)', diff saved to https://phabricator.wikimedia.org/P84937 and previous config saved to /var/cache/conftool/dbconfig/20251105-174218-marostegui.json
- 17:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T407997)', diff saved to https://phabricator.wikimedia.org/P84935 and previous config saved to /var/cache/conftool/dbconfig/20251105-172347-marostegui.json
- 17:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2190.codfw.wmnet with reason: Maintenance
- 17:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T407997)', diff saved to https://phabricator.wikimedia.org/P84934 and previous config saved to /var/cache/conftool/dbconfig/20251105-172324-marostegui.json
- 17:12 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2161 gradually with 4 steps - Migration of db2161.codfw.wmnet completed
- 17:10 swfrench-wmf: rolling run-puppet-agent on A:cp hosts for haproxy config change
- 17:08 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 17:08 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 17:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P84932 and previous config saved to /var/cache/conftool/dbconfig/20251105-170816-marostegui.json
- 17:08 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 17:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 17:00 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 17:00 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 17:00 swfrench-wmf: disable-puppet on A:cp hosts for haproxy config change
- 16:58 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 16:58 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 16:58 dancy@deploy2002: Installation of scap version "4.223.0" completed for 2 hosts
- 16:56 dancy@deploy2002: Installing scap version "4.223.0" for 2 host(s)
- 16:55 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Update BetaFeatures comments (duration: 07m 38s)
- 16:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P84931 and previous config saved to /var/cache/conftool/dbconfig/20251105-165309-marostegui.json
- 16:51 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Continuing with sync
- 16:50 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Update BetaFeatures comments synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:48 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Update BetaFeatures comments
- 16:47 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 16:47 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 16:47 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 16:47 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:47 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 16:46 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:44 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:44 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:42 papaul_: pfw1a/b-codfw Junos downgrade complete
- 16:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T407997)', diff saved to https://phabricator.wikimedia.org/P84930 and previous config saved to /var/cache/conftool/dbconfig/20251105-163801-marostegui.json
- 16:36 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2161 - Upgrading db2161.codfw.wmnet
- 16:36 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2161 - Upgrading db2161.codfw.wmnet
- 16:35 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 16:31 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 16:30 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 16:29 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 16:29 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 16:27 javiermonton@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 16:27 javiermonton@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 16:27 javiermonton@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 16:27 javiermonton@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
- 16:26 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:26 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:24 topranks: add peering to NL-ix route servers from drmrs T386986
- 16:21 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:21 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:20 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T407997)', diff saved to https://phabricator.wikimedia.org/P84928 and previous config saved to /var/cache/conftool/dbconfig/20251105-162055-marostegui.json
- 16:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2177.codfw.wmnet with reason: Maintenance
- 16:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T407997)', diff saved to https://phabricator.wikimedia.org/P84927 and previous config saved to /var/cache/conftool/dbconfig/20251105-162032-marostegui.json
- 16:15 javiermonton@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 16:14 javiermonton@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 16:12 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:12 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:08 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:08 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:06 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
- 16:06 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
- 16:05 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 16:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P84926 and previous config saved to /var/cache/conftool/dbconfig/20251105-160523-marostegui.json
- 16:05 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 16:04 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 16:04 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 16:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2087.codfw.wmnet with OS bullseye
- 16:02 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:02 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:00 papaul_: ongoing pfw1b-codfw Junos downgrade
- 15:58 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:58 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P84924 and previous config saved to /var/cache/conftool/dbconfig/20251105-155015-marostegui.json
- 15:47 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pfw1-codfw with reason: pfw1a/b-codfw
- 15:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2087.codfw.wmnet with reason: host reimage
- 15:46 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 15:45 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:45 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add ipv6 reverse dns for nl-ix port marseille - cmooney@cumin1003"
- 15:45 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 15:45 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add ipv6 reverse dns for nl-ix port marseille - cmooney@cumin1003"
- 15:45 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:45 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 15:45 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:45 moritzm: running racadm racreset on maps2009
- 15:45 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 15:45 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 15:44 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 15:44 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 15:44 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2087.codfw.wmnet with reason: host reimage
- 15:43 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 15:43 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 15:43 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 15:39 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2008.codfw.wmnet
- 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2008.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 15:38 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 15:38 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 15:36 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2008.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 15:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T407997)', diff saved to https://phabricator.wikimedia.org/P84922 and previous config saved to /var/cache/conftool/dbconfig/20251105-153508-marostegui.json
- 15:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 15:31 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2087.codfw.wmnet with OS bullseye
- 15:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2086.codfw.wmnet with OS bullseye
- 15:30 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:29 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:27 marostegui@cumin1003: dbctl commit (dc=all): 'db1209 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84921 and previous config saved to /var/cache/conftool/dbconfig/20251105-152716-root.json
- 15:27 jforrester@deploy2002: Finished scap sync-world: Backport for Enable embedded Wikifunctions calls on bnwiki and seven Wiktionaries (T406342) (duration: 09m 35s)
- 15:26 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:26 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:26 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:25 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:24 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps2008.codfw.wmnet
- 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2007.codfw.wmnet
- 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2007.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 15:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2007.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 15:20 jforrester@deploy2002: jforrester: Continuing with sync
- 15:19 jforrester@deploy2002: jforrester: Backport for Enable embedded Wikifunctions calls on bnwiki and seven Wiktionaries (T406342) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T407997)', diff saved to https://phabricator.wikimedia.org/P84920 and previous config saved to /var/cache/conftool/dbconfig/20251105-151802-marostegui.json
- 15:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T407997)', diff saved to https://phabricator.wikimedia.org/P84919 and previous config saved to /var/cache/conftool/dbconfig/20251105-151738-marostegui.json
- 15:17 jforrester@deploy2002: Started scap sync-world: Backport for Enable embedded Wikifunctions calls on bnwiki and seven Wiktionaries (T406342)
- 15:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 15:15 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:14 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:14 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2086.codfw.wmnet with reason: host reimage
- 15:13 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:13 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:13 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:12 marostegui@cumin1003: dbctl commit (dc=all): 'db1209 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84918 and previous config saved to /var/cache/conftool/dbconfig/20251105-151210-root.json
- 15:11 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps2007.codfw.wmnet
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2006.codfw.wmnet
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 15:10 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:10 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:10 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 15:09 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2086.codfw.wmnet with reason: host reimage
- 15:09 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:08 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:07 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:07 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:06 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P84917 and previous config saved to /var/cache/conftool/dbconfig/20251105-150230-marostegui.json
- 15:02 Lucas_WMDE: UTC afternoon backport+config window done
- 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'db1209 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84916 and previous config saved to /var/cache/conftool/dbconfig/20251105-145704-root.json
- 14:56 arthurtaylor@deploy2002: Finished scap sync-world: Backport for Revert "Enable the MEX / wbui2025 beta feature on testwikidata" (T407737) (duration: 08m 23s)
- 14:56 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2086.codfw.wmnet with OS bullseye
- 14:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2085.codfw.wmnet with OS bullseye
- 14:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 (T403362)', diff saved to https://phabricator.wikimedia.org/P84915 and previous config saved to /var/cache/conftool/dbconfig/20251105-145457-ladsgroup.json
- 14:52 arthurtaylor@deploy2002: arthurtaylor: Continuing with sync
- 14:50 arthurtaylor@deploy2002: arthurtaylor: Backport for Revert "Enable the MEX / wbui2025 beta feature on testwikidata" (T407737) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:48 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:48 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:48 arthurtaylor@deploy2002: Started scap sync-world: Backport for Revert "Enable the MEX / wbui2025 beta feature on testwikidata" (T407737)
- 14:47 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1220* gradually with 4 steps - Work done
- 14:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P84913 and previous config saved to /var/cache/conftool/dbconfig/20251105-144723-marostegui.json
- 14:45 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:45 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:45 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:45 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:42 arthurtaylor@deploy2002: Sync cancelled.
- 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'db1209 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84912 and previous config saved to /var/cache/conftool/dbconfig/20251105-144158-root.json
- 14:41 elukey: uploaded spicerack_12.0.0 to apt.wikimedia.org bullseye-wikimedia,bookworm-wikimedia
- 14:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P84911 and previous config saved to /var/cache/conftool/dbconfig/20251105-143949-ladsgroup.json
- 14:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2085.codfw.wmnet with reason: host reimage
- 14:35 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
- 14:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1209 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84910 and previous config saved to /var/cache/conftool/dbconfig/20251105-143419-marostegui.json
- 14:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1209.eqiad.wmnet with reason: Maintenance
- 14:33 arthurtaylor@deploy2002: arthurtaylor: Backport for Enable the MEX / wbui2025 beta feature on testwikidata (T407737) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T407997)', diff saved to https://phabricator.wikimedia.org/P84908 and previous config saved to /var/cache/conftool/dbconfig/20251105-143215-marostegui.json
- 14:31 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2085.codfw.wmnet with reason: host reimage
- 14:30 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:30 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:30 arthurtaylor@deploy2002: Started scap sync-world: Backport for Enable the MEX / wbui2025 beta feature on testwikidata (T407737)
- 14:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P84907 and previous config saved to /var/cache/conftool/dbconfig/20251105-142441-ladsgroup.json
- 14:24 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
- 14:24 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
- 14:24 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
- 14:19 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2085.codfw.wmnet with OS bullseye
- 14:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2149 (T407997)', diff saved to https://phabricator.wikimedia.org/P84905 and previous config saved to /var/cache/conftool/dbconfig/20251105-141507-marostegui.json
- 14:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 14:12 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists jamestemp; (T297297)
- 14:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 14:09 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 (T403362)', diff saved to https://phabricator.wikimedia.org/P84904 and previous config saved to /var/cache/conftool/dbconfig/20251105-140934-ladsgroup.json
- 14:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 14:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 14:02 ladsgroup@cumin1003: START - Cookbook sre.mysql.pool db1220* gradually with 4 steps - Work done
- 14:00 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps2006.codfw.wmnet
- 13:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 13:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T407997)', diff saved to https://phabricator.wikimedia.org/P84902 and previous config saved to /var/cache/conftool/dbconfig/20251105-135831-marostegui.json
- 13:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2005.codfw.wmnet
- 13:53 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 13:53 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 13:50 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists katesdb; (T297297)
- 13:48 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 13:44 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps2005.codfw.wmnet
- 13:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P84901 and previous config saved to /var/cache/conftool/dbconfig/20251105-134323-marostegui.json
- 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2010.codfw.wmnet
- 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 13:39 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 13:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 13:28 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps2010.codfw.wmnet
- 13:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P84900 and previous config saved to /var/cache/conftool/dbconfig/20251105-132816-marostegui.json
- 13:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T407997)', diff saved to https://phabricator.wikimedia.org/P84899 and previous config saved to /var/cache/conftool/dbconfig/20251105-131308-marostegui.json
- 13:07 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1223 (T407997)', diff saved to https://phabricator.wikimedia.org/P84898 and previous config saved to /var/cache/conftool/dbconfig/20251105-130750-marostegui.json
- 13:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1223.eqiad.wmnet with reason: Maintenance
- 13:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T407997)', diff saved to https://phabricator.wikimedia.org/P84897 and previous config saved to /var/cache/conftool/dbconfig/20251105-130726-marostegui.json
- 13:05 brouberol@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 9 hosts with reason: rebalancing
- 12:55 marostegui: Deploy schema change on s3 master for vewikimedia T409282 T396130
- 12:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P84896 and previous config saved to /var/cache/conftool/dbconfig/20251105-125219-marostegui.json
- 12:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P84895 and previous config saved to /var/cache/conftool/dbconfig/20251105-123711-marostegui.json
- 12:28 marostegui@cumin1003: dbctl commit (dc=all): 'db1167 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84894 and previous config saved to /var/cache/conftool/dbconfig/20251105-122828-root.json
- 12:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T407997)', diff saved to https://phabricator.wikimedia.org/P84893 and previous config saved to /var/cache/conftool/dbconfig/20251105-122203-marostegui.json
- 12:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T407997)', diff saved to https://phabricator.wikimedia.org/P84892 and previous config saved to /var/cache/conftool/dbconfig/20251105-121647-marostegui.json
- 12:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 12:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 12:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T407997)', diff saved to https://phabricator.wikimedia.org/P84891 and previous config saved to /var/cache/conftool/dbconfig/20251105-121616-marostegui.json
- 12:13 marostegui@cumin1003: dbctl commit (dc=all): 'db1167 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84890 and previous config saved to /var/cache/conftool/dbconfig/20251105-121323-root.json
- 12:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P84889 and previous config saved to /var/cache/conftool/dbconfig/20251105-120108-marostegui.json
- 11:58 marostegui@cumin1003: dbctl commit (dc=all): 'db1167 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84888 and previous config saved to /var/cache/conftool/dbconfig/20251105-115817-root.json
- 11:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2215 (T403362)', diff saved to https://phabricator.wikimedia.org/P84887 and previous config saved to /var/cache/conftool/dbconfig/20251105-115437-ladsgroup.json
- 11:54 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
- 11:52 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
- 11:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P84886 and previous config saved to /var/cache/conftool/dbconfig/20251105-114600-marostegui.json
- 11:43 marostegui@cumin1003: dbctl commit (dc=all): 'db1167 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84885 and previous config saved to /var/cache/conftool/dbconfig/20251105-114311-root.json
- 11:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1167 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84884 and previous config saved to /var/cache/conftool/dbconfig/20251105-113522-marostegui.json
- 11:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 11:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Migration
- 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T407997)', diff saved to https://phabricator.wikimedia.org/P84883 and previous config saved to /var/cache/conftool/dbconfig/20251105-113053-marostegui.json
- 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1198 (T407997)', diff saved to https://phabricator.wikimedia.org/P84882 and previous config saved to /var/cache/conftool/dbconfig/20251105-112556-marostegui.json
- 11:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T407997)', diff saved to https://phabricator.wikimedia.org/P84881 and previous config saved to /var/cache/conftool/dbconfig/20251105-112532-marostegui.json
- 11:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P84880 and previous config saved to /var/cache/conftool/dbconfig/20251105-111025-marostegui.json
- 11:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-launcher1002.eqiad.wmnet
- 11:04 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:04 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-launcher1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
- 11:04 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-launcher1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
- 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 100%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84879 and previous config saved to /var/cache/conftool/dbconfig/20251105-110000-root.json
- 10:59 btullis@cumin1003: START - Cookbook sre.dns.netbox
- 10:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P84878 and previous config saved to /var/cache/conftool/dbconfig/20251105-105517-marostegui.json
- 10:45 btullis@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-launcher1002.eqiad.wmnet
- 10:44 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 75%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84876 and previous config saved to /var/cache/conftool/dbconfig/20251105-104454-root.json
- 10:41 btullis@deploy2002: Finished deploy [analytics/refinery@39e92e9]: Updating the deployment on an-launcher1003 (duration: 01m 06s)
- 10:40 btullis@deploy2002: Started deploy [analytics/refinery@39e92e9]: Updating the deployment on an-launcher1003
- 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T407997)', diff saved to https://phabricator.wikimedia.org/P84875 and previous config saved to /var/cache/conftool/dbconfig/20251105-104010-marostegui.json
- 10:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T407997)', diff saved to https://phabricator.wikimedia.org/P84874 and previous config saved to /var/cache/conftool/dbconfig/20251105-103513-marostegui.json
- 10:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 10:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T407997)', diff saved to https://phabricator.wikimedia.org/P84873 and previous config saved to /var/cache/conftool/dbconfig/20251105-103449-marostegui.json
- 10:29 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 60%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84872 and previous config saved to /var/cache/conftool/dbconfig/20251105-102948-root.json
- 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P84871 and previous config saved to /var/cache/conftool/dbconfig/20251105-101942-marostegui.json
- 10:14 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 50%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84870 and previous config saved to /var/cache/conftool/dbconfig/20251105-101442-root.json
- 10:06 moritzm: disabling Puppet on buster maps nodes for pending decom T381565
- 10:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P84869 and previous config saved to /var/cache/conftool/dbconfig/20251105-100434-marostegui.json
- 09:59 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 40%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84868 and previous config saved to /var/cache/conftool/dbconfig/20251105-095936-root.json
- 09:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T407997)', diff saved to https://phabricator.wikimedia.org/P84867 and previous config saved to /var/cache/conftool/dbconfig/20251105-094926-marostegui.json
- 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1166 (T407997)', diff saved to https://phabricator.wikimedia.org/P84866 and previous config saved to /var/cache/conftool/dbconfig/20251105-094431-marostegui.json
- 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 30%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84865 and previous config saved to /var/cache/conftool/dbconfig/20251105-094431-root.json
- 09:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T407997)', diff saved to https://phabricator.wikimedia.org/P84864 and previous config saved to /var/cache/conftool/dbconfig/20251105-094408-marostegui.json
- 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 25%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84863 and previous config saved to /var/cache/conftool/dbconfig/20251105-092925-root.json
- 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P84862 and previous config saved to /var/cache/conftool/dbconfig/20251105-092859-marostegui.json
- 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P84861 and previous config saved to /var/cache/conftool/dbconfig/20251105-091438-root.json
- 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 20%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84860 and previous config saved to /var/cache/conftool/dbconfig/20251105-091419-root.json
- 09:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P84859 and previous config saved to /var/cache/conftool/dbconfig/20251105-091352-marostegui.json
- 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P84858 and previous config saved to /var/cache/conftool/dbconfig/20251105-085932-root.json
- 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 15%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84857 and previous config saved to /var/cache/conftool/dbconfig/20251105-085913-root.json
- 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T407997)', diff saved to https://phabricator.wikimedia.org/P84856 and previous config saved to /var/cache/conftool/dbconfig/20251105-085844-marostegui.json
- 08:53 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T407997)', diff saved to https://phabricator.wikimedia.org/P84855 and previous config saved to /var/cache/conftool/dbconfig/20251105-085347-marostegui.json
- 08:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 08:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook: apply
- 08:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook: apply
- 08:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 08:44 marostegui@cumin1003: dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P84854 and previous config saved to /var/cache/conftool/dbconfig/20251105-084426-root.json
- 08:44 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 10%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84853 and previous config saved to /var/cache/conftool/dbconfig/20251105-084407-root.json
- 08:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2218.codfw.wmnet with reason: Maintenance
- 08:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 08:29 marostegui@cumin1003: dbctl commit (dc=all): 'db1174 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P84852 and previous config saved to /var/cache/conftool/dbconfig/20251105-082920-root.json
- 08:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T407997)', diff saved to https://phabricator.wikimedia.org/P84851 and previous config saved to /var/cache/conftool/dbconfig/20251105-082642-marostegui.json
- 08:25 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (T407997)', diff saved to https://phabricator.wikimedia.org/P84850 and previous config saved to /var/cache/conftool/dbconfig/20251105-082533-marostegui.json
- 08:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 08:22 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 5%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84849 and previous config saved to /var/cache/conftool/dbconfig/20251105-082209-root.json
- 08:21 Emperor: run gitlab-package-puller by hand on apt-staging2001
- 08:13 brouberol@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on an-launcher1002.eqiad.wmnet with reason: host is being decommissioned
- 08:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T407997)', diff saved to https://phabricator.wikimedia.org/P84848 and previous config saved to /var/cache/conftool/dbconfig/20251105-080849-marostegui.json
- 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 4%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84847 and previous config saved to /var/cache/conftool/dbconfig/20251105-080702-root.json
- 08:00 marostegui@cumin1003: dbctl commit (dc=all): 'db2212 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84846 and previous config saved to /var/cache/conftool/dbconfig/20251105-080027-root.json
- 07:59 eileen: civicrm upgraded from 1eeb1a46 to 8efb2be1
- 07:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P84845 and previous config saved to /var/cache/conftool/dbconfig/20251105-075341-marostegui.json
- 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 3%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84844 and previous config saved to /var/cache/conftool/dbconfig/20251105-075156-root.json
- 07:45 marostegui@cumin1003: dbctl commit (dc=all): 'db2212 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84843 and previous config saved to /var/cache/conftool/dbconfig/20251105-074521-root.json
- 07:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P84842 and previous config saved to /var/cache/conftool/dbconfig/20251105-073833-marostegui.json
- 07:36 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 2%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84841 and previous config saved to /var/cache/conftool/dbconfig/20251105-073651-root.json
- 07:33 marostegui@cumin1003: dbctl commit (dc=all): 'db1203 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84840 and previous config saved to /var/cache/conftool/dbconfig/20251105-073347-root.json
- 07:30 marostegui@cumin1003: dbctl commit (dc=all): 'db2215 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P84839 and previous config saved to /var/cache/conftool/dbconfig/20251105-073033-root.json
- 07:30 marostegui@cumin1003: dbctl commit (dc=all): 'db2212 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84838 and previous config saved to /var/cache/conftool/dbconfig/20251105-073016-root.json
- 07:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T407997)', diff saved to https://phabricator.wikimedia.org/P84837 and previous config saved to /var/cache/conftool/dbconfig/20251105-072326-marostegui.json
- 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 1%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84836 and previous config saved to /var/cache/conftool/dbconfig/20251105-072145-root.json
- 07:18 marostegui@cumin1003: dbctl commit (dc=all): 'db1203 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84835 and previous config saved to /var/cache/conftool/dbconfig/20251105-071841-root.json
- 07:16 marostegui@cumin1003: dbctl commit (dc=all): 'Add es1033 to es2 depooled T409257 T407472', diff saved to https://phabricator.wikimedia.org/P84834 and previous config saved to /var/cache/conftool/dbconfig/20251105-071605-marostegui.json
- 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'db2215 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P84833 and previous config saved to /var/cache/conftool/dbconfig/20251105-071527-root.json
- 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'db2212 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84832 and previous config saved to /var/cache/conftool/dbconfig/20251105-071510-root.json
- 07:07 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2212 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84831 and previous config saved to /var/cache/conftool/dbconfig/20251105-070707-marostegui.json
- 07:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2212.codfw.wmnet with reason: Maintenance
- 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (T407997)', diff saved to https://phabricator.wikimedia.org/P84830 and previous config saved to /var/cache/conftool/dbconfig/20251105-070540-marostegui.json
- 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T407997)', diff saved to https://phabricator.wikimedia.org/P84828 and previous config saved to /var/cache/conftool/dbconfig/20251105-070516-marostegui.json
- 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'db1203 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84827 and previous config saved to /var/cache/conftool/dbconfig/20251105-070335-root.json
- 07:00 marostegui@cumin1003: dbctl commit (dc=all): 'db2215 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P84826 and previous config saved to /var/cache/conftool/dbconfig/20251105-070021-root.json
- 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es1034.eqiad.wmnet
- 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es1034.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 06:51 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es1034.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P84825 and previous config saved to /var/cache/conftool/dbconfig/20251105-065008-marostegui.json
- 06:48 marostegui@cumin1003: dbctl commit (dc=all): 'db1203 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84824 and previous config saved to /var/cache/conftool/dbconfig/20251105-064829-root.json
- 06:48 marostegui@cumin1003: START - Cookbook sre.dns.netbox
- 06:47 eileen: civicrm upgraded from a7c697e9 to 1eeb1a46
- 06:45 marostegui@cumin1003: dbctl commit (dc=all): 'db2215 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P84823 and previous config saved to /var/cache/conftool/dbconfig/20251105-064515-root.json
- 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts es1034.eqiad.wmnet
- 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1203 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84822 and previous config saved to /var/cache/conftool/dbconfig/20251105-064028-marostegui.json
- 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1203.eqiad.wmnet with reason: Maintenance
- 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P84821 and previous config saved to /var/cache/conftool/dbconfig/20251105-063458-marostegui.json
- 06:32 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1:00:00 on 14 hosts with reason: Primary switchover x1 T409168
- 06:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: Primary switchover x1 T409168
- 06:30 marostegui@cumin1003: dbctl commit (dc=all): 'db2215 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P84820 and previous config saved to /var/cache/conftool/dbconfig/20251105-063009-root.json
- 06:29 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2215 T409168', diff saved to https://phabricator.wikimedia.org/P84819 and previous config saved to /var/cache/conftool/dbconfig/20251105-062920-marostegui.json
- 06:29 marostegui@dns1006: END - running authdns-update
- 06:28 marostegui@dns1006: START - running authdns-update
- 06:27 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2191 to x1 primary and set section read-write T409168', diff saved to https://phabricator.wikimedia.org/P84818 and previous config saved to /var/cache/conftool/dbconfig/20251105-062745-marostegui.json
- 06:27 marostegui@cumin1003: dbctl commit (dc=all): 'Set x1 codfw as read-only for maintenance - T409168', diff saved to https://phabricator.wikimedia.org/P84817 and previous config saved to /var/cache/conftool/dbconfig/20251105-062723-marostegui.json
- 06:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: Primary switchover x1 T409168
- 06:22 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2191 with weight 0 T409168', diff saved to https://phabricator.wikimedia.org/P84816 and previous config saved to /var/cache/conftool/dbconfig/20251105-062230-marostegui.json
- 06:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T407997)', diff saved to https://phabricator.wikimedia.org/P84815 and previous config saved to /var/cache/conftool/dbconfig/20251105-061950-marostegui.json
- 06:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (T407997)', diff saved to https://phabricator.wikimedia.org/P84814 and previous config saved to /var/cache/conftool/dbconfig/20251105-061737-marostegui.json
- 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 05:14 eileen: civicrm upgraded from 53b042e5 to a7c697e9
- 04:38 eileen: civicrm upgraded from 090cd474 to 53b042e5
- 03:21 eileen: config revision changed from 553c9c90 to 18e60944
- 03:01 eileen: civicrm upgraded from 3a637a8b to 090cd474
- 02:58 tstarling@deploy2002: Finished scap sync-world: Backport for recentchanges: Fix watchlistactivity=all, i.e. seen/unseen conflict (T408167) (duration: 10m 39s)
- 02:53 tstarling@deploy2002: tstarling: Continuing with sync
- 02:50 tstarling@deploy2002: tstarling: Backport for recentchanges: Fix watchlistactivity=all, i.e. seen/unseen conflict (T408167) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 02:47 tstarling@deploy2002: Started scap sync-world: Backport for recentchanges: Fix watchlistactivity=all, i.e. seen/unseen conflict (T408167)
- 00:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T407997)', diff saved to https://phabricator.wikimedia.org/P84810 and previous config saved to /var/cache/conftool/dbconfig/20251105-000151-marostegui.json
2025-11-04
- 23:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P84809 and previous config saved to /var/cache/conftool/dbconfig/20251104-234643-marostegui.json
- 23:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P84808 and previous config saved to /var/cache/conftool/dbconfig/20251104-233135-marostegui.json
- 23:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T407997)', diff saved to https://phabricator.wikimedia.org/P84807 and previous config saved to /var/cache/conftool/dbconfig/20251104-231628-marostegui.json
- 23:08 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 23:08 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 23:07 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 23:07 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 22:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T407997)', diff saved to https://phabricator.wikimedia.org/P84806 and previous config saved to /var/cache/conftool/dbconfig/20251104-225853-marostegui.json
- 22:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2222.codfw.wmnet with reason: Maintenance
- 22:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T407997)', diff saved to https://phabricator.wikimedia.org/P84805 and previous config saved to /var/cache/conftool/dbconfig/20251104-225829-marostegui.json
- 22:53 aaron@deploy2002: Finished scap sync-world: Backport for Add a wgRestSandboxSpecs entry for wikimedia.org (math) specs (T396805) (duration: 07m 48s)
- 22:49 aaron@deploy2002: aaron: Continuing with sync
- 22:47 aaron@deploy2002: aaron: Backport for Add a wgRestSandboxSpecs entry for wikimedia.org (math) specs (T396805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:45 aaron@deploy2002: Started scap sync-world: Backport for Add a wgRestSandboxSpecs entry for wikimedia.org (math) specs (T396805)
- 22:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P84804 and previous config saved to /var/cache/conftool/dbconfig/20251104-224321-marostegui.json
- 22:39 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
- 22:39 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
- 22:38 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
- 22:38 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
- 22:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P84803 and previous config saved to /var/cache/conftool/dbconfig/20251104-222814-marostegui.json
- 22:24 eileen: civicrm upgraded from ee0b5d3c to 3a637a8b
- 22:19 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1201731 T405808 (duration: 05m 39s)
- 22:14 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1201731 T405808
- 22:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T407997)', diff saved to https://phabricator.wikimedia.org/P84802 and previous config saved to /var/cache/conftool/dbconfig/20251104-221306-marostegui.json
- 21:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T407997)', diff saved to https://phabricator.wikimedia.org/P84801 and previous config saved to /var/cache/conftool/dbconfig/20251104-215649-marostegui.json
- 21:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2221.codfw.wmnet with reason: Maintenance
- 21:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T407997)', diff saved to https://phabricator.wikimedia.org/P84800 and previous config saved to /var/cache/conftool/dbconfig/20251104-215625-marostegui.json
- 21:48 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 21:48 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 21:43 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 21:43 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 21:43 eileen: civicrm upgraded from 2e7879c3 to ee0b5d3c
- 21:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P84799 and previous config saved to /var/cache/conftool/dbconfig/20251104-214117-marostegui.json
- 21:28 bvibber@deploy2002: Finished scap sync-world: Backport for cirrus: Start near match A/B test (T408154) (duration: 07m 53s)
- 21:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P84798 and previous config saved to /var/cache/conftool/dbconfig/20251104-212609-marostegui.json
- 21:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2009.codfw.wmnet with OS trixie
- 21:26 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 21:25 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 21:24 bvibber@deploy2002: bvibber, ebernhardson: Continuing with sync
- 21:24 bvibber@deploy2002: bvibber, ebernhardson: Backport for cirrus: Start near match A/B test (T408154) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:20 bvibber@deploy2002: Started scap sync-world: Backport for cirrus: Start near match A/B test (T408154)
- 21:18 bvibber@deploy2002: Finished scap sync-world: Backport for Guard against some null dereferences in CroppedImage (T409123 T409126), Guard against some null dereferences in CroppedImage (T409123 T409126) (duration: 11m 23s)
- 21:12 bvibber@deploy2002: bvibber: Continuing with sync
- 21:11 bvibber@deploy2002: bvibber: Backport for Guard against some null dereferences in CroppedImage (T409123 T409126), Guard against some null dereferences in CroppedImage (T409123 T409126) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T407997)', diff saved to https://phabricator.wikimedia.org/P84797 and previous config saved to /var/cache/conftool/dbconfig/20251104-211102-marostegui.json
- 21:08 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
- 21:07 bvibber@deploy2002: Started scap sync-world: Backport for Guard against some null dereferences in CroppedImage (T409123 T409126), Guard against some null dereferences in CroppedImage (T409123 T409126)
- 21:05 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
- 20:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2220 (T407997)', diff saved to https://phabricator.wikimedia.org/P84796 and previous config saved to /var/cache/conftool/dbconfig/20251104-205433-marostegui.json
- 20:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2220.codfw.wmnet with reason: Maintenance
- 20:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T407997)', diff saved to https://phabricator.wikimedia.org/P84795 and previous config saved to /var/cache/conftool/dbconfig/20251104-205420-marostegui.json
- 20:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
- 20:51 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2009.codfw.wmnet with OS trixie
- 20:43 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 20:43 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 20:41 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 20:41 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 20:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P84794 and previous config saved to /var/cache/conftool/dbconfig/20251104-203912-marostegui.json
- 20:34 eileen: civicrm upgraded from 77cad331 to 2e7879c3
- 20:29 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.1 refs T408271
- 20:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P84793 and previous config saved to /var/cache/conftool/dbconfig/20251104-202405-marostegui.json
- 20:17 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
- 20:16 eevans@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
- 20:13 jhuneidi@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.1 refs T408271 (duration: 12m 07s)
- 20:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T407997)', diff saved to https://phabricator.wikimedia.org/P84792 and previous config saved to /var/cache/conftool/dbconfig/20251104-200857-marostegui.json
- 20:01 jhuneidi@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.1 refs T408271
- 19:57 brett: import ncmonitor 3.0.0 into bookworm-wikimedia
- 19:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T407997)', diff saved to https://phabricator.wikimedia.org/P84791 and previous config saved to /var/cache/conftool/dbconfig/20251104-195203-marostegui.json
- 19:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2208.codfw.wmnet with reason: Maintenance
- 19:48 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ncmonitor1001.eqiad.wmnet
- 19:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
- 19:34 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
- 19:27 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
- 19:24 brett: import ncmonitor 3.0.0 into bookworm-wikimedia
- 19:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
- 19:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T407997)', diff saved to https://phabricator.wikimedia.org/P84790 and previous config saved to /var/cache/conftool/dbconfig/20251104-192142-marostegui.json
- 19:13 jhuneidi@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.1 refs T408271
- 19:09 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
- 19:09 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1220 (T403362)', diff saved to https://phabricator.wikimedia.org/P84789 and previous config saved to /var/cache/conftool/dbconfig/20251104-190946-ladsgroup.json
- 19:09 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
- 19:09 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
- 19:08 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
- 19:08 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
- 19:07 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
- 19:06 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
- 19:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P84788 and previous config saved to /var/cache/conftool/dbconfig/20251104-190634-marostegui.json
- 19:06 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
- 19:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
- 19:05 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
- 19:05 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
- 19:04 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 19:04 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 19:03 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 19:03 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 18:55 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 18:55 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 18:52 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 18:51 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 18:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P84785 and previous config saved to /var/cache/conftool/dbconfig/20251104-185126-marostegui.json
- 18:51 dancy@deploy2002: Installation of scap version "4.222.0" completed for 2 hosts
- 18:51 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 18:50 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 18:49 dancy@deploy2002: Installing scap version "4.222.0" for 2 host(s)
- 18:48 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 18:48 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 18:40 swfrench@deploy2002: Finished scap sync-world: Fully migrate mw-(api-int|jobrunner) to 8.3 - T405955 (duration: 07m 49s)
- 18:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T407997)', diff saved to https://phabricator.wikimedia.org/P84784 and previous config saved to /var/cache/conftool/dbconfig/20251104-183619-marostegui.json
- 18:32 swfrench@deploy2002: Started scap sync-world: Fully migrate mw-(api-int|jobrunner) to 8.3 - T405955
- 18:21 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudweb2002-dev.wikimedia.org with OS trixie
- 18:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T407997)', diff saved to https://phabricator.wikimedia.org/P84783 and previous config saved to /var/cache/conftool/dbconfig/20251104-181648-marostegui.json
- 18:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 18:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T407997)', diff saved to https://phabricator.wikimedia.org/P84782 and previous config saved to /var/cache/conftool/dbconfig/20251104-181623-marostegui.json
- 18:12 fnegri@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database thwikimedia (T409201)
- 18:09 jhuneidi@deploy2002: sync-world aborted: testwikis to 1.46.0-wmf.1 refs T408271 (duration: 03m 00s)
- 18:06 jhuneidi@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.1 refs T408271
- 18:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P84781 and previous config saved to /var/cache/conftool/dbconfig/20251104-180116-marostegui.json
- 17:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P84780 and previous config saved to /var/cache/conftool/dbconfig/20251104-174608-marostegui.json
- 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T407997)', diff saved to https://phabricator.wikimedia.org/P84779 and previous config saved to /var/cache/conftool/dbconfig/20251104-173100-marostegui.json
- 17:26 fnegri@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database tokwiki (T404570)
- 17:26 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database tokwiki (T404570)
- 17:24 fnegri@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database tokwiki (T404566)
- 17:24 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database tokwiki (T404566)
- 17:24 fnegri@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database tokwiki (T404703)
- 17:23 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database tokwiki (T404703)
- 17:23 fnegri@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database mswikiquote (T404703)
- 17:23 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database mswikiquote (T404703)
- 17:16 wfan: donorwiki upgraded from 09caf170 to 8fe00530
- 17:15 wfan: payments-wiki upgraded from 0132998e to 8fe00530
- 17:15 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database thwikimedia (T409201)
- 17:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T407997)', diff saved to https://phabricator.wikimedia.org/P84778 and previous config saved to /var/cache/conftool/dbconfig/20251104-171333-marostegui.json
- 17:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 17:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T407997)', diff saved to https://phabricator.wikimedia.org/P84777 and previous config saved to /var/cache/conftool/dbconfig/20251104-171320-marostegui.json
- 16:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P84776 and previous config saved to /var/cache/conftool/dbconfig/20251104-165812-marostegui.json
- 16:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P84775 and previous config saved to /var/cache/conftool/dbconfig/20251104-164304-marostegui.json
- 16:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T407997)', diff saved to https://phabricator.wikimedia.org/P84774 and previous config saved to /var/cache/conftool/dbconfig/20251104-162754-marostegui.json
- 16:27 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-eqiad
- 16:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 16:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 16:18 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 16:17 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 16:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (T407997)', diff saved to https://phabricator.wikimedia.org/P84773 and previous config saved to /var/cache/conftool/dbconfig/20251104-161027-marostegui.json
- 16:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 16:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T407997)', diff saved to https://phabricator.wikimedia.org/P84772 and previous config saved to /var/cache/conftool/dbconfig/20251104-161003-marostegui.json
- 16:08 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-eqiad
- 16:06 brennen@deploy2002: Finished deploy [phabricator/deployment@e9011f3]: deploy phab1004 for T409193 (duration: 02m 29s)
- 16:04 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
- 16:03 brennen@deploy2002: Started deploy [phabricator/deployment@e9011f3]: deploy phab1004 for T409193
- 16:03 brennen@deploy2002: Finished deploy [phabricator/deployment@e9011f3]: deploy phab2002 for T409193 (duration: 00m 31s)
- 16:03 brennen@deploy2002: Started deploy [phabricator/deployment@e9011f3]: deploy phab2002 for T409193
- 16:02 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
- 15:59 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:56 jhancock@cumin1003: START - Cookbook sre.dns.netbox
- 15:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P84771 and previous config saved to /var/cache/conftool/dbconfig/20251104-155455-marostegui.json
- 15:49 topranks: upgrade lsw1-c3-eqiad and lsw1-d3-eqiad to SR-Linux v24.10.4
- 15:47 marostegui@cumin1003: dbctl commit (dc=all): 'db2213 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P84770 and previous config saved to /var/cache/conftool/dbconfig/20251104-154755-root.json
- 15:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P84768 and previous config saved to /var/cache/conftool/dbconfig/20251104-153948-marostegui.json
- 15:39 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be1088.eqiad.wmnet with OS trixie
- 15:39 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:36 jhancock@cumin1003: START - Cookbook sre.dns.netbox
- 15:33 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudweb2002-dev.wikimedia.org with reason: host reimage
- 15:32 marostegui@cumin1003: dbctl commit (dc=all): 'db2213 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P84767 and previous config saved to /var/cache/conftool/dbconfig/20251104-153249-root.json
- 15:30 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudweb2002-dev.wikimedia.org with reason: host reimage
- 15:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T407997)', diff saved to https://phabricator.wikimedia.org/P84766 and previous config saved to /var/cache/conftool/dbconfig/20251104-152440-marostegui.json
- 15:18 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-codfw
- 15:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'db2213 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P84765 and previous config saved to /var/cache/conftool/dbconfig/20251104-151744-root.json
- 15:13 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 15:13 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudweb2002-dev.wikimedia.org with OS trixie
- 15:06 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2150 (T407997)', diff saved to https://phabricator.wikimedia.org/P84764 and previous config saved to /var/cache/conftool/dbconfig/20251104-150623-marostegui.json
- 15:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 15:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2192.codfw.wmnet with reason: Maintenance
- 15:01 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-codfw
- 15:00 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS trixie
- 14:58 fceratto@cumin1002: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test2001.codfw.wmnet
- 14:58 fceratto@cumin1002: END (ERROR) - Cookbook sre.mysql.clone (exit_code=97) of db2230.codfw.wmnet onto db-test2001.codfw.wmnet
- 14:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P84763 and previous config saved to /var/cache/conftool/dbconfig/20251104-145506-marostegui.json
- 14:48 Lucas_WMDE: lucaswerkmeister-wmde@deploy2002 $ printf 'https://en.wikipedia.org/static/images/mobile/copyright/wiktionary-%s-az.svg\n' tagline wordmark | mwscript-k8s --comment='T408147' --attach -- purgeList enwiki
- 14:42 Lucas_WMDE: UTC afternoon backport+config window done
- 14:41 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for azwiktionary: use new wordmark and tagline (T408147), Remove wmgULSPosition for special wikis (T400067) (duration: 09m 33s)
- 14:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T407997)', diff saved to https://phabricator.wikimedia.org/P84762 and previous config saved to /var/cache/conftool/dbconfig/20251104-143958-marostegui.json
- 14:37 lucaswerkmeister-wmde@deploy2002: ekrem, lucaswerkmeister-wmde, abi: Continuing with sync
- 14:37 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2007-dev.codfw.wmnet with OS trixie
- 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2213 (T407997)', diff saved to https://phabricator.wikimedia.org/P84761 and previous config saved to /var/cache/conftool/dbconfig/20251104-143546-marostegui.json
- 14:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2213.codfw.wmnet with reason: Maintenance
- 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T407997)', diff saved to https://phabricator.wikimedia.org/P84760 and previous config saved to /var/cache/conftool/dbconfig/20251104-143519-marostegui.json
- 14:34 lucaswerkmeister-wmde@deploy2002: ekrem, lucaswerkmeister-wmde, abi: Backport for azwiktionary: use new wordmark and tagline (T408147), Remove wmgULSPosition for special wikis (T400067) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:32 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for azwiktionary: use new wordmark and tagline (T408147), Remove wmgULSPosition for special wikis (T400067)
- 14:31 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 14:31 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 14:31 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 14:30 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 14:30 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 14:30 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 14:27 fceratto@cumin1002: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test2001.codfw.wmnet
- 14:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P84759 and previous config saved to /var/cache/conftool/dbconfig/20251104-142010-marostegui.json
- 14:18 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2007-dev.codfw.wmnet with reason: host reimage
- 14:15 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2007-dev.codfw.wmnet with reason: host reimage
- 14:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P84758 and previous config saved to /var/cache/conftool/dbconfig/20251104-140503-marostegui.json
- 13:57 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudnet2007-dev.codfw.wmnet with OS trixie
- 13:53 topranks: downgrade lsw1-c3-eqiad to SR-Linux v24.7.2
- 13:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T407997)', diff saved to https://phabricator.wikimedia.org/P84757 and previous config saved to /var/cache/conftool/dbconfig/20251104-134955-marostegui.json
- 13:45 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T407997)', diff saved to https://phabricator.wikimedia.org/P84756 and previous config saved to /var/cache/conftool/dbconfig/20251104-134545-marostegui.json
- 13:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2211.codfw.wmnet with reason: Maintenance
- 13:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2201.codfw.wmnet with reason: Maintenance
- 13:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T407997)', diff saved to https://phabricator.wikimedia.org/P84755 and previous config saved to /var/cache/conftool/dbconfig/20251104-134314-marostegui.json
- 13:41 moritzm: installing tiff security updates
- 13:35 marostegui@cumin1003: dbctl commit (dc=all): 'db1220 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P84754 and previous config saved to /var/cache/conftool/dbconfig/20251104-133526-root.json
- 13:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P84753 and previous config saved to /var/cache/conftool/dbconfig/20251104-132804-marostegui.json
- 13:20 marostegui@cumin1003: dbctl commit (dc=all): 'db1220 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P84752 and previous config saved to /var/cache/conftool/dbconfig/20251104-132019-root.json
- 13:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P84750 and previous config saved to /var/cache/conftool/dbconfig/20251104-131254-marostegui.json
- 13:05 marostegui@cumin1003: dbctl commit (dc=all): 'db1220 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P84749 and previous config saved to /var/cache/conftool/dbconfig/20251104-130512-root.json
- 12:59 topranks: downgrade lsw1-d3-eqiad to SR-Linux v24.10.1
- 12:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T407997)', diff saved to https://phabricator.wikimedia.org/P84748 and previous config saved to /var/cache/conftool/dbconfig/20251104-125745-marostegui.json
- 12:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2178 (T407997)', diff saved to https://phabricator.wikimedia.org/P84747 and previous config saved to /var/cache/conftool/dbconfig/20251104-125359-marostegui.json
- 12:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 12:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T407997)', diff saved to https://phabricator.wikimedia.org/P84746 and previous config saved to /var/cache/conftool/dbconfig/20251104-125335-marostegui.json
- 12:50 marostegui@cumin1003: dbctl commit (dc=all): 'db1220 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P84745 and previous config saved to /var/cache/conftool/dbconfig/20251104-125005-root.json
- 12:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1220 T409167', diff saved to https://phabricator.wikimedia.org/P84744 and previous config saved to /var/cache/conftool/dbconfig/20251104-124836-marostegui.json
- 12:48 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1237 to x1 primary T409167', diff saved to https://phabricator.wikimedia.org/P84743 and previous config saved to /var/cache/conftool/dbconfig/20251104-124803-marostegui.json
- 12:47 marostegui: Starting x1 eqiad failover from db1220 to db1237 - T409167
- 12:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: Primary switchover x1 T409167
- 12:45 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1237 with weight 0 T409167', diff saved to https://phabricator.wikimedia.org/P84742 and previous config saved to /var/cache/conftool/dbconfig/20251104-124556-marostegui.json
- 12:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P84741 and previous config saved to /var/cache/conftool/dbconfig/20251104-123827-marostegui.json
- 12:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P84740 and previous config saved to /var/cache/conftool/dbconfig/20251104-122320-marostegui.json
- 12:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T407997)', diff saved to https://phabricator.wikimedia.org/P84739 and previous config saved to /var/cache/conftool/dbconfig/20251104-120812-marostegui.json
- 12:08 fabfur: re-enable puppet on A:cp (T408060)
- 12:04 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2171 (T407997)', diff saved to https://phabricator.wikimedia.org/P84737 and previous config saved to /var/cache/conftool/dbconfig/20251104-120401-marostegui.json
- 12:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 12:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T407997)', diff saved to https://phabricator.wikimedia.org/P84736 and previous config saved to /var/cache/conftool/dbconfig/20251104-120338-marostegui.json
- 12:00 topranks: upgrade lsw1-d3-eqiad to SR-Linux v24.10.3
- 11:57 fabfur: temporarily disable puppet on A:cp to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1199247 (T408060)
- 11:52 marostegui@cumin1003: dbctl commit (dc=all): 'db1192 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84735 and previous config saved to /var/cache/conftool/dbconfig/20251104-115217-root.json
- 11:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P84734 and previous config saved to /var/cache/conftool/dbconfig/20251104-114830-marostegui.json
- 11:47 marostegui@cumin1003: dbctl commit (dc=all): 'db2216 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84733 and previous config saved to /var/cache/conftool/dbconfig/20251104-114712-root.json
- 11:38 moritzm: installing Java 8 security updates on Bullseye
- 11:37 marostegui@cumin1003: dbctl commit (dc=all): 'db1192 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84732 and previous config saved to /var/cache/conftool/dbconfig/20251104-113711-root.json
- 11:33 hashar: Upgrading and restarting CI Jenkins | T404856
- 11:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P84731 and previous config saved to /var/cache/conftool/dbconfig/20251104-113322-marostegui.json
- 11:32 marostegui@cumin1003: dbctl commit (dc=all): 'db2216 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84730 and previous config saved to /var/cache/conftool/dbconfig/20251104-113205-root.json
- 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'db1192 (re)pooling @ 50%: 10', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20251104-112201-root.json
- 11:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T407997)', diff saved to https://phabricator.wikimedia.org/P84728 and previous config saved to /var/cache/conftool/dbconfig/20251104-111814-marostegui.json
- 11:16 marostegui@cumin1003: dbctl commit (dc=all): 'db2216 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84727 and previous config saved to /var/cache/conftool/dbconfig/20251104-111658-root.json
- 11:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T407997)', diff saved to https://phabricator.wikimedia.org/P84726 and previous config saved to /var/cache/conftool/dbconfig/20251104-111401-marostegui.json
- 11:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 11:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
- 11:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 11:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 11:06 marostegui@cumin1003: dbctl commit (dc=all): 'db1192 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84725 and previous config saved to /var/cache/conftool/dbconfig/20251104-110655-root.json
- 11:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T407997)', diff saved to https://phabricator.wikimedia.org/P84724 and previous config saved to /var/cache/conftool/dbconfig/20251104-110643-marostegui.json
- 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'db2216 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84723 and previous config saved to /var/cache/conftool/dbconfig/20251104-110152-root.json
- 10:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1192 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84722 and previous config saved to /var/cache/conftool/dbconfig/20251104-105851-marostegui.json
- 10:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1192.eqiad.wmnet with reason: Maintenance
- 10:54 moritzm: uploaded openjdk-8 8u472-ga-1~deb11u1 to apt.wikimedia.org (forward port of latest Java 8 security updates)
- 10:53 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2216 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84721 and previous config saved to /var/cache/conftool/dbconfig/20251104-105339-marostegui.json
- 10:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2216.codfw.wmnet with reason: Maintenance
- 10:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P84720 and previous config saved to /var/cache/conftool/dbconfig/20251104-105136-marostegui.json
- 10:42 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
- 10:40 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:40 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:40 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:40 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:39 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:39 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P84719 and previous config saved to /var/cache/conftool/dbconfig/20251104-103629-marostegui.json
- 10:25 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
- 10:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T407997)', diff saved to https://phabricator.wikimedia.org/P84718 and previous config saved to /var/cache/conftool/dbconfig/20251104-102121-marostegui.json
- 10:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1230 (T407997)', diff saved to https://phabricator.wikimedia.org/P84717 and previous config saved to /var/cache/conftool/dbconfig/20251104-101845-marostegui.json
- 10:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 10:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T407997)', diff saved to https://phabricator.wikimedia.org/P84716 and previous config saved to /var/cache/conftool/dbconfig/20251104-101713-marostegui.json
- 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P84715 and previous config saved to /var/cache/conftool/dbconfig/20251104-100206-marostegui.json
- 10:01 btullis@deploy2002: Finished deploy [analytics/hdfs-tools/deploy@bb26b34]: Deploying after updating targets (duration: 00m 24s)
- 10:01 btullis@deploy2002: Started deploy [analytics/hdfs-tools/deploy@bb26b34]: Deploying after updating targets
- 09:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P84714 and previous config saved to /var/cache/conftool/dbconfig/20251104-094658-marostegui.json
- 09:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T407997)', diff saved to https://phabricator.wikimedia.org/P84713 and previous config saved to /var/cache/conftool/dbconfig/20251104-093148-marostegui.json
- 09:29 ozge@deploy2002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1207 (T407997)', diff saved to https://phabricator.wikimedia.org/P84712 and previous config saved to /var/cache/conftool/dbconfig/20251104-092913-marostegui.json
- 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 09:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T407997)', diff saved to https://phabricator.wikimedia.org/P84711 and previous config saved to /var/cache/conftool/dbconfig/20251104-092850-marostegui.json
- 09:28 ozge@deploy2002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 09:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P84710 and previous config saved to /var/cache/conftool/dbconfig/20251104-091342-marostegui.json
- 08:59 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS bookworm
- 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P84709 and previous config saved to /var/cache/conftool/dbconfig/20251104-085834-marostegui.json
- 08:55 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS bookworm
- 08:54 moritzm: installing squid security updates
- 08:54 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:53 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:53 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:50 marostegui@cumin1003: dbctl commit (dc=all): 'db1178 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84708 and previous config saved to /var/cache/conftool/dbconfig/20251104-085043-root.json
- 08:47 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T407997)', diff saved to https://phabricator.wikimedia.org/P84707 and previous config saved to /var/cache/conftool/dbconfig/20251104-084327-marostegui.json
- 08:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1200 (T407997)', diff saved to https://phabricator.wikimedia.org/P84706 and previous config saved to /var/cache/conftool/dbconfig/20251104-084056-marostegui.json
- 08:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 08:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T407997)', diff saved to https://phabricator.wikimedia.org/P84705 and previous config saved to /var/cache/conftool/dbconfig/20251104-084032-marostegui.json
- 08:35 marostegui@cumin1003: dbctl commit (dc=all): 'db1178 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84704 and previous config saved to /var/cache/conftool/dbconfig/20251104-083538-root.json
- 08:29 dcausse: UTC morning backport window done
- 08:29 dcausse@deploy2002: Finished scap sync-world: Backport for Revert^3 "cirrus: enable completion search with defaultsort A/B test" (duration: 09m 20s)
- 08:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P84703 and previous config saved to /var/cache/conftool/dbconfig/20251104-082525-marostegui.json
- 08:24 dcausse@deploy2002: dcausse: Continuing with sync
- 08:22 marostegui@cumin1003: dbctl commit (dc=all): 'db2176 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84702 and previous config saved to /var/cache/conftool/dbconfig/20251104-082226-root.json
- 08:21 dcausse@deploy2002: dcausse: Backport for Revert^3 "cirrus: enable completion search with defaultsort A/B test" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:20 marostegui@cumin1003: dbctl commit (dc=all): 'db1178 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84701 and previous config saved to /var/cache/conftool/dbconfig/20251104-082031-root.json
- 08:19 dcausse@deploy2002: Started scap sync-world: Backport for Revert^3 "cirrus: enable completion search with defaultsort A/B test"
- 08:14 tchanders@deploy2002: Finished scap sync-world: Backport for Deploy temporary accounts to enwiki (T409079) (duration: 12m 22s)
- 08:10 ozge@deploy2002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 08:10 tchanders@deploy2002: tchanders, stran: Continuing with sync
- 08:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P84700 and previous config saved to /var/cache/conftool/dbconfig/20251104-081017-marostegui.json
- 08:08 ozge@deploy2002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'db2176 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84699 and previous config saved to /var/cache/conftool/dbconfig/20251104-080719-root.json
- 08:05 marostegui@cumin1003: dbctl commit (dc=all): 'db1178 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84698 and previous config saved to /var/cache/conftool/dbconfig/20251104-080522-root.json
- 08:04 tchanders@deploy2002: tchanders, stran: Backport for Deploy temporary accounts to enwiki (T409079) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:02 tchanders@deploy2002: Started scap sync-world: Backport for Deploy temporary accounts to enwiki (T409079)
- 08:02 ozge@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 08:00 ozge@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 07:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1178 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84697 and previous config saved to /var/cache/conftool/dbconfig/20251104-075718-marostegui.json
- 07:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1178.eqiad.wmnet with reason: Maintenance
- 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T407997)', diff saved to https://phabricator.wikimedia.org/P84696 and previous config saved to /var/cache/conftool/dbconfig/20251104-075510-marostegui.json
- 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1185 (T407997)', diff saved to https://phabricator.wikimedia.org/P84695 and previous config saved to /var/cache/conftool/dbconfig/20251104-075239-marostegui.json
- 07:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'db2176 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84694 and previous config saved to /var/cache/conftool/dbconfig/20251104-075213-root.json
- 07:48 ozge@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 07:47 ozge@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'db2176 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84693 and previous config saved to /var/cache/conftool/dbconfig/20251104-073707-root.json
- 07:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2176 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84692 and previous config saved to /var/cache/conftool/dbconfig/20251104-072854-marostegui.json
- 07:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 07:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P84691 and previous config saved to /var/cache/conftool/dbconfig/20251104-072201-marostegui.json
- 07:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T407997)', diff saved to https://phabricator.wikimedia.org/P84690 and previous config saved to /var/cache/conftool/dbconfig/20251104-070653-marostegui.json
- 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (T407997)', diff saved to https://phabricator.wikimedia.org/P84689 and previous config saved to /var/cache/conftool/dbconfig/20251104-070356-marostegui.json
- 07:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 07:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T407997)', diff saved to https://phabricator.wikimedia.org/P84688 and previous config saved to /var/cache/conftool/dbconfig/20251104-070311-marostegui.json
- 06:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P84687 and previous config saved to /var/cache/conftool/dbconfig/20251104-064803-marostegui.json
- 06:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P84686 and previous config saved to /var/cache/conftool/dbconfig/20251104-063253-marostegui.json
- 06:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T407997)', diff saved to https://phabricator.wikimedia.org/P84685 and previous config saved to /var/cache/conftool/dbconfig/20251104-061745-marostegui.json
- 06:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T407997)', diff saved to https://phabricator.wikimedia.org/P84684 and previous config saved to /var/cache/conftool/dbconfig/20251104-061449-marostegui.json
- 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1159.eqiad.wmnet with reason: Maintenance
- 06:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2204.codfw.wmnet with reason: Maintenance
- 05:02 mwpresync@deploy2002: Pruned MediaWiki: 1.45.0-wmf.23 (duration: 02m 28s)
- 04:51 eileen: civicrm upgraded from c9f9d2b5 to 77cad331
- 03:03 inflatador: bking@cumin2002 restart wdqs-blazegraph.service in CODFW to apply 1201326 T409132
- 02:30 eileen: civicrm upgraded from 1c0619b6 to c9f9d2b5
- 00:58 eileen: civicrm upgraded from 025f3ef3 to 1c0619b6
- 00:32 zabe@deploy2002: Finished scap sync-world: Backport for Using Hadoop for MostTranscludedPages on enwiki (T309738) (duration: 09m 05s)
- 00:26 zabe@deploy2002: zabe: Continuing with sync
- 00:25 zabe@deploy2002: zabe: Backport for Using Hadoop for MostTranscludedPages on enwiki (T309738) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:23 zabe@deploy2002: Started scap sync-world: Backport for Using Hadoop for MostTranscludedPages on enwiki (T309738)
- 00:10 cdanis@dns1004: END - running authdns-update
- 00:09 cdanis@dns1004: START - running authdns-update
- 00:05 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
- 00:04 dzahn@dns1004: END - running authdns-update
- 00:03 dzahn@dns1004: START - running authdns-update
2025-11-03
- 23:40 eileen: civicrm upgraded from b0c68b4a to 025f3ef3
- 23:01 inflatador: bking@cumin2002 repool wdqs2008 and wdqs2012
- 22:56 inflatador: bking@cumin2002 depool wdqs2008 and wdqs2012 so they can catch up on lag
- 22:54 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
- 22:54 ryankemper@cumin2002: END (ERROR) - Cookbook sre.wdqs.restart (exit_code=97)
- 22:54 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
- 22:51 ryankemper: [WDQS] Restarting all codfw wdqs-main hosts; we're getting slammed by increased triple count (same issue we've been seeing intermittently for a week or two)
- 22:28 eileen: civicrm upgraded from 29d3c24f to b0c68b4a
- 22:16 arlolra@deploy2002: Finished scap sync-world: Backport for Deploy Parsoid Read Views to 7 wikis (T408765) (duration: 08m 01s)
- 22:11 arlolra@deploy2002: arlolra: Continuing with sync
- 22:10 arlolra@deploy2002: arlolra: Backport for Deploy Parsoid Read Views to 7 wikis (T408765) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:08 arlolra@deploy2002: Started scap sync-world: Backport for Deploy Parsoid Read Views to 7 wikis (T408765)
- 22:07 inflatador: bking@cumin2002 suppress wdqs2009 alerts for next 90 days T409117
- 22:06 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 90 days, 0:00:00 on wdqs2009.codfw.wmnet with reason: no SLO for this endpoint
- 22:01 arlolra@deploy2002: Finished scap sync-world: Backport for [enwikivoyage] Enable block feature for AbuseFilter (T408885), zhwiki: Add SecurePoll Rights to CheckUser (T408902) (duration: 07m 05s)
- 21:56 arlolra@deploy2002: superpes, zhaofjx, arlolra: Continuing with sync
- 21:56 arlolra@deploy2002: superpes, zhaofjx, arlolra: Backport for [enwikivoyage] Enable block feature for AbuseFilter (T408885), zhwiki: Add SecurePoll Rights to CheckUser (T408902) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:54 arlolra@deploy2002: Started scap sync-world: Backport for [enwikivoyage] Enable block feature for AbuseFilter (T408885), zhwiki: Add SecurePoll Rights to CheckUser (T408902)
- 21:46 kemayo@deploy2002: Finished scap sync-world: Backport for Edit check: allow MWVE_FORCE_EDIT_CHECK_ENABLED to override ecenable (T408890) (duration: 09m 21s)
- 21:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T407997)', diff saved to https://phabricator.wikimedia.org/P84683 and previous config saved to /var/cache/conftool/dbconfig/20251103-214610-marostegui.json
- 21:42 kemayo@deploy2002: kemayo: Continuing with sync
- 21:39 kemayo@deploy2002: kemayo: Backport for Edit check: allow MWVE_FORCE_EDIT_CHECK_ENABLED to override ecenable (T408890) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:37 kemayo@deploy2002: Started scap sync-world: Backport for Edit check: allow MWVE_FORCE_EDIT_CHECK_ENABLED to override ecenable (T408890)
- 21:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P84682 and previous config saved to /var/cache/conftool/dbconfig/20251103-213102-marostegui.json
- 21:30 eileen: civicrm upgraded from 443ec62e to 29d3c24f
- 21:25 aaron@deploy2002: Finished scap sync-world: Backport for Set wgRestSandboxSpecs['wmf-restbase'] to use the static specs everywhere (T396805) (duration: 07m 31s)
- 21:21 aaron@deploy2002: aaron: Continuing with sync
- 21:20 aaron@deploy2002: aaron: Backport for Set wgRestSandboxSpecs['wmf-restbase'] to use the static specs everywhere (T396805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:18 aaron@deploy2002: Started scap sync-world: Backport for Set wgRestSandboxSpecs['wmf-restbase'] to use the static specs everywhere (T396805)
- 21:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P84681 and previous config saved to /var/cache/conftool/dbconfig/20251103-211552-marostegui.json
- 21:15 aaron@deploy2002: Finished scap sync-world: Backport for Set wgRestSandboxSpecs['wmf-restbase'] on testwiki to use the static specs (T396805) (duration: 07m 16s)
- 21:14 eileen: civicrm upgraded from 443ec62e to 29d3c24f
- 21:11 aaron@deploy2002: aaron: Continuing with sync
- 21:10 aaron@deploy2002: aaron: Backport for Set wgRestSandboxSpecs['wmf-restbase'] on testwiki to use the static specs (T396805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:08 aaron@deploy2002: Started scap sync-world: Backport for Set wgRestSandboxSpecs['wmf-restbase'] on testwiki to use the static specs (T396805)
- 21:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T407997)', diff saved to https://phabricator.wikimedia.org/P84680 and previous config saved to /var/cache/conftool/dbconfig/20251103-210044-marostegui.json
- 20:54 eileen: civicrm upgraded from 66c0e233 to 443ec62e
- 20:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2238 (T407997)', diff saved to https://phabricator.wikimedia.org/P84679 and previous config saved to /var/cache/conftool/dbconfig/20251103-204844-marostegui.json
- 20:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2238.codfw.wmnet with reason: Maintenance
- 20:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T407997)', diff saved to https://phabricator.wikimedia.org/P84678 and previous config saved to /var/cache/conftool/dbconfig/20251103-204820-marostegui.json
- 20:39 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 20:39 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 20:39 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 20:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 20:38 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 20:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 20:38 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 20:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 20:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P84677 and previous config saved to /var/cache/conftool/dbconfig/20251103-203312-marostegui.json
- 20:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 20:31 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 20:31 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 20:31 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 20:27 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 20:26 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 20:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P84676 and previous config saved to /var/cache/conftool/dbconfig/20251103-201803-marostegui.json
- 20:17 kharlan@deploy2002: Finished scap sync-world: Backport for Hooks: Fetch correct SimpleCaptcha instance in onEditPage__attemptSave_after (T408975) (duration: 07m 22s)
- 20:17 eileen: civicrm upgraded from ed25fa88 to 66c0e233
- 20:13 kharlan@deploy2002: kharlan: Continuing with sync
- 20:12 kharlan@deploy2002: kharlan: Backport for Hooks: Fetch correct SimpleCaptcha instance in onEditPage__attemptSave_after (T408975) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:10 kharlan@deploy2002: Started scap sync-world: Backport for Hooks: Fetch correct SimpleCaptcha instance in onEditPage__attemptSave_after (T408975)
- 20:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T407997)', diff saved to https://phabricator.wikimedia.org/P84675 and previous config saved to /var/cache/conftool/dbconfig/20251103-200255-marostegui.json
- 20:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2226 (T407997)', diff saved to https://phabricator.wikimedia.org/P84674 and previous config saved to /var/cache/conftool/dbconfig/20251103-200030-marostegui.json
- 20:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2226.codfw.wmnet with reason: Maintenance
- 20:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T407997)', diff saved to https://phabricator.wikimedia.org/P84673 and previous config saved to /var/cache/conftool/dbconfig/20251103-200006-marostegui.json
- 19:58 kharlan@deploy2002: Finished scap sync-world: Backport for SimpleCaptcha: Ensure correct instance is used on page creation (T408975) (duration: 07m 22s)
- 19:53 kharlan@deploy2002: kharlan: Continuing with sync
- 19:52 kharlan@deploy2002: kharlan: Backport for SimpleCaptcha: Ensure correct instance is used on page creation (T408975) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:50 kharlan@deploy2002: Started scap sync-world: Backport for SimpleCaptcha: Ensure correct instance is used on page creation (T408975)
- 19:45 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 19:45 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P84672 and previous config saved to /var/cache/conftool/dbconfig/20251103-194457-marostegui.json
- 19:37 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: use ve.newTarget hook to avoid globals (T408670) (duration: 07m 47s)
- 19:32 kharlan@deploy2002: kharlan: Continuing with sync
- 19:31 kharlan@deploy2002: kharlan: Backport for hCaptcha: use ve.newTarget hook to avoid globals (T408670) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P84670 and previous config saved to /var/cache/conftool/dbconfig/20251103-192950-marostegui.json
- 19:29 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: use ve.newTarget hook to avoid globals (T408670)
- 19:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T407997)', diff saved to https://phabricator.wikimedia.org/P84669 and previous config saved to /var/cache/conftool/dbconfig/20251103-191442-marostegui.json
- 19:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2225 (T407997)', diff saved to https://phabricator.wikimedia.org/P84668 and previous config saved to /var/cache/conftool/dbconfig/20251103-190237-marostegui.json
- 19:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2225.codfw.wmnet with reason: Maintenance
- 19:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T407997)', diff saved to https://phabricator.wikimedia.org/P84667 and previous config saved to /var/cache/conftool/dbconfig/20251103-190214-marostegui.json
- 18:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P84666 and previous config saved to /var/cache/conftool/dbconfig/20251103-184706-marostegui.json
- 18:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
- 18:37 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
- 18:36 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
- 18:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
- 18:36 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 18:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 18:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 18:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
- 18:32 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
- 18:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P84665 and previous config saved to /var/cache/conftool/dbconfig/20251103-183159-marostegui.json
- 18:31 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
- 18:30 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
- 18:30 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 18:30 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 18:29 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 18:29 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 18:22 swfrench@deploy2002: Finished scap sync-world: Backport for Enroll 100% of client sessions in PHP 8.3 (T405955) (duration: 07m 34s)
- 18:17 swfrench@deploy2002: swfrench: Continuing with sync
- 18:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T407997)', diff saved to https://phabricator.wikimedia.org/P84664 and previous config saved to /var/cache/conftool/dbconfig/20251103-181650-marostegui.json
- 18:16 swfrench@deploy2002: swfrench: Backport for Enroll 100% of client sessions in PHP 8.3 (T405955) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:14 swfrench@deploy2002: Started scap sync-world: Backport for Enroll 100% of client sessions in PHP 8.3 (T405955)
- 18:10 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:10 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:10 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:08 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:06 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:05 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:05 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2207 (T407997)', diff saved to https://phabricator.wikimedia.org/P84663 and previous config saved to /var/cache/conftool/dbconfig/20251103-180500-marostegui.json
- 18:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance
- 17:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2197.codfw.wmnet with reason: Maintenance
- 17:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T407997)', diff saved to https://phabricator.wikimedia.org/P84662 and previous config saved to /var/cache/conftool/dbconfig/20251103-175448-marostegui.json
- 17:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:47 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:47 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P84661 and previous config saved to /var/cache/conftool/dbconfig/20251103-173940-marostegui.json
- 17:39 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be1088.eqiad.wmnet with OS trixie
- 17:29 _joe_: ran reprepro cleanvanished on apt-staging to try to clean hanging deb file
- 17:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P84660 and previous config saved to /var/cache/conftool/dbconfig/20251103-172433-marostegui.json
- 17:23 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2203.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T407997)', diff saved to https://phabricator.wikimedia.org/P84659 and previous config saved to /var/cache/conftool/dbconfig/20251103-170924-marostegui.json
- 17:07 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker2203.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:00 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 16:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2189 (T407997)', diff saved to https://phabricator.wikimedia.org/P84658 and previous config saved to /var/cache/conftool/dbconfig/20251103-165733-marostegui.json
- 16:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2189.codfw.wmnet with reason: Maintenance
- 16:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T407997)', diff saved to https://phabricator.wikimedia.org/P84657 and previous config saved to /var/cache/conftool/dbconfig/20251103-165709-marostegui.json
- 16:56 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 16:51 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:51 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for CR interfaces eqiad row D vlans - cmooney@cumin1003"
- 16:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for CR interfaces eqiad row D vlans - cmooney@cumin1003"
- 16:49 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:48 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 16:43 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS trixie
- 16:42 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be1088.eqiad.wmnet with OS trixie
- 16:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P84656 and previous config saved to /var/cache/conftool/dbconfig/20251103-164200-marostegui.json
- 16:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2078.codfw.wmnet with OS bullseye
- 16:36 reedy@deploy2002: Synchronized wmf-config/CommonSettings.php: T404806 (duration: 06m 27s)
- 16:32 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Apply JVM upgrade to 11.0.29 - eevans@cumin1003
- 16:27 topranks: make cr2-eqiad active for row D vlan sub-interfaces on et-1/0/5 T409067
- 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P84655 and previous config saved to /var/cache/conftool/dbconfig/20251103-162649-marostegui.json
- 16:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 16:22 topranks: enable row D vlan sub-interfaces on cr2-eqiad et-1/0/5 T409067
- 16:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2078.codfw.wmnet with reason: host reimage
- 16:18 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 16:12 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2078.codfw.wmnet with reason: host reimage
- 16:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T407997)', diff saved to https://phabricator.wikimedia.org/P84653 and previous config saved to /var/cache/conftool/dbconfig/20251103-161142-marostegui.json
- 16:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS trixie
- 16:04 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be1088.eqiad.wmnet with OS trixie
- 15:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2175 (T407997)', diff saved to https://phabricator.wikimedia.org/P84652 and previous config saved to /var/cache/conftool/dbconfig/20251103-155902-marostegui.json
- 15:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 15:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T407997)', diff saved to https://phabricator.wikimedia.org/P84651 and previous config saved to /var/cache/conftool/dbconfig/20251103-155838-marostegui.json
- 15:57 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Apply JVM upgrade to 11.0.29 - eevans@cumin1003
- 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
- 15:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2078
- 15:54 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2078
- 15:54 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2078.codfw.wmnet with OS bullseye
- 15:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
- 15:51 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on ms-be[2085-2087].codfw.wmnet with reason: awaiting controller swap
- 15:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P84650 and previous config saved to /var/cache/conftool/dbconfig/20251103-154330-marostegui.json
- 15:31 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es2031.codfw.wmnet
- 15:31 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:31 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2031.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 15:31 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2031.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 15:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P84649 and previous config saved to /var/cache/conftool/dbconfig/20251103-152822-marostegui.json
- 15:26 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 15:21 fceratto@cumin1003: START - Cookbook sre.hosts.decommission for hosts es2031.codfw.wmnet
- 15:19 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 15:16 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 15:15 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es2030.codfw.wmnet
- 15:15 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:15 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2030.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 15:14 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2030.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 15:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T407997)', diff saved to https://phabricator.wikimedia.org/P84648 and previous config saved to /var/cache/conftool/dbconfig/20251103-151315-marostegui.json
- 15:05 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 15:05 topranks: enable link from asw2-d7-eqiad to ssw1-d8-eqiad T409067
- 15:03 Lucas_WMDE: UTC afternoon backport+config window done
- 15:03 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for i18n: all behavior switches should start/end with __ (part 2), i18n: Remove deprecated behavior switches without underscores in et/sh-latn/vep (T407289) (duration: 09m 45s)
- 15:02 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS trixie
- 15:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2148 (T407997)', diff saved to https://phabricator.wikimedia.org/P84647 and previous config saved to /var/cache/conftool/dbconfig/20251103-150029-marostegui.json
- 15:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 14:58 lucaswerkmeister-wmde@deploy2002: cscott, lucaswerkmeister-wmde: Continuing with sync
- 14:58 fceratto@cumin1003: START - Cookbook sre.hosts.decommission for hosts es2030.codfw.wmnet
- 14:57 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS bookworm
- 14:56 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es2029.codfw.wmnet
- 14:56 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:56 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2029.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 14:56 topranks: disable et-1/1/3 on cr2-eqiad connecting to asw2-d-eqiad T409067
- 14:56 lucaswerkmeister-wmde@deploy2002: cscott, lucaswerkmeister-wmde: Backport for i18n: all behavior switches should start/end with __ (part 2), i18n: Remove deprecated behavior switches without underscores in et/sh-latn/vep (T407289) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:55 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS bookworm
- 14:53 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for i18n: all behavior switches should start/end with __ (part 2), i18n: Remove deprecated behavior switches without underscores in et/sh-latn/vep (T407289)
- 14:50 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2029.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 14:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repool db1259', diff saved to https://phabricator.wikimedia.org/P84646 and previous config saved to /var/cache/conftool/dbconfig/20251103-145018-marostegui.json
- 14:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 14:48 topranks: make cr1-eqiad VRRP primary for row D vlans T409067
- 14:47 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for upload: Remove stashed file in UploadFromStash when upload completed (T408610), recentchanges: Fix highlights where more than one action is defined (T409020) (duration: 12m 10s)
- 14:44 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 14:42 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, matmarex: Continuing with sync
- 14:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P84643 and previous config saved to /var/cache/conftool/dbconfig/20251103-144215-marostegui.json
- 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Cleanup T408408 T408409 T408410', diff saved to https://phabricator.wikimedia.org/P84642 and previous config saved to /var/cache/conftool/dbconfig/20251103-144204-fceratto.json
- 14:39 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
- 14:39 topranks: enable cr1-eqiad sub-interfaces for row D vlans T409067
- 14:39 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, matmarex: Backport for upload: Remove stashed file in UploadFromStash when upload completed (T408610), recentchanges: Fix highlights where more than one action is defined (T409020) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:37 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1007.eqiad.wmnet with reason: schema change
- 14:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 14:35 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for upload: Remove stashed file in UploadFromStash when upload completed (T408610), recentchanges: Fix highlights where more than one action is defined (T409020)
- 14:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1009.eqiad.wmnet with reason: schema change
- 14:34 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Enable pagination on Special:EditWatchlist everywhere (T41510) (duration: 12m 08s)
- 14:29 lucaswerkmeister-wmde@deploy2002: cparle, lucaswerkmeister-wmde: Continuing with sync
- 14:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:28 fceratto@cumin1003: START - Cookbook sre.hosts.decommission for hosts es2029.codfw.wmnet
- 14:26 lucaswerkmeister-wmde@deploy2002: cparle, lucaswerkmeister-wmde: Backport for Enable pagination on Special:EditWatchlist everywhere (T41510) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:22 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Enable pagination on Special:EditWatchlist everywhere (T41510)
- 14:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P84641 and previous config saved to /var/cache/conftool/dbconfig/20251103-142204-marostegui.json
- 14:20 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Revert "Adding Movepage-summary to wgForceUIMsgAsContentMsg to allow" (T183848), Freeze LiquidThreads on huwiki and svwikisource (T406026 T406227) (duration: 14m 16s)
- 14:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) es2031 - Depool es2031 T408410
- 14:16 fceratto@cumin1003: START - Cookbook sre.mysql.depool es2031 - Depool es2031 T408410
- 14:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) es2030 - Depool es2030 T408409
- 14:16 fceratto@cumin1003: START - Cookbook sre.mysql.depool es2030 - Depool es2030 T408409
- 14:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) es2030 - Depool es2030 T408409
- 14:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool es2030 - Depool es2030 T408409
- 14:12 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, esanders, func: Continuing with sync
- 14:10 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, esanders, func: Backport for Revert "Adding Movepage-summary to wgForceUIMsgAsContentMsg to allow" (T183848), Freeze LiquidThreads on huwiki and svwikisource (T406026 T406227) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T407997)', diff saved to https://phabricator.wikimedia.org/P84638 and previous config saved to /var/cache/conftool/dbconfig/20251103-140653-marostegui.json
- 14:05 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Revert "Adding Movepage-summary to wgForceUIMsgAsContentMsg to allow" (T183848), Freeze LiquidThreads on huwiki and svwikisource (T406026 T406227)
- 14:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 13:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 13:56 topranks: shut down cr1-eqiad link to asw2-d-eqiad to migrate traffic via Nokia spines T409067
- 13:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) es2029 - Depool es2029 T408408
- 13:55 fceratto@cumin1003: START - Cookbook sre.mysql.depool es2029 - Depool es2029 T408408
- 13:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1259 (T407997)', diff saved to https://phabricator.wikimedia.org/P84636 and previous config saved to /var/cache/conftool/dbconfig/20251103-135400-marostegui.json
- 13:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
- 13:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T407997)', diff saved to https://phabricator.wikimedia.org/P84635 and previous config saved to /var/cache/conftool/dbconfig/20251103-135336-marostegui.json
- 13:53 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on asw2-d-eqiad,cr[1-2]-eqiad with reason: moving uplinks from CRs to Nokia Spines on asw2-d-eqiad
- 13:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 13:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 13:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P84634 and previous config saved to /var/cache/conftool/dbconfig/20251103-133828-marostegui.json
- 13:33 fceratto@cumin1003: dbctl commit (dc=all): 'Update masters for T402859', diff saved to https://phabricator.wikimedia.org/P84633 and previous config saved to /var/cache/conftool/dbconfig/20251103-133342-fceratto.json
- 13:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P84632 and previous config saved to /var/cache/conftool/dbconfig/20251103-132320-marostegui.json
- 13:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T407997)', diff saved to https://phabricator.wikimedia.org/P84631 and previous config saved to /var/cache/conftool/dbconfig/20251103-130812-marostegui.json
- 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Update masters for T402859', diff saved to https://phabricator.wikimedia.org/P84630 and previous config saved to /var/cache/conftool/dbconfig/20251103-130011-fceratto.json
- 12:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1254 (T407997)', diff saved to https://phabricator.wikimedia.org/P84629 and previous config saved to /var/cache/conftool/dbconfig/20251103-125643-marostegui.json
- 12:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
- 12:55 fceratto@dns1004: END - running authdns-update
- 12:54 fceratto@dns1004: START - running authdns-update
- 12:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 12:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T407997)', diff saved to https://phabricator.wikimedia.org/P84628 and previous config saved to /var/cache/conftool/dbconfig/20251103-124632-marostegui.json
- 12:35 topranks: move analytics1-c-eqiad gateway IPs to new spine switch port cr2-eqiad T405579
- 12:33 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:33 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for analytics1-c-eqiad IPs cr1-eqiad - cmooney@cumin1003"
- 12:33 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for analytics1-c-eqiad IPs cr1-eqiad - cmooney@cumin1003"
- 12:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P84627 and previous config saved to /var/cache/conftool/dbconfig/20251103-123125-marostegui.json
- 12:27 topranks: adjust VRRP priority for analytics1-d-eqiad to make cr1-eqiad active gateway T405579
- 12:26 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 12:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P84626 and previous config saved to /var/cache/conftool/dbconfig/20251103-121617-marostegui.json
- 12:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T407997)', diff saved to https://phabricator.wikimedia.org/P84625 and previous config saved to /var/cache/conftool/dbconfig/20251103-120108-marostegui.json
- 11:58 topranks: move analytics1-c-eqiad gateway IPs to new spine switch ports eqiad T405579
- 11:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1233 (T407997)', diff saved to https://phabricator.wikimedia.org/P84624 and previous config saved to /var/cache/conftool/dbconfig/20251103-114913-marostegui.json
- 11:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 11:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T407997)', diff saved to https://phabricator.wikimedia.org/P84623 and previous config saved to /var/cache/conftool/dbconfig/20251103-114849-marostegui.json
- 11:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P84622 and previous config saved to /var/cache/conftool/dbconfig/20251103-113341-marostegui.json
- 11:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
- 11:28 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 11:27 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 11:18 brouberol@dns1004: END - running authdns-update
- 11:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P84621 and previous config saved to /var/cache/conftool/dbconfig/20251103-111834-marostegui.json
- 11:18 brouberol@dns1004: START - running authdns-update
- 11:10 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-test-worker1001.eqiad.wmnet with OS bullseye
- 11:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T407997)', diff saved to https://phabricator.wikimedia.org/P84620 and previous config saved to /var/cache/conftool/dbconfig/20251103-110326-marostegui.json
- 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1229 (T407997)', diff saved to https://phabricator.wikimedia.org/P84619 and previous config saved to /var/cache/conftool/dbconfig/20251103-110111-marostegui.json
- 11:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 10:52 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
- 10:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 10:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T407997)', diff saved to https://phabricator.wikimedia.org/P84618 and previous config saved to /var/cache/conftool/dbconfig/20251103-105038-marostegui.json
- 10:46 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
- 10:44 marostegui@dns1006: END - running authdns-update
- 10:44 marostegui: Switch m3 (phabricator) proxy to dbproxy1028 T408956
- 10:44 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-worker1001.eqiad.wmnet with reason: host reimage
- 10:44 marostegui@dns1006: START - running authdns-update
- 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84617 and previous config saved to /var/cache/conftool/dbconfig/20251103-104152-root.json
- 10:38 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-worker1001.eqiad.wmnet with reason: host reimage
- 10:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20251103-103527-marostegui.json
- 10:26 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84616 and previous config saved to /var/cache/conftool/dbconfig/20251103-102645-root.json
- 10:22 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host an-test-worker1001.eqiad.wmnet with OS bullseye
- 10:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P84614 and previous config saved to /var/cache/conftool/dbconfig/20251103-102018-marostegui.json
- 10:17 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 60%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84612 and previous config saved to /var/cache/conftool/dbconfig/20251103-101138-root.json
- 10:07 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
- 10:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T407997)', diff saved to https://phabricator.wikimedia.org/P84611 and previous config saved to /var/cache/conftool/dbconfig/20251103-100511-marostegui.json
- 10:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1197 (T407997)', diff saved to https://phabricator.wikimedia.org/P84610 and previous config saved to /var/cache/conftool/dbconfig/20251103-100257-marostegui.json
- 10:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T407997)', diff saved to https://phabricator.wikimedia.org/P84609 and previous config saved to /var/cache/conftool/dbconfig/20251103-100233-marostegui.json
- 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 50%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84608 and previous config saved to /var/cache/conftool/dbconfig/20251103-095632-root.json
- 09:50 moritzm: installing intel-microcode security updates
- 09:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P84607 and previous config saved to /var/cache/conftool/dbconfig/20251103-094726-marostegui.json
- 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 30%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84606 and previous config saved to /var/cache/conftool/dbconfig/20251103-094126-root.json
- 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
- 09:40 elukey@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
- 09:39 elukey@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
- 09:38 elukey@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
- 09:37 elukey@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
- 09:35 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P84605 and previous config saved to /var/cache/conftool/dbconfig/20251103-093218-marostegui.json
- 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 25%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84604 and previous config saved to /var/cache/conftool/dbconfig/20251103-092618-root.json
- 09:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T407997)', diff saved to https://phabricator.wikimedia.org/P84603 and previous config saved to /var/cache/conftool/dbconfig/20251103-091708-marostegui.json
- 09:15 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
- 09:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1188 (T407997)', diff saved to https://phabricator.wikimedia.org/P84602 and previous config saved to /var/cache/conftool/dbconfig/20251103-091452-marostegui.json
- 09:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T407997)', diff saved to https://phabricator.wikimedia.org/P84601 and previous config saved to /var/cache/conftool/dbconfig/20251103-091435-marostegui.json
- 09:11 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 09:11 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 15%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84600 and previous config saved to /var/cache/conftool/dbconfig/20251103-091109-root.json
- 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1174.eqiad.wmnet onto db1231.eqiad.wmnet
- 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1174 gradually with 4 steps - Pool db1174.eqiad.wmnet in after cloning
- 09:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1174 gradually with 4 steps - Pool db1174.eqiad.wmnet in after cloning
- 09:06 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) db1174 gradually with 4 steps - Pool db1174.eqiad.wmnet in after cloning
- 09:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:00 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix uncommitted changes for mwdebug2002 - elukey@cumin1003"
- 08:59 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix uncommitted changes for mwdebug2002 - elukey@cumin1003"
- 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P84599 and previous config saved to /var/cache/conftool/dbconfig/20251103-085925-marostegui.json
- 08:56 elukey@cumin1003: START - Cookbook sre.dns.netbox
- 08:56 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 10%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84598 and previous config saved to /var/cache/conftool/dbconfig/20251103-085600-root.json
- 08:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P84596 and previous config saved to /var/cache/conftool/dbconfig/20251103-084417-marostegui.json
- 08:41 godog: silence wikitech-static icinga alert for a couple of weeks - T409029
- 08:40 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 5%: After moving it to s7', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20251103-084049-root.json
- 08:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T407997)', diff saved to https://phabricator.wikimedia.org/P84594 and previous config saved to /var/cache/conftool/dbconfig/20251103-082909-marostegui.json
- 08:25 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 1%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84593 and previous config saved to /var/cache/conftool/dbconfig/20251103-082543-root.json
- 08:20 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1174 gradually with 4 steps - Pool db1174.eqiad.wmnet in after cloning
- 08:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1182 (T407997)', diff saved to https://phabricator.wikimedia.org/P84591 and previous config saved to /var/cache/conftool/dbconfig/20251103-081238-marostegui.json
- 08:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 08:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T407997)', diff saved to https://phabricator.wikimedia.org/P84590 and previous config saved to /var/cache/conftool/dbconfig/20251103-081214-marostegui.json
- 07:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P84589 and previous config saved to /var/cache/conftool/dbconfig/20251103-075706-marostegui.json
- 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'db1177 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84588 and previous config saved to /var/cache/conftool/dbconfig/20251103-075130-root.json
- 07:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P84587 and previous config saved to /var/cache/conftool/dbconfig/20251103-074156-marostegui.json
- 07:36 marostegui@cumin1003: dbctl commit (dc=all): 'db1177 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84586 and previous config saved to /var/cache/conftool/dbconfig/20251103-073624-root.json
- 07:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T407997)', diff saved to https://phabricator.wikimedia.org/P84585 and previous config saved to /var/cache/conftool/dbconfig/20251103-072647-marostegui.json
- 07:25 marostegui@cumin1003: dbctl commit (dc=all): 'Remove es1034 from dbctl T409025', diff saved to https://phabricator.wikimedia.org/P84584 and previous config saved to /var/cache/conftool/dbconfig/20251103-072527-marostegui.json
- 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1162 (T407997)', diff saved to https://phabricator.wikimedia.org/P84583 and previous config saved to /var/cache/conftool/dbconfig/20251103-072431-marostegui.json
- 07:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T407997)', diff saved to https://phabricator.wikimedia.org/P84582 and previous config saved to /var/cache/conftool/dbconfig/20251103-072405-marostegui.json
- 07:23 marostegui@cumin1003: dbctl commit (dc=all): 'db2174 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84581 and previous config saved to /var/cache/conftool/dbconfig/20251103-072303-root.json
- 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84580 and previous config saved to /var/cache/conftool/dbconfig/20251103-072118-root.json
- 07:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P84579 and previous config saved to /var/cache/conftool/dbconfig/20251103-070853-marostegui.json
- 07:07 marostegui@cumin1003: dbctl commit (dc=all): 'db2174 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84578 and previous config saved to /var/cache/conftool/dbconfig/20251103-070753-root.json
- 07:06 marostegui@cumin1003: dbctl commit (dc=all): 'db1177 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84577 and previous config saved to /var/cache/conftool/dbconfig/20251103-070612-root.json
- 06:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1177 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84576 and previous config saved to /var/cache/conftool/dbconfig/20251103-065808-marostegui.json
- 06:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1177.eqiad.wmnet with reason: Maintenance
- 06:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P84575 and previous config saved to /var/cache/conftool/dbconfig/20251103-065346-marostegui.json
- 06:52 marostegui@cumin1003: dbctl commit (dc=all): 'db2174 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84574 and previous config saved to /var/cache/conftool/dbconfig/20251103-065248-root.json
- 06:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T407997)', diff saved to https://phabricator.wikimedia.org/P84573 and previous config saved to /var/cache/conftool/dbconfig/20251103-063838-marostegui.json
- 06:38 marostegui: Drop afl_ip related triggers from s2 T408780
- 06:37 marostegui@cumin1003: dbctl commit (dc=all): 'db2174 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84572 and previous config saved to /var/cache/conftool/dbconfig/20251103-063742-root.json
- 06:29 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2174 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84571 and previous config saved to /var/cache/conftool/dbconfig/20251103-062919-marostegui.json
- 06:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 06:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1156 (T407997)', diff saved to https://phabricator.wikimedia.org/P84570 and previous config saved to /var/cache/conftool/dbconfig/20251103-062603-marostegui.json
- 06:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 06:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1174 - Depool db1174.eqiad.wmnet to then clone it to db1231.eqiad.wmnet - marostegui@cumin1003
- 06:20 marostegui@cumin1003: START - Cookbook sre.mysql.depool db1174 - Depool db1174.eqiad.wmnet to then clone it to db1231.eqiad.wmnet - marostegui@cumin1003
- 06:20 marostegui@cumin1003: START - Cookbook sre.mysql.clone of db1174.eqiad.wmnet onto db1231.eqiad.wmnet
- 06:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db[1174,1231].eqiad.wmnet with reason: Moving db1231 to s7
- 06:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1231 T408829', diff saved to https://phabricator.wikimedia.org/P84568 and previous config saved to /var/cache/conftool/dbconfig/20251103-061906-marostegui.json
- 06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
- 04:07 musikanimal@deploy2002: Finished scap sync-world: Backport for AbstractRenderer: ensure OutputPage::setDisplayTitle() gets passed safe HTML (duration: 39m 55s)
- 03:53 musikanimal@deploy2002: musikanimal: Continuing with sync
- 03:52 musikanimal@deploy2002: musikanimal: Backport for AbstractRenderer: ensure OutputPage::setDisplayTitle() gets passed safe HTML synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 03:27 musikanimal@deploy2002: Started scap sync-world: Backport for AbstractRenderer: ensure OutputPage::setDisplayTitle() gets passed safe HTML
- 01:15 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 15m 04s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-02
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 14m 01s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-01
- 23:50 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1002-dev.eqiad.wmnet with OS trixie
- 22:27 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage
- 22:22 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage
- 22:10 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudbackup1002-dev.eqiad.wmnet with OS trixie
- 11:01 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudbackup1001-dev.eqiad.wmnet with OS trixie