Server Admin Log/Archive 99
2025-11-30
- 22:46 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 22:45 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 22:45 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 22:44 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 22:43 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 22:43 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 22:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 22:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86229 and previous config saved to /var/cache/conftool/dbconfig/20251130-222157-marostegui.json
- 22:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P86228 and previous config saved to /var/cache/conftool/dbconfig/20251130-220650-marostegui.json
- 21:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P86227 and previous config saved to /var/cache/conftool/dbconfig/20251130-215142-marostegui.json
- 21:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86226 and previous config saved to /var/cache/conftool/dbconfig/20251130-213634-marostegui.json
- 20:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1207 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86225 and previous config saved to /var/cache/conftool/dbconfig/20251130-201010-marostegui.json
- 20:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 20:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86224 and previous config saved to /var/cache/conftool/dbconfig/20251130-200947-marostegui.json
- 19:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P86223 and previous config saved to /var/cache/conftool/dbconfig/20251130-195439-marostegui.json
- 19:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P86222 and previous config saved to /var/cache/conftool/dbconfig/20251130-193931-marostegui.json
- 19:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86221 and previous config saved to /var/cache/conftool/dbconfig/20251130-192424-marostegui.json
- 17:45 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1200 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86220 and previous config saved to /var/cache/conftool/dbconfig/20251130-174456-marostegui.json
- 17:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 17:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86219 and previous config saved to /var/cache/conftool/dbconfig/20251130-174433-marostegui.json
- 17:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P86218 and previous config saved to /var/cache/conftool/dbconfig/20251130-172925-marostegui.json
- 17:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P86217 and previous config saved to /var/cache/conftool/dbconfig/20251130-171418-marostegui.json
- 16:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86216 and previous config saved to /var/cache/conftool/dbconfig/20251130-165910-marostegui.json
- 15:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T410531)', diff saved to https://phabricator.wikimedia.org/P86215 and previous config saved to /var/cache/conftool/dbconfig/20251130-154157-marostegui.json
- 15:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P86214 and previous config saved to /var/cache/conftool/dbconfig/20251130-152649-marostegui.json
- 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1185 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86213 and previous config saved to /var/cache/conftool/dbconfig/20251130-151841-marostegui.json
- 15:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86212 and previous config saved to /var/cache/conftool/dbconfig/20251130-151817-marostegui.json
- 15:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P86211 and previous config saved to /var/cache/conftool/dbconfig/20251130-151141-marostegui.json
- 15:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P86210 and previous config saved to /var/cache/conftool/dbconfig/20251130-150310-marostegui.json
- 14:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T410531)', diff saved to https://phabricator.wikimedia.org/P86209 and previous config saved to /var/cache/conftool/dbconfig/20251130-145634-marostegui.json
- 14:50 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2216 (T410531)', diff saved to https://phabricator.wikimedia.org/P86208 and previous config saved to /var/cache/conftool/dbconfig/20251130-145043-marostegui.json
- 14:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2216.codfw.wmnet with reason: Maintenance
- 14:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 (T410531)', diff saved to https://phabricator.wikimedia.org/P86207 and previous config saved to /var/cache/conftool/dbconfig/20251130-145020-marostegui.json
- 14:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P86206 and previous config saved to /var/cache/conftool/dbconfig/20251130-144802-marostegui.json
- 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P86205 and previous config saved to /var/cache/conftool/dbconfig/20251130-143513-marostegui.json
- 14:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86204 and previous config saved to /var/cache/conftool/dbconfig/20251130-143255-marostegui.json
- 14:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P86203 and previous config saved to /var/cache/conftool/dbconfig/20251130-142006-marostegui.json
- 14:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 (T410531)', diff saved to https://phabricator.wikimedia.org/P86202 and previous config saved to /var/cache/conftool/dbconfig/20251130-140458-marostegui.json
- 13:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2203 (T410531)', diff saved to https://phabricator.wikimedia.org/P86201 and previous config saved to /var/cache/conftool/dbconfig/20251130-135906-marostegui.json
- 13:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2203.codfw.wmnet with reason: Maintenance
- 13:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2202.codfw.wmnet with reason: Maintenance
- 13:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T410531)', diff saved to https://phabricator.wikimedia.org/P86200 and previous config saved to /var/cache/conftool/dbconfig/20251130-135435-marostegui.json
- 13:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P86199 and previous config saved to /var/cache/conftool/dbconfig/20251130-133928-marostegui.json
- 13:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P86198 and previous config saved to /var/cache/conftool/dbconfig/20251130-132420-marostegui.json
- 13:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T410531)', diff saved to https://phabricator.wikimedia.org/P86197 and previous config saved to /var/cache/conftool/dbconfig/20251130-130913-marostegui.json
- 13:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2188 (T410531)', diff saved to https://phabricator.wikimedia.org/P86196 and previous config saved to /var/cache/conftool/dbconfig/20251130-130321-marostegui.json
- 13:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2188.codfw.wmnet with reason: Maintenance
- 13:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T410531)', diff saved to https://phabricator.wikimedia.org/P86195 and previous config saved to /var/cache/conftool/dbconfig/20251130-130257-marostegui.json
- 12:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P86194 and previous config saved to /var/cache/conftool/dbconfig/20251130-124750-marostegui.json
- 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P86193 and previous config saved to /var/cache/conftool/dbconfig/20251130-123242-marostegui.json
- 12:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T410531)', diff saved to https://phabricator.wikimedia.org/P86192 and previous config saved to /var/cache/conftool/dbconfig/20251130-121734-marostegui.json
- 12:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2176 (T410531)', diff saved to https://phabricator.wikimedia.org/P86191 and previous config saved to /var/cache/conftool/dbconfig/20251130-121110-marostegui.json
- 12:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 12:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T410531)', diff saved to https://phabricator.wikimedia.org/P86190 and previous config saved to /var/cache/conftool/dbconfig/20251130-121046-marostegui.json
- 11:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P86189 and previous config saved to /var/cache/conftool/dbconfig/20251130-115539-marostegui.json
- 11:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P86188 and previous config saved to /var/cache/conftool/dbconfig/20251130-114031-marostegui.json
- 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T410531)', diff saved to https://phabricator.wikimedia.org/P86187 and previous config saved to /var/cache/conftool/dbconfig/20251130-112523-marostegui.json
- 11:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2174 (T410531)', diff saved to https://phabricator.wikimedia.org/P86186 and previous config saved to /var/cache/conftool/dbconfig/20251130-111857-marostegui.json
- 11:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 11:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T410531)', diff saved to https://phabricator.wikimedia.org/P86185 and previous config saved to /var/cache/conftool/dbconfig/20251130-111833-marostegui.json
- 11:07 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86184 and previous config saved to /var/cache/conftool/dbconfig/20251130-110739-marostegui.json
- 11:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 11:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 11:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86183 and previous config saved to /var/cache/conftool/dbconfig/20251130-110655-marostegui.json
- 11:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P86182 and previous config saved to /var/cache/conftool/dbconfig/20251130-110326-marostegui.json
- 10:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P86181 and previous config saved to /var/cache/conftool/dbconfig/20251130-105147-marostegui.json
- 10:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T410531)', diff saved to https://phabricator.wikimedia.org/P86178 and previous config saved to /var/cache/conftool/dbconfig/20251130-103311-marostegui.json
- 10:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2173 (T410531)', diff saved to https://phabricator.wikimedia.org/P86177 and previous config saved to /var/cache/conftool/dbconfig/20251130-102644-marostegui.json
- 10:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 10:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T410531)', diff saved to https://phabricator.wikimedia.org/P86176 and previous config saved to /var/cache/conftool/dbconfig/20251130-102620-marostegui.json
- 10:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86175 and previous config saved to /var/cache/conftool/dbconfig/20251130-102132-marostegui.json
- 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P86174 and previous config saved to /var/cache/conftool/dbconfig/20251130-101113-marostegui.json
- 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P86173 and previous config saved to /var/cache/conftool/dbconfig/20251130-095605-marostegui.json
- 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T410531)', diff saved to https://phabricator.wikimedia.org/P86172 and previous config saved to /var/cache/conftool/dbconfig/20251130-094058-marostegui.json
- 09:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2170 (T410531)', diff saved to https://phabricator.wikimedia.org/P86171 and previous config saved to /var/cache/conftool/dbconfig/20251130-093438-marostegui.json
- 09:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 09:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T410531)', diff saved to https://phabricator.wikimedia.org/P86170 and previous config saved to /var/cache/conftool/dbconfig/20251130-093414-marostegui.json
- 09:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P86169 and previous config saved to /var/cache/conftool/dbconfig/20251130-091906-marostegui.json
- 09:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P86168 and previous config saved to /var/cache/conftool/dbconfig/20251130-090358-marostegui.json
- 08:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T410531)', diff saved to https://phabricator.wikimedia.org/P86167 and previous config saved to /var/cache/conftool/dbconfig/20251130-084851-marostegui.json
- 08:42 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2153 (T410531)', diff saved to https://phabricator.wikimedia.org/P86166 and previous config saved to /var/cache/conftool/dbconfig/20251130-084229-marostegui.json
- 08:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 08:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T410531)', diff saved to https://phabricator.wikimedia.org/P86165 and previous config saved to /var/cache/conftool/dbconfig/20251130-084205-marostegui.json
- 08:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P86164 and previous config saved to /var/cache/conftool/dbconfig/20251130-082657-marostegui.json
- 08:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P86163 and previous config saved to /var/cache/conftool/dbconfig/20251130-081150-marostegui.json
- 07:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T410531)', diff saved to https://phabricator.wikimedia.org/P86162 and previous config saved to /var/cache/conftool/dbconfig/20251130-075642-marostegui.json
- 07:50 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2146 (T410531)', diff saved to https://phabricator.wikimedia.org/P86161 and previous config saved to /var/cache/conftool/dbconfig/20251130-075017-marostegui.json
- 07:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 07:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T410531)', diff saved to https://phabricator.wikimedia.org/P86160 and previous config saved to /var/cache/conftool/dbconfig/20251130-074953-marostegui.json
- 07:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P86159 and previous config saved to /var/cache/conftool/dbconfig/20251130-073445-marostegui.json
- 07:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P86158 and previous config saved to /var/cache/conftool/dbconfig/20251130-071938-marostegui.json
- 07:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T410531)', diff saved to https://phabricator.wikimedia.org/P86157 and previous config saved to /var/cache/conftool/dbconfig/20251130-070430-marostegui.json
- 06:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2145 (T410531)', diff saved to https://phabricator.wikimedia.org/P86156 and previous config saved to /var/cache/conftool/dbconfig/20251130-065805-marostegui.json
- 06:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86155 and previous config saved to /var/cache/conftool/dbconfig/20251130-065526-marostegui.json
- 06:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1159.eqiad.wmnet with reason: Maintenance
- 06:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 03:48 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 03:47 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T410589)', diff saved to https://phabricator.wikimedia.org/P86154 and previous config saved to /var/cache/conftool/dbconfig/20251130-034752-ladsgroup.json
- 03:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P86153 and previous config saved to /var/cache/conftool/dbconfig/20251130-033244-ladsgroup.json
- 03:17 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P86152 and previous config saved to /var/cache/conftool/dbconfig/20251130-031735-ladsgroup.json
- 03:02 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T410589)', diff saved to https://phabricator.wikimedia.org/P86151 and previous config saved to /var/cache/conftool/dbconfig/20251130-030228-ladsgroup.json
- 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 12m 22s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-29
- 22:52 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1032.eqiad.wmnet with OS trixie
- 22:52 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 22:51 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 22:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
- 22:29 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1028.eqiad.wmnet with OS bookworm
- 22:29 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 22:27 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
- 22:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 22:10 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS trixie
- 22:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1032.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:07 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1032.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:03 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1028.eqiad.wmnet with reason: host reimage
- 21:57 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1028.eqiad.wmnet with reason: host reimage
- 21:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1032.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1032.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:52 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1028.eqiad.wmnet with OS bookworm
- 20:48 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:47 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:34 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1032.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 16:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86150 and previous config saved to /var/cache/conftool/dbconfig/20251129-162720-marostegui.json
- 16:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P86149 and previous config saved to /var/cache/conftool/dbconfig/20251129-161213-marostegui.json
- 15:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P86148 and previous config saved to /var/cache/conftool/dbconfig/20251129-155706-marostegui.json
- 15:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86147 and previous config saved to /var/cache/conftool/dbconfig/20251129-154158-marostegui.json
- 13:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1259 (T410589)', diff saved to https://phabricator.wikimedia.org/P86146 and previous config saved to /var/cache/conftool/dbconfig/20251129-133810-ladsgroup.json
- 13:38 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1259.eqiad.wmnet with reason: Maintenance
- 13:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T410589)', diff saved to https://phabricator.wikimedia.org/P86145 and previous config saved to /var/cache/conftool/dbconfig/20251129-133748-ladsgroup.json
- 13:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P86144 and previous config saved to /var/cache/conftool/dbconfig/20251129-132240-ladsgroup.json
- 13:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P86143 and previous config saved to /var/cache/conftool/dbconfig/20251129-130732-ladsgroup.json
- 12:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T410589)', diff saved to https://phabricator.wikimedia.org/P86142 and previous config saved to /var/cache/conftool/dbconfig/20251129-125225-ladsgroup.json
- 11:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2229 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86141 and previous config saved to /var/cache/conftool/dbconfig/20251129-115552-marostegui.json
- 11:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
- 11:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86140 and previous config saved to /var/cache/conftool/dbconfig/20251129-115528-marostegui.json
- 11:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P86139 and previous config saved to /var/cache/conftool/dbconfig/20251129-114020-marostegui.json
- 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P86138 and previous config saved to /var/cache/conftool/dbconfig/20251129-112513-marostegui.json
- 11:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86137 and previous config saved to /var/cache/conftool/dbconfig/20251129-111005-marostegui.json
- 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2224 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86136 and previous config saved to /var/cache/conftool/dbconfig/20251129-072407-marostegui.json
- 07:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2224.codfw.wmnet with reason: Maintenance
- 07:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86135 and previous config saved to /var/cache/conftool/dbconfig/20251129-072344-marostegui.json
- 07:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P86134 and previous config saved to /var/cache/conftool/dbconfig/20251129-070836-marostegui.json
- 06:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P86133 and previous config saved to /var/cache/conftool/dbconfig/20251129-065329-marostegui.json
- 06:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86132 and previous config saved to /var/cache/conftool/dbconfig/20251129-063822-marostegui.json
- 02:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2217 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86131 and previous config saved to /var/cache/conftool/dbconfig/20251129-025500-marostegui.json
- 02:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2217.codfw.wmnet with reason: Maintenance
- 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 12m 46s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:26 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1254 (T410589)', diff saved to https://phabricator.wikimedia.org/P86130 and previous config saved to /var/cache/conftool/dbconfig/20251129-002650-ladsgroup.json
- 00:26 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1254.eqiad.wmnet with reason: Maintenance
2025-11-28
- 22:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2197.codfw.wmnet with reason: Maintenance
- 22:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86129 and previous config saved to /var/cache/conftool/dbconfig/20251128-225427-marostegui.json
- 22:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P86128 and previous config saved to /var/cache/conftool/dbconfig/20251128-223920-marostegui.json
- 22:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P86127 and previous config saved to /var/cache/conftool/dbconfig/20251128-222412-marostegui.json
- 22:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86126 and previous config saved to /var/cache/conftool/dbconfig/20251128-220904-marostegui.json
- 21:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2193 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86125 and previous config saved to /var/cache/conftool/dbconfig/20251128-211632-marostegui.json
- 21:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2193.codfw.wmnet with reason: Maintenance
- 21:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86124 and previous config saved to /var/cache/conftool/dbconfig/20251128-211608-marostegui.json
- 21:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P86123 and previous config saved to /var/cache/conftool/dbconfig/20251128-210100-marostegui.json
- 20:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P86122 and previous config saved to /var/cache/conftool/dbconfig/20251128-204552-marostegui.json
- 20:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86121 and previous config saved to /var/cache/conftool/dbconfig/20251128-203045-marostegui.json
- 19:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1028.eqiad.wmnet with OS bookworm
- 19:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2180 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86120 and previous config saved to /var/cache/conftool/dbconfig/20251128-192843-marostegui.json
- 19:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2180.codfw.wmnet with reason: Maintenance
- 19:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86119 and previous config saved to /var/cache/conftool/dbconfig/20251128-192818-marostegui.json
- 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P86118 and previous config saved to /var/cache/conftool/dbconfig/20251128-191311-marostegui.json
- 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P86117 and previous config saved to /var/cache/conftool/dbconfig/20251128-185803-marostegui.json
- 18:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86116 and previous config saved to /var/cache/conftool/dbconfig/20251128-184256-marostegui.json
- 18:38 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1032.eqiad.wmnet with OS bookworm
- 18:20 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1028.eqiad.wmnet with OS bookworm
- 18:16 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1032.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:14 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:14 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1032.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:11 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:02 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1031.eqiad.wmnet with OS bookworm
- 17:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1030.eqiad.wmnet with OS bookworm
- 17:49 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1029.eqiad.wmnet with OS bookworm
- 17:49 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 17:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 17:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
- 17:39 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
- 17:37 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
- 17:32 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
- 17:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 17:22 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 17:18 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1031.eqiad.wmnet with OS bookworm
- 17:18 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS bookworm
- 17:13 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1030.eqiad.wmnet with OS bookworm
- 17:04 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS bookworm
- 17:04 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1028.eqiad.wmnet with OS trixie
- 16:17 tappof: Added 100 GB to /srv LV on titan1001/1002/2002 (T410152)
- 15:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 15:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T410531)', diff saved to https://phabricator.wikimedia.org/P86114 and previous config saved to /var/cache/conftool/dbconfig/20251128-152153-marostegui.json
- 15:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P86113 and previous config saved to /var/cache/conftool/dbconfig/20251128-150646-marostegui.json
- 14:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P86112 and previous config saved to /var/cache/conftool/dbconfig/20251128-145138-marostegui.json
- 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T410531)', diff saved to https://phabricator.wikimedia.org/P86111 and previous config saved to /var/cache/conftool/dbconfig/20251128-143631-marostegui.json
- 14:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1251 (T410531)', diff saved to https://phabricator.wikimedia.org/P86110 and previous config saved to /var/cache/conftool/dbconfig/20251128-143029-marostegui.json
- 14:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1251.eqiad.wmnet with reason: Maintenance
- 14:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2169 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86109 and previous config saved to /var/cache/conftool/dbconfig/20251128-142748-marostegui.json
- 14:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 14:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86108 and previous config saved to /var/cache/conftool/dbconfig/20251128-142725-marostegui.json
- 14:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 14:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 14:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T410531)', diff saved to https://phabricator.wikimedia.org/P86107 and previous config saved to /var/cache/conftool/dbconfig/20251128-142133-marostegui.json
- 14:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P86106 and previous config saved to /var/cache/conftool/dbconfig/20251128-141217-marostegui.json
- 14:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P86105 and previous config saved to /var/cache/conftool/dbconfig/20251128-140626-marostegui.json
- 13:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P86104 and previous config saved to /var/cache/conftool/dbconfig/20251128-135710-marostegui.json
- 13:56 hashar@deploy2002: Finished deploy [integration/docroot@607a959]: build: Updating eslint-config-wikimedia to 0.32.2 (duration: 00m 11s)
- 13:56 hashar@deploy2002: Started deploy [integration/docroot@607a959]: build: Updating eslint-config-wikimedia to 0.32.2
- 13:52 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1028.eqiad.wmnet with OS bookworm
- 13:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P86103 and previous config saved to /var/cache/conftool/dbconfig/20251128-135119-marostegui.json
- 13:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86102 and previous config saved to /var/cache/conftool/dbconfig/20251128-134202-marostegui.json
- 13:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T410531)', diff saved to https://phabricator.wikimedia.org/P86101 and previous config saved to /var/cache/conftool/dbconfig/20251128-133610-marostegui.json
- 13:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1235 (T410531)', diff saved to https://phabricator.wikimedia.org/P86100 and previous config saved to /var/cache/conftool/dbconfig/20251128-133006-marostegui.json
- 13:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
- 13:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T410531)', diff saved to https://phabricator.wikimedia.org/P86099 and previous config saved to /var/cache/conftool/dbconfig/20251128-132943-marostegui.json
- 13:29 ladsgroup@deploy2002: Finished scap sync-world: Backport for Revert "Deploy 2025 Global Readers Survey (non-enwiki)" (T410696) (duration: 06m 30s)
- 13:25 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS trixie
- 13:25 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 13:25 ladsgroup@deploy2002: ladsgroup: Backport for Revert "Deploy 2025 Global Readers Survey (non-enwiki)" (T410696) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:23 ladsgroup@deploy2002: Started scap sync-world: Backport for Revert "Deploy 2025 Global Readers Survey (non-enwiki)" (T410696)
- 13:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P86098 and previous config saved to /var/cache/conftool/dbconfig/20251128-131435-marostegui.json
- 12:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P86097 and previous config saved to /var/cache/conftool/dbconfig/20251128-125928-marostegui.json
- 12:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T410531)', diff saved to https://phabricator.wikimedia.org/P86096 and previous config saved to /var/cache/conftool/dbconfig/20251128-124420-marostegui.json
- 12:39 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 12:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T410589)', diff saved to https://phabricator.wikimedia.org/P86095 and previous config saved to /var/cache/conftool/dbconfig/20251128-123902-ladsgroup.json
- 12:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1234 (T410531)', diff saved to https://phabricator.wikimedia.org/P86094 and previous config saved to /var/cache/conftool/dbconfig/20251128-123825-marostegui.json
- 12:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1234.eqiad.wmnet with reason: Maintenance
- 12:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T410531)', diff saved to https://phabricator.wikimedia.org/P86093 and previous config saved to /var/cache/conftool/dbconfig/20251128-123801-marostegui.json
- 12:23 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P86092 and previous config saved to /var/cache/conftool/dbconfig/20251128-122354-ladsgroup.json
- 12:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P86091 and previous config saved to /var/cache/conftool/dbconfig/20251128-122253-marostegui.json
- 12:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P86090 and previous config saved to /var/cache/conftool/dbconfig/20251128-120847-ladsgroup.json
- 12:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P86089 and previous config saved to /var/cache/conftool/dbconfig/20251128-120746-marostegui.json
- 11:53 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T410589)', diff saved to https://phabricator.wikimedia.org/P86088 and previous config saved to /var/cache/conftool/dbconfig/20251128-115340-ladsgroup.json
- 11:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T410531)', diff saved to https://phabricator.wikimedia.org/P86087 and previous config saved to /var/cache/conftool/dbconfig/20251128-115238-marostegui.json
- 11:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1232 (T410531)', diff saved to https://phabricator.wikimedia.org/P86086 and previous config saved to /var/cache/conftool/dbconfig/20251128-114640-marostegui.json
- 11:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1232.eqiad.wmnet with reason: Maintenance
- 11:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T410531)', diff saved to https://phabricator.wikimedia.org/P86085 and previous config saved to /var/cache/conftool/dbconfig/20251128-114615-marostegui.json
- 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P86084 and previous config saved to /var/cache/conftool/dbconfig/20251128-113107-marostegui.json
- 11:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P86083 and previous config saved to /var/cache/conftool/dbconfig/20251128-111600-marostegui.json
- 11:15 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T410531)', diff saved to https://phabricator.wikimedia.org/P86082 and previous config saved to /var/cache/conftool/dbconfig/20251128-110052-marostegui.json
- 10:57 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2239.codfw.wmnet with reason: Upgrade and reboot
- 10:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1219 (T410531)', diff saved to https://phabricator.wikimedia.org/P86081 and previous config saved to /var/cache/conftool/dbconfig/20251128-105451-marostegui.json
- 10:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 10:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T410531)', diff saved to https://phabricator.wikimedia.org/P86080 and previous config saved to /var/cache/conftool/dbconfig/20251128-105427-marostegui.json
- 10:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P86079 and previous config saved to /var/cache/conftool/dbconfig/20251128-103920-marostegui.json
- 10:39 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 10:38 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 10:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 10:26 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 10:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P86078 and previous config saved to /var/cache/conftool/dbconfig/20251128-102412-marostegui.json
- 10:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T410531)', diff saved to https://phabricator.wikimedia.org/P86077 and previous config saved to /var/cache/conftool/dbconfig/20251128-100905-marostegui.json
- 10:05 bwojtowicz@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 10:04 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1001.eqiad.wmnet with OS trixie
- 10:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1218 (T410531)', diff saved to https://phabricator.wikimedia.org/P86076 and previous config saved to /var/cache/conftool/dbconfig/20251128-100258-marostegui.json
- 10:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T410531)', diff saved to https://phabricator.wikimedia.org/P86075 and previous config saved to /var/cache/conftool/dbconfig/20251128-100234-marostegui.json
- 10:02 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 09:48 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: host reimage
- 09:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P86074 and previous config saved to /var/cache/conftool/dbconfig/20251128-094727-marostegui.json
- 09:44 klausman@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: host reimage
- 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P86073 and previous config saved to /var/cache/conftool/dbconfig/20251128-093219-marostegui.json
- 09:27 klausman@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1001.eqiad.wmnet with OS trixie
- 09:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2158 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86072 and previous config saved to /var/cache/conftool/dbconfig/20251128-092341-marostegui.json
- 09:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 09:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86071 and previous config saved to /var/cache/conftool/dbconfig/20251128-092318-marostegui.json
- 09:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T410531)', diff saved to https://phabricator.wikimedia.org/P86070 and previous config saved to /var/cache/conftool/dbconfig/20251128-091712-marostegui.json
- 09:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1206 (T410531)', diff saved to https://phabricator.wikimedia.org/P86069 and previous config saved to /var/cache/conftool/dbconfig/20251128-091116-marostegui.json
- 09:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T410531)', diff saved to https://phabricator.wikimedia.org/P86068 and previous config saved to /var/cache/conftool/dbconfig/20251128-091052-marostegui.json
- 09:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P86067 and previous config saved to /var/cache/conftool/dbconfig/20251128-090810-marostegui.json
- 08:59 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2239.codfw.wmnet with reason: Upgrade and reboot
- 08:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P86066 and previous config saved to /var/cache/conftool/dbconfig/20251128-085544-marostegui.json
- 08:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P86065 and previous config saved to /var/cache/conftool/dbconfig/20251128-085303-marostegui.json
- 08:50 brouberol@dns1004: END - running authdns-update
- 08:49 brouberol@dns1004: START - running authdns-update
- 08:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P86064 and previous config saved to /var/cache/conftool/dbconfig/20251128-084037-marostegui.json
- 08:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86063 and previous config saved to /var/cache/conftool/dbconfig/20251128-083755-marostegui.json
- 08:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T410531)', diff saved to https://phabricator.wikimedia.org/P86062 and previous config saved to /var/cache/conftool/dbconfig/20251128-082529-marostegui.json
- 08:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1196 (T410531)', diff saved to https://phabricator.wikimedia.org/P86061 and previous config saved to /var/cache/conftool/dbconfig/20251128-081852-marostegui.json
- 08:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 08:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 08:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T410531)', diff saved to https://phabricator.wikimedia.org/P86060 and previous config saved to /var/cache/conftool/dbconfig/20251128-081820-marostegui.json
- 08:08 moritzm: installing Linux 6.1.158 kernel on Bookworm hosts
- 08:05 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade gitlab
- 08:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P86059 and previous config saved to /var/cache/conftool/dbconfig/20251128-080312-marostegui.json
- 07:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P86058 and previous config saved to /var/cache/conftool/dbconfig/20251128-074804-marostegui.json
- 07:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T410531)', diff saved to https://phabricator.wikimedia.org/P86057 and previous config saved to /var/cache/conftool/dbconfig/20251128-073257-marostegui.json
- 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1184 gradually with 4 steps - After testing
- 07:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1195.eqiad.wmnet with reason: Maintenance
- 07:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T410531)', diff saved to https://phabricator.wikimedia.org/P86055 and previous config saved to /var/cache/conftool/dbconfig/20251128-072551-marostegui.json
- 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P86053 and previous config saved to /var/cache/conftool/dbconfig/20251128-071043-marostegui.json
- 06:57 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade gitlab
- 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P86051 and previous config saved to /var/cache/conftool/dbconfig/20251128-065536-marostegui.json
- 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1184 gradually with 4 steps - After testing
- 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T410531)', diff saved to https://phabricator.wikimedia.org/P86049 and previous config saved to /var/cache/conftool/dbconfig/20251128-064028-marostegui.json
- 06:33 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1186 (T410531)', diff saved to https://phabricator.wikimedia.org/P86048 and previous config saved to /var/cache/conftool/dbconfig/20251128-063341-marostegui.json
- 06:33 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2175 gradually with 4 steps - After schema change
- 06:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance
- 06:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2204.codfw.wmnet with reason: Maintenance
- 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2175 gradually with 4 steps - After schema change
- 05:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T410531)', diff saved to https://phabricator.wikimedia.org/P86044 and previous config saved to /var/cache/conftool/dbconfig/20251128-055303-marostegui.json
- 05:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2175 (T410531)', diff saved to https://phabricator.wikimedia.org/P86043 and previous config saved to /var/cache/conftool/dbconfig/20251128-054641-marostegui.json
- 05:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 05:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T410531)', diff saved to https://phabricator.wikimedia.org/P86042 and previous config saved to /var/cache/conftool/dbconfig/20251128-054628-marostegui.json
- 05:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db[1155-1156].eqiad.wmnet with reason: Schema change
- 05:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P86041 and previous config saved to /var/cache/conftool/dbconfig/20251128-053121-marostegui.json
- 05:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P86040 and previous config saved to /var/cache/conftool/dbconfig/20251128-051613-marostegui.json
- 05:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T410531)', diff saved to https://phabricator.wikimedia.org/P86039 and previous config saved to /var/cache/conftool/dbconfig/20251128-050106-marostegui.json
- 04:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2148 (T410531)', diff saved to https://phabricator.wikimedia.org/P86038 and previous config saved to /var/cache/conftool/dbconfig/20251128-045437-marostegui.json
- 04:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 04:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 04:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 04:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 04:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 04:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T410531)', diff saved to https://phabricator.wikimedia.org/P86037 and previous config saved to /var/cache/conftool/dbconfig/20251128-043018-marostegui.json
- 04:22 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2151 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86036 and previous config saved to /var/cache/conftool/dbconfig/20251128-042254-marostegui.json
- 04:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 04:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P86035 and previous config saved to /var/cache/conftool/dbconfig/20251128-041511-marostegui.json
- 04:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P86034 and previous config saved to /var/cache/conftool/dbconfig/20251128-040004-marostegui.json
- 03:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T410531)', diff saved to https://phabricator.wikimedia.org/P86033 and previous config saved to /var/cache/conftool/dbconfig/20251128-034457-marostegui.json
- 03:29 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1263 (T410531)', diff saved to https://phabricator.wikimedia.org/P86032 and previous config saved to /var/cache/conftool/dbconfig/20251128-032949-marostegui.json
- 03:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1263.eqiad.wmnet with reason: Maintenance
- 03:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (T410531)', diff saved to https://phabricator.wikimedia.org/P86031 and previous config saved to /var/cache/conftool/dbconfig/20251128-032926-marostegui.json
- 03:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P86030 and previous config saved to /var/cache/conftool/dbconfig/20251128-031418-marostegui.json
- 02:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P86029 and previous config saved to /var/cache/conftool/dbconfig/20251128-025911-marostegui.json
- 02:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (T410531)', diff saved to https://phabricator.wikimedia.org/P86028 and previous config saved to /var/cache/conftool/dbconfig/20251128-024403-marostegui.json
- 02:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1262 (T410531)', diff saved to https://phabricator.wikimedia.org/P86027 and previous config saved to /var/cache/conftool/dbconfig/20251128-022801-marostegui.json
- 02:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1262.eqiad.wmnet with reason: Maintenance
- 02:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T410531)', diff saved to https://phabricator.wikimedia.org/P86026 and previous config saved to /var/cache/conftool/dbconfig/20251128-022737-marostegui.json
- 02:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P86025 and previous config saved to /var/cache/conftool/dbconfig/20251128-021229-marostegui.json
- 01:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P86024 and previous config saved to /var/cache/conftool/dbconfig/20251128-015722-marostegui.json
- 01:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T410531)', diff saved to https://phabricator.wikimedia.org/P86023 and previous config saved to /var/cache/conftool/dbconfig/20251128-014214-marostegui.json
- 01:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 (T410531)', diff saved to https://phabricator.wikimedia.org/P86022 and previous config saved to /var/cache/conftool/dbconfig/20251128-013423-marostegui.json
- 01:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1261 (T410531)', diff saved to https://phabricator.wikimedia.org/P86021 and previous config saved to /var/cache/conftool/dbconfig/20251128-012633-marostegui.json
- 01:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1261.eqiad.wmnet with reason: Maintenance
- 01:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T410531)', diff saved to https://phabricator.wikimedia.org/P86020 and previous config saved to /var/cache/conftool/dbconfig/20251128-012608-marostegui.json
- 01:23 tstarling@deploy2002: Finished scap sync-world: Backport for Fix accidentally removed stylesheet (T279736) (duration: 09m 33s)
- 01:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P86019 and previous config saved to /var/cache/conftool/dbconfig/20251128-011916-marostegui.json
- 01:18 tstarling@deploy2002: tstarling: Continuing with sync
- 01:16 tstarling@deploy2002: tstarling: Backport for Fix accidentally removed stylesheet (T279736) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 01:14 tstarling@deploy2002: Started scap sync-world: Backport for Fix accidentally removed stylesheet (T279736)
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 15s)
- 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P86018 and previous config saved to /var/cache/conftool/dbconfig/20251128-011101-marostegui.json
- 01:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P86017 and previous config saved to /var/cache/conftool/dbconfig/20251128-010408-marostegui.json
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P86016 and previous config saved to /var/cache/conftool/dbconfig/20251128-005553-marostegui.json
- 00:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 (T410531)', diff saved to https://phabricator.wikimedia.org/P86015 and previous config saved to /var/cache/conftool/dbconfig/20251128-004901-marostegui.json
- 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T410531)', diff saved to https://phabricator.wikimedia.org/P86014 and previous config saved to /var/cache/conftool/dbconfig/20251128-004046-marostegui.json
- 00:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2248 (T410531)', diff saved to https://phabricator.wikimedia.org/P86013 and previous config saved to /var/cache/conftool/dbconfig/20251128-003231-marostegui.json
- 00:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2248.codfw.wmnet with reason: Maintenance
- 00:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 (T410531)', diff saved to https://phabricator.wikimedia.org/P86012 and previous config saved to /var/cache/conftool/dbconfig/20251128-003206-marostegui.json
- 00:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1260 (T410531)', diff saved to https://phabricator.wikimedia.org/P86011 and previous config saved to /var/cache/conftool/dbconfig/20251128-002134-marostegui.json
- 00:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1260.eqiad.wmnet with reason: Maintenance
- 00:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T410531)', diff saved to https://phabricator.wikimedia.org/P86010 and previous config saved to /var/cache/conftool/dbconfig/20251128-002111-marostegui.json
- 00:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P86009 and previous config saved to /var/cache/conftool/dbconfig/20251128-001658-marostegui.json
- 00:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P86008 and previous config saved to /var/cache/conftool/dbconfig/20251128-000604-marostegui.json
- 00:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P86007 and previous config saved to /var/cache/conftool/dbconfig/20251128-000151-marostegui.json
2025-11-27
- 23:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P86006 and previous config saved to /var/cache/conftool/dbconfig/20251127-235056-marostegui.json
- 23:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 (T410531)', diff saved to https://phabricator.wikimedia.org/P86005 and previous config saved to /var/cache/conftool/dbconfig/20251127-234643-marostegui.json
- 23:37 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T410531)', diff saved to https://phabricator.wikimedia.org/P86004 and previous config saved to /var/cache/conftool/dbconfig/20251127-233549-marostegui.json
- 23:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2247 (T410531)', diff saved to https://phabricator.wikimedia.org/P86003 and previous config saved to /var/cache/conftool/dbconfig/20251127-233005-marostegui.json
- 23:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2247.codfw.wmnet with reason: Maintenance
- 23:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T410531)', diff saved to https://phabricator.wikimedia.org/P86002 and previous config saved to /var/cache/conftool/dbconfig/20251127-232941-marostegui.json
- 23:27 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1233 (T410589)', diff saved to https://phabricator.wikimedia.org/P86001 and previous config saved to /var/cache/conftool/dbconfig/20251127-232755-ladsgroup.json
- 23:27 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 23:27 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T410589)', diff saved to https://phabricator.wikimedia.org/P86000 and previous config saved to /var/cache/conftool/dbconfig/20251127-232731-ladsgroup.json
- 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1252 (T410531)', diff saved to https://phabricator.wikimedia.org/P85999 and previous config saved to /var/cache/conftool/dbconfig/20251127-232005-marostegui.json
- 23:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1252.eqiad.wmnet with reason: Maintenance
- 23:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T410531)', diff saved to https://phabricator.wikimedia.org/P85998 and previous config saved to /var/cache/conftool/dbconfig/20251127-231941-marostegui.json
- 23:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P85997 and previous config saved to /var/cache/conftool/dbconfig/20251127-231433-marostegui.json
- 23:12 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P85996 and previous config saved to /var/cache/conftool/dbconfig/20251127-231224-ladsgroup.json
- 23:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P85995 and previous config saved to /var/cache/conftool/dbconfig/20251127-230433-marostegui.json
- 22:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P85994 and previous config saved to /var/cache/conftool/dbconfig/20251127-225926-marostegui.json
- 22:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P85993 and previous config saved to /var/cache/conftool/dbconfig/20251127-225716-ladsgroup.json
- 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P85992 and previous config saved to /var/cache/conftool/dbconfig/20251127-224926-marostegui.json
- 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T410531)', diff saved to https://phabricator.wikimedia.org/P85991 and previous config saved to /var/cache/conftool/dbconfig/20251127-224418-marostegui.json
- 22:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T410589)', diff saved to https://phabricator.wikimedia.org/P85990 and previous config saved to /var/cache/conftool/dbconfig/20251127-224208-ladsgroup.json
- 22:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T410531)', diff saved to https://phabricator.wikimedia.org/P85989 and previous config saved to /var/cache/conftool/dbconfig/20251127-223418-marostegui.json
- 22:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2246 (T410531)', diff saved to https://phabricator.wikimedia.org/P85988 and previous config saved to /var/cache/conftool/dbconfig/20251127-222658-marostegui.json
- 22:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2246.codfw.wmnet with reason: Maintenance
- 22:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T410531)', diff saved to https://phabricator.wikimedia.org/P85987 and previous config saved to /var/cache/conftool/dbconfig/20251127-222635-marostegui.json
- 22:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1249 (T410531)', diff saved to https://phabricator.wikimedia.org/P85986 and previous config saved to /var/cache/conftool/dbconfig/20251127-221806-marostegui.json
- 22:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1249.eqiad.wmnet with reason: Maintenance
- 22:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T410531)', diff saved to https://phabricator.wikimedia.org/P85985 and previous config saved to /var/cache/conftool/dbconfig/20251127-221742-marostegui.json
- 22:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P85984 and previous config saved to /var/cache/conftool/dbconfig/20251127-221127-marostegui.json
- 22:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P85983 and previous config saved to /var/cache/conftool/dbconfig/20251127-220235-marostegui.json
- 21:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P85982 and previous config saved to /var/cache/conftool/dbconfig/20251127-215619-marostegui.json
- 21:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P85981 and previous config saved to /var/cache/conftool/dbconfig/20251127-214726-marostegui.json
- 21:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T410531)', diff saved to https://phabricator.wikimedia.org/P85980 and previous config saved to /var/cache/conftool/dbconfig/20251127-214112-marostegui.json
- 21:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T410531)', diff saved to https://phabricator.wikimedia.org/P85979 and previous config saved to /var/cache/conftool/dbconfig/20251127-213218-marostegui.json
- 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2245 (T410531)', diff saved to https://phabricator.wikimedia.org/P85978 and previous config saved to /var/cache/conftool/dbconfig/20251127-212345-marostegui.json
- 21:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2245.codfw.wmnet with reason: Maintenance
- 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 (T410531)', diff saved to https://phabricator.wikimedia.org/P85977 and previous config saved to /var/cache/conftool/dbconfig/20251127-212322-marostegui.json
- 21:21 dani@deploy2002: Finished scap sync-world: Backport for Undeploy 2025 Global Readers Survey on enwiki (T410696), Deploy experiment for 2025 Global Readers Survey (T410696), Deploy 2025 Global Readers Survey (non-enwiki) (T410696) (duration: 15m 34s)
- 21:16 dani@deploy2002: dani: Continuing with sync
- 21:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1248 (T410531)', diff saved to https://phabricator.wikimedia.org/P85976 and previous config saved to /var/cache/conftool/dbconfig/20251127-211610-marostegui.json
- 21:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1248.eqiad.wmnet with reason: Maintenance
- 21:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T410531)', diff saved to https://phabricator.wikimedia.org/P85975 and previous config saved to /var/cache/conftool/dbconfig/20251127-211547-marostegui.json
- 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P85974 and previous config saved to /var/cache/conftool/dbconfig/20251127-210814-marostegui.json
- 21:07 dani@deploy2002: dani: Backport for Undeploy 2025 Global Readers Survey on enwiki (T410696), Deploy experiment for 2025 Global Readers Survey (T410696), Deploy 2025 Global Readers Survey (non-enwiki) (T410696) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:05 dani@deploy2002: Started scap sync-world: Backport for Undeploy 2025 Global Readers Survey on enwiki (T410696), Deploy experiment for 2025 Global Readers Survey (T410696), Deploy 2025 Global Readers Survey (non-enwiki) (T410696)
- 21:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P85973 and previous config saved to /var/cache/conftool/dbconfig/20251127-210039-marostegui.json
- 20:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P85972 and previous config saved to /var/cache/conftool/dbconfig/20251127-205307-marostegui.json
- 20:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P85971 and previous config saved to /var/cache/conftool/dbconfig/20251127-204532-marostegui.json
- 20:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 (T410531)', diff saved to https://phabricator.wikimedia.org/P85970 and previous config saved to /var/cache/conftool/dbconfig/20251127-203759-marostegui.json
- 20:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T410531)', diff saved to https://phabricator.wikimedia.org/P85969 and previous config saved to /var/cache/conftool/dbconfig/20251127-203024-marostegui.json
- 20:20 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2240 (T410531)', diff saved to https://phabricator.wikimedia.org/P85968 and previous config saved to /var/cache/conftool/dbconfig/20251127-201958-marostegui.json
- 20:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2240.codfw.wmnet with reason: Maintenance
- 20:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1247 (T410531)', diff saved to https://phabricator.wikimedia.org/P85967 and previous config saved to /var/cache/conftool/dbconfig/20251127-201415-marostegui.json
- 20:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1247.eqiad.wmnet with reason: Maintenance
- 20:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2239.codfw.wmnet with reason: Maintenance
- 20:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T410531)', diff saved to https://phabricator.wikimedia.org/P85966 and previous config saved to /var/cache/conftool/dbconfig/20251127-200502-marostegui.json
- 19:59 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T410531)', diff saved to https://phabricator.wikimedia.org/P85965 and previous config saved to /var/cache/conftool/dbconfig/20251127-195925-marostegui.json
- 19:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P85964 and previous config saved to /var/cache/conftool/dbconfig/20251127-194954-marostegui.json
- 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P85963 and previous config saved to /var/cache/conftool/dbconfig/20251127-194417-marostegui.json
- 19:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 19:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 19:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85962 and previous config saved to /var/cache/conftool/dbconfig/20251127-194137-marostegui.json
- 19:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P85961 and previous config saved to /var/cache/conftool/dbconfig/20251127-193446-marostegui.json
- 19:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P85960 and previous config saved to /var/cache/conftool/dbconfig/20251127-192909-marostegui.json
- 19:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P85959 and previous config saved to /var/cache/conftool/dbconfig/20251127-192630-marostegui.json
- 19:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T410531)', diff saved to https://phabricator.wikimedia.org/P85958 and previous config saved to /var/cache/conftool/dbconfig/20251127-191939-marostegui.json
- 19:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T410531)', diff saved to https://phabricator.wikimedia.org/P85957 and previous config saved to /var/cache/conftool/dbconfig/20251127-191402-marostegui.json
- 19:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P85956 and previous config saved to /var/cache/conftool/dbconfig/20251127-191122-marostegui.json
- 19:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2237 (T410531)', diff saved to https://phabricator.wikimedia.org/P85955 and previous config saved to /var/cache/conftool/dbconfig/20251127-190338-marostegui.json
- 19:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2237.codfw.wmnet with reason: Maintenance
- 19:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T410531)', diff saved to https://phabricator.wikimedia.org/P85954 and previous config saved to /var/cache/conftool/dbconfig/20251127-190314-marostegui.json
- 18:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1244 (T410531)', diff saved to https://phabricator.wikimedia.org/P85953 and previous config saved to /var/cache/conftool/dbconfig/20251127-185751-marostegui.json
- 18:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1244.eqiad.wmnet with reason: Maintenance
- 18:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T410531)', diff saved to https://phabricator.wikimedia.org/P85952 and previous config saved to /var/cache/conftool/dbconfig/20251127-185727-marostegui.json
- 18:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85951 and previous config saved to /var/cache/conftool/dbconfig/20251127-185615-marostegui.json
- 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P85950 and previous config saved to /var/cache/conftool/dbconfig/20251127-184806-marostegui.json
- 18:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P85949 and previous config saved to /var/cache/conftool/dbconfig/20251127-184220-marostegui.json
- 18:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P85948 and previous config saved to /var/cache/conftool/dbconfig/20251127-183259-marostegui.json
- 18:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P85947 and previous config saved to /var/cache/conftool/dbconfig/20251127-182712-marostegui.json
- 18:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T410531)', diff saved to https://phabricator.wikimedia.org/P85946 and previous config saved to /var/cache/conftool/dbconfig/20251127-181751-marostegui.json
- 18:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T410531)', diff saved to https://phabricator.wikimedia.org/P85945 and previous config saved to /var/cache/conftool/dbconfig/20251127-181205-marostegui.json
- 18:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2236 (T410531)', diff saved to https://phabricator.wikimedia.org/P85944 and previous config saved to /var/cache/conftool/dbconfig/20251127-180116-marostegui.json
- 18:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2236.codfw.wmnet with reason: Maintenance
- 18:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T410531)', diff saved to https://phabricator.wikimedia.org/P85943 and previous config saved to /var/cache/conftool/dbconfig/20251127-180053-marostegui.json
- 17:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1243 (T410531)', diff saved to https://phabricator.wikimedia.org/P85942 and previous config saved to /var/cache/conftool/dbconfig/20251127-175638-marostegui.json
- 17:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 17:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T410531)', diff saved to https://phabricator.wikimedia.org/P85941 and previous config saved to /var/cache/conftool/dbconfig/20251127-175615-marostegui.json
- 17:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1180 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85940 and previous config saved to /var/cache/conftool/dbconfig/20251127-175513-marostegui.json
- 17:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 17:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85939 and previous config saved to /var/cache/conftool/dbconfig/20251127-175449-marostegui.json
- 17:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P85938 and previous config saved to /var/cache/conftool/dbconfig/20251127-174545-marostegui.json
- 17:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P85937 and previous config saved to /var/cache/conftool/dbconfig/20251127-174107-marostegui.json
- 17:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P85936 and previous config saved to /var/cache/conftool/dbconfig/20251127-173942-marostegui.json
- 17:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P85935 and previous config saved to /var/cache/conftool/dbconfig/20251127-173038-marostegui.json
- 17:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P85934 and previous config saved to /var/cache/conftool/dbconfig/20251127-172559-marostegui.json
- 17:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P85933 and previous config saved to /var/cache/conftool/dbconfig/20251127-172434-marostegui.json
- 17:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T410531)', diff saved to https://phabricator.wikimedia.org/P85932 and previous config saved to /var/cache/conftool/dbconfig/20251127-171530-marostegui.json
- 17:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T410531)', diff saved to https://phabricator.wikimedia.org/P85931 and previous config saved to /var/cache/conftool/dbconfig/20251127-171052-marostegui.json
- 17:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85930 and previous config saved to /var/cache/conftool/dbconfig/20251127-170927-marostegui.json
- 16:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2219 (T410531)', diff saved to https://phabricator.wikimedia.org/P85929 and previous config saved to /var/cache/conftool/dbconfig/20251127-165827-marostegui.json
- 16:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 16:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T410531)', diff saved to https://phabricator.wikimedia.org/P85928 and previous config saved to /var/cache/conftool/dbconfig/20251127-165803-marostegui.json
- 16:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1242 (T410531)', diff saved to https://phabricator.wikimedia.org/P85927 and previous config saved to /var/cache/conftool/dbconfig/20251127-165438-marostegui.json
- 16:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1242.eqiad.wmnet with reason: Maintenance
- 16:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T410531)', diff saved to https://phabricator.wikimedia.org/P85926 and previous config saved to /var/cache/conftool/dbconfig/20251127-165414-marostegui.json
- 16:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P85925 and previous config saved to /var/cache/conftool/dbconfig/20251127-164255-marostegui.json
- 16:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P85924 and previous config saved to /var/cache/conftool/dbconfig/20251127-163907-marostegui.json
- 16:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 16:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P85923 and previous config saved to /var/cache/conftool/dbconfig/20251127-162748-marostegui.json
- 16:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P85922 and previous config saved to /var/cache/conftool/dbconfig/20251127-162359-marostegui.json
- 16:17 moritzm: upgrade Envoy on chartmuseum* T405808
- 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1169.eqiad.wmnet with OS bookworm
- 16:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T410531)', diff saved to https://phabricator.wikimedia.org/P85921 and previous config saved to /var/cache/conftool/dbconfig/20251127-161240-marostegui.json
- 16:08 moritzm: installing unbound security updates
- 16:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T410531)', diff saved to https://phabricator.wikimedia.org/P85920 and previous config saved to /var/cache/conftool/dbconfig/20251127-160852-marostegui.json
- 16:07 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1173 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85919 and previous config saved to /var/cache/conftool/dbconfig/20251127-160723-marostegui.json
- 16:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
- 16:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85918 and previous config saved to /var/cache/conftool/dbconfig/20251127-160659-marostegui.json
- 15:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2210 (T410531)', diff saved to https://phabricator.wikimedia.org/P85915 and previous config saved to /var/cache/conftool/dbconfig/20251127-155559-marostegui.json
- 15:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance
- 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T410531)', diff saved to https://phabricator.wikimedia.org/P85914 and previous config saved to /var/cache/conftool/dbconfig/20251127-155535-marostegui.json
- 15:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
- 15:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1241 (T410531)', diff saved to https://phabricator.wikimedia.org/P85912 and previous config saved to /var/cache/conftool/dbconfig/20251127-155246-marostegui.json
- 15:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1241.eqiad.wmnet with reason: Maintenance
- 15:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T410531)', diff saved to https://phabricator.wikimedia.org/P85911 and previous config saved to /var/cache/conftool/dbconfig/20251127-155223-marostegui.json
- 15:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P85910 and previous config saved to /var/cache/conftool/dbconfig/20251127-155151-marostegui.json
- 15:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
- 15:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P85908 and previous config saved to /var/cache/conftool/dbconfig/20251127-154027-marostegui.json
- 15:39 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs4010.ulsfo.wmnet} and A:liberica
- 15:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P85907 and previous config saved to /var/cache/conftool/dbconfig/20251127-153715-marostegui.json
- 15:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P85906 and previous config saved to /var/cache/conftool/dbconfig/20251127-153644-marostegui.json
- 15:33 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs4010.ulsfo.wmnet} and A:liberica
- 15:32 moritzm: installing libarchive security updates
- 15:31 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS bookworm
- 15:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P85905 and previous config saved to /var/cache/conftool/dbconfig/20251127-152519-marostegui.json
- 15:25 dpogorzelski@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
- 15:24 dpogorzelski@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop: sync
- 15:24 dpogorzelski@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
- 15:24 dpogorzelski@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: sync
- 15:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P85904 and previous config saved to /var/cache/conftool/dbconfig/20251127-152208-marostegui.json
- 15:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85903 and previous config saved to /var/cache/conftool/dbconfig/20251127-152136-marostegui.json
- 15:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T410531)', diff saved to https://phabricator.wikimedia.org/P85902 and previous config saved to /var/cache/conftool/dbconfig/20251127-151012-marostegui.json
- 15:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T410531)', diff saved to https://phabricator.wikimedia.org/P85901 and previous config saved to /var/cache/conftool/dbconfig/20251127-150701-marostegui.json
- 14:57 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 14:57 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
- 14:53 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2206 (T410531)', diff saved to https://phabricator.wikimedia.org/P85900 and previous config saved to /var/cache/conftool/dbconfig/20251127-145307-marostegui.json
- 14:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2206.codfw.wmnet with reason: Maintenance
- 14:50 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1238 (T410531)', diff saved to https://phabricator.wikimedia.org/P85899 and previous config saved to /var/cache/conftool/dbconfig/20251127-145048-marostegui.json
- 14:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 14:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1238.eqiad.wmnet with reason: Maintenance
- 14:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T410531)', diff saved to https://phabricator.wikimedia.org/P85898 and previous config saved to /var/cache/conftool/dbconfig/20251127-145024-marostegui.json
- 14:49 moritzm: installing expat security updates
- 14:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 14:39 sgimeno@deploy2002: Finished scap sync-world: Backport for tokwiki: add logos and sitename (T411119) (duration: 06m 53s)
- 14:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2199.codfw.wmnet with reason: Maintenance
- 14:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T410531)', diff saved to https://phabricator.wikimedia.org/P85897 and previous config saved to /var/cache/conftool/dbconfig/20251127-143813-marostegui.json
- 14:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 14:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 14:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P85896 and previous config saved to /var/cache/conftool/dbconfig/20251127-143517-marostegui.json
- 14:35 sgimeno@deploy2002: anzx, sgimeno: Continuing with sync
- 14:34 sgimeno@deploy2002: anzx, sgimeno: Backport for tokwiki: add logos and sitename (T411119) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:32 sgimeno@deploy2002: Started scap sync-world: Backport for tokwiki: add logos and sitename (T411119)
- 14:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1229 (T410589)', diff saved to https://phabricator.wikimedia.org/P85895 and previous config saved to /var/cache/conftool/dbconfig/20251127-143123-ladsgroup.json
- 14:31 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database tokwiki (T404570)
- 14:31 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 14:28 sgimeno@deploy2002: Finished scap sync-world: Backport for fix(ReviseTone): only initialize once, fix(ReviseTone): render behind EditNotice on mobile, instrumentation(ReviseTone): fix stream for edits and refine exposure (T405177 T406252) (duration: 10m 46s)
- 14:25 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1168 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85894 and previous config saved to /var/cache/conftool/dbconfig/20251127-142533-marostegui.json
- 14:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 14:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85893 and previous config saved to /var/cache/conftool/dbconfig/20251127-142509-marostegui.json
- 14:24 sgimeno@deploy2002: sgimeno, migr: Continuing with sync
- 14:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P85892 and previous config saved to /var/cache/conftool/dbconfig/20251127-142306-marostegui.json
- 14:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P85891 and previous config saved to /var/cache/conftool/dbconfig/20251127-142009-marostegui.json
- 14:19 sgimeno@deploy2002: sgimeno, migr: Backport for fix(ReviseTone): only initialize once, fix(ReviseTone): render behind EditNotice on mobile, instrumentation(ReviseTone): fix stream for edits and refine exposure (T405177 T406252) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:18 sgimeno@deploy2002: Started scap sync-world: Backport for fix(ReviseTone): only initialize once, fix(ReviseTone): render behind EditNotice on mobile, instrumentation(ReviseTone): fix stream for edits and refine exposure (T405177 T406252)
- 14:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P85889 and previous config saved to /var/cache/conftool/dbconfig/20251127-141002-marostegui.json
- 14:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P85888 and previous config saved to /var/cache/conftool/dbconfig/20251127-140758-marostegui.json
- 14:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T410531)', diff saved to https://phabricator.wikimedia.org/P85887 and previous config saved to /var/cache/conftool/dbconfig/20251127-140502-marostegui.json
- 13:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P85886 and previous config saved to /var/cache/conftool/dbconfig/20251127-135454-marostegui.json
- 13:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T410531)', diff saved to https://phabricator.wikimedia.org/P85885 and previous config saved to /var/cache/conftool/dbconfig/20251127-135251-marostegui.json
- 13:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1221 (T410531)', diff saved to https://phabricator.wikimedia.org/P85884 and previous config saved to /var/cache/conftool/dbconfig/20251127-134816-marostegui.json
- 13:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on 6 hosts with reason: Maintenance
- 13:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T410531)', diff saved to https://phabricator.wikimedia.org/P85883 and previous config saved to /var/cache/conftool/dbconfig/20251127-134730-marostegui.json
- 13:41 dcausse@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 13:40 dcausse@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 13:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85881 and previous config saved to /var/cache/conftool/dbconfig/20251127-133947-marostegui.json
- 13:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1165 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85880 and previous config saved to /var/cache/conftool/dbconfig/20251127-133838-marostegui.json
- 13:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 13:38 dcausse@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 13:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repool db1165', diff saved to https://phabricator.wikimedia.org/P85879 and previous config saved to /var/cache/conftool/dbconfig/20251127-133813-marostegui.json
- 13:37 dcausse@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 13:34 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database tokwiki (T404570)
- 13:34 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 13:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2172 (T410531)', diff saved to https://phabricator.wikimedia.org/P85878 and previous config saved to /var/cache/conftool/dbconfig/20251127-133411-marostegui.json
- 13:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 13:34 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 13:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T410531)', diff saved to https://phabricator.wikimedia.org/P85877 and previous config saved to /var/cache/conftool/dbconfig/20251127-133347-marostegui.json
- 13:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P85875 and previous config saved to /var/cache/conftool/dbconfig/20251127-133223-marostegui.json
- 13:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P85873 and previous config saved to /var/cache/conftool/dbconfig/20251127-132226-marostegui.json
- 13:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P85872 and previous config saved to /var/cache/conftool/dbconfig/20251127-131839-marostegui.json
- 13:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P85871 and previous config saved to /var/cache/conftool/dbconfig/20251127-131715-marostegui.json
- 13:14 taavi@deploy2002: mwscript-k8s job started: initEditCount --wiki=tokwiki
- 13:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85870 and previous config saved to /var/cache/conftool/dbconfig/20251127-130719-marostegui.json
- 13:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1165 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P85869 and previous config saved to /var/cache/conftool/dbconfig/20251127-130509-marostegui.json
- 13:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 13:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P85868 and previous config saved to /var/cache/conftool/dbconfig/20251127-130332-marostegui.json
- 13:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
- 13:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T410531)', diff saved to https://phabricator.wikimedia.org/P85867 and previous config saved to /var/cache/conftool/dbconfig/20251127-130208-marostegui.json
- 12:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
- 12:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T410531)', diff saved to https://phabricator.wikimedia.org/P85866 and previous config saved to /var/cache/conftool/dbconfig/20251127-124824-marostegui.json
- 12:43 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1199 (T410531)', diff saved to https://phabricator.wikimedia.org/P85865 and previous config saved to /var/cache/conftool/dbconfig/20251127-124353-marostegui.json
- 12:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1199.eqiad.wmnet with reason: Maintenance
- 12:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T410531)', diff saved to https://phabricator.wikimedia.org/P85864 and previous config saved to /var/cache/conftool/dbconfig/20251127-124330-marostegui.json
- 12:34 moritzm: installing sqlite3 security updates
- 12:30 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Handle requests without a token (T411166) (duration: 10m 50s)
- 12:29 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2155 (T410531)', diff saved to https://phabricator.wikimedia.org/P85863 and previous config saved to /var/cache/conftool/dbconfig/20251127-122941-marostegui.json
- 12:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 12:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T410531)', diff saved to https://phabricator.wikimedia.org/P85862 and previous config saved to /var/cache/conftool/dbconfig/20251127-122918-marostegui.json
- 12:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P85861 and previous config saved to /var/cache/conftool/dbconfig/20251127-122822-marostegui.json
- 12:26 kharlan@deploy2002: kharlan: Continuing with sync
- 12:21 kharlan@deploy2002: kharlan: Backport for hCaptcha: Handle requests without a token (T411166) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:19 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2164 gradually with 4 steps - repool after schema change
- 12:19 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Handle requests without a token (T411166)
- 12:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P85859 and previous config saved to /var/cache/conftool/dbconfig/20251127-121410-marostegui.json
- 12:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P85858 and previous config saved to /var/cache/conftool/dbconfig/20251127-121315-marostegui.json
- 11:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P85856 and previous config saved to /var/cache/conftool/dbconfig/20251127-115903-marostegui.json
- 11:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T410531)', diff saved to https://phabricator.wikimedia.org/P85855 and previous config saved to /var/cache/conftool/dbconfig/20251127-115807-marostegui.json
- 11:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 11:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T410531)', diff saved to https://phabricator.wikimedia.org/P85852 and previous config saved to /var/cache/conftool/dbconfig/20251127-114355-marostegui.json
- 11:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1190 (T410531)', diff saved to https://phabricator.wikimedia.org/P85851 and previous config saved to /var/cache/conftool/dbconfig/20251127-114120-marostegui.json
- 11:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1190.eqiad.wmnet with reason: Maintenance
- 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 11:34 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2164 gradually with 4 steps - repool after schema change
- 11:33 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2164', diff saved to https://phabricator.wikimedia.org/P85849 and previous config saved to /var/cache/conftool/dbconfig/20251127-113315-marostegui.json
- 11:33 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) db2164 gradually with 4 steps - repool after schema change
- 11:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2164 gradually with 4 steps - repool after schema change
- 11:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2147 (T410531)', diff saved to https://phabricator.wikimedia.org/P85846 and previous config saved to /var/cache/conftool/dbconfig/20251127-112453-marostegui.json
- 11:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 11:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 11:19 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) db2164.codfw.wmnet gradually with 4 steps - repool after schema change
- 11:19 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2164.codfw.wmnet gradually with 4 steps - repool after schema change
- 11:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 11:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 11:18 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) db2164.codfw.wmnet gradually with 4 steps - repool after schema change
- 11:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2164.codfw.wmnet gradually with 4 steps - repool after schema change
- 11:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) db2164.codfw.wmnet quickly with 2 steps - repool after schema change
- 11:17 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2164.codfw.wmnet quickly with 2 steps - repool after schema change
- 11:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2164', diff saved to https://phabricator.wikimedia.org/P85843 and previous config saved to /var/cache/conftool/dbconfig/20251127-111712-marostegui.json
- 11:16 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) db2164.codfw.wmnet gradually with 4 steps - repool after schema change
- 11:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2164.codfw.wmnet gradually with 4 steps - repool after schema change
- 11:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P85842 and previous config saved to /var/cache/conftool/dbconfig/20251127-110850-marostegui.json
- 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
- 10:59 a-pizzata@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
- 10:58 a-pizzata@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
- 10:57 a-pizzata@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
- 10:57 a-pizzata@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
- 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
- 10:55 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) maps1011.eqiad.wmnet on all recursors
- 10:54 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache maps1011.eqiad.wmnet on all recursors
- 10:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps1011 - ayounsi@cumin1003"
- 10:54 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps1011 - ayounsi@cumin1003"
- 10:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T410531)', diff saved to https://phabricator.wikimedia.org/P85841 and previous config saved to /var/cache/conftool/dbconfig/20251127-105341-marostegui.json
- 10:51 btullis@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
- 10:51 btullis@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: sync
- 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2164 (T410531)', diff saved to https://phabricator.wikimedia.org/P85840 and previous config saved to /var/cache/conftool/dbconfig/20251127-104641-marostegui.json
- 10:46 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 10:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2164.codfw.wmnet with reason: Maintenance
- 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T410531)', diff saved to https://phabricator.wikimedia.org/P85839 and previous config saved to /var/cache/conftool/dbconfig/20251127-104617-marostegui.json
- 10:39 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2003-dev.codfw.wmnet
- 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
- 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1187 (T299441)', diff saved to https://phabricator.wikimedia.org/P85838 and previous config saved to /var/cache/conftool/dbconfig/20251127-103901-fceratto.json
- 10:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 10:37 taavi@cumin1003: conftool action : set/pooled=yes:weight=100; selector: cluster=wikireplica-db-web,service=x4
- 10:37 taavi@cumin1003: conftool action : set/pooled=yes:weight=100; selector: cluster=wikireplica-db-analytics,service=x4
- 10:35 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
- 10:33 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudgw2003-dev.codfw.wmnet
- 10:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
- 10:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P85837 and previous config saved to /var/cache/conftool/dbconfig/20251127-103109-marostegui.json
- 10:28 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 10:28 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:27 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) maps1012.eqiad.wmnet on all recursors
- 10:27 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache maps1012.eqiad.wmnet on all recursors
- 10:27 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:27 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps1012 - ayounsi@cumin1003"
- 10:27 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps1012 - ayounsi@cumin1003"
- 10:25 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 10:25 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:25 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
- 10:22 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 10:18 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
- 10:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
- 10:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P85836 and previous config saved to /var/cache/conftool/dbconfig/20251127-101601-marostegui.json
- 10:13 arnaudb@cumin1003: END (PASS) - Cookbook sre.gerrit.localbackup (exit_code=0) Prepare local backup on: gerrit2003.wikimedia.org
- 10:13 arnaudb@cumin1003: START - Cookbook sre.gerrit.localbackup Prepare local backup on: gerrit2003.wikimedia.org
- 10:13 arnaudb@cumin1003: END (PASS) - Cookbook sre.gerrit.localbackup (exit_code=0) Prepare local backup on: gerrit2003.wikimedia.org
- 10:10 dcausse@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 10:10 dcausse@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
- 10:08 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
- 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) maps1013.eqiad.wmnet on all recursors
- 10:06 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache maps1013.eqiad.wmnet on all recursors
- 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps1013 - ayounsi@cumin1003"
- 10:06 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps1013 - ayounsi@cumin1003"
- 10:05 kharlan@deploy2002: Finished scap sync-world: Backport for CheckUser/UserInfoCard: Enable by default for some privileged groups on enwiki (T409840) (duration: 09m 11s)
- 10:03 arnaudb@cumin1003: START - Cookbook sre.gerrit.localbackup Prepare local backup on: gerrit2003.wikimedia.org
- 10:02 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 10:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T410531)', diff saved to https://phabricator.wikimedia.org/P85835 and previous config saved to /var/cache/conftool/dbconfig/20251127-100054-marostegui.json
- 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
- 10:00 kharlan@deploy2002: kharlan: Continuing with sync
- 09:58 kharlan@deploy2002: kharlan: Backport for CheckUser/UserInfoCard: Enable by default for some privileged groups on enwiki (T409840) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:56 kharlan@deploy2002: Started scap sync-world: Backport for CheckUser/UserInfoCard: Enable by default for some privileged groups on enwiki (T409840)
- 09:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2163 (T410531)', diff saved to https://phabricator.wikimedia.org/P85834 and previous config saved to /var/cache/conftool/dbconfig/20251127-095402-marostegui.json
- 09:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2163.codfw.wmnet with reason: Maintenance
- 09:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T410531)', diff saved to https://phabricator.wikimedia.org/P85833 and previous config saved to /var/cache/conftool/dbconfig/20251127-095339-marostegui.json
- 09:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
- 09:49 moritzm: installing krb5 security updates
- 09:45 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) maps1014.eqiad.wmnet on all recursors
- 09:45 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache maps1014.eqiad.wmnet on all recursors
- 09:43 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:43 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps1014 - ayounsi@cumin1003"
- 09:43 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps1014 - ayounsi@cumin1003"
- 09:39 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P85832 and previous config saved to /var/cache/conftool/dbconfig/20251127-093831-marostegui.json
- 09:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P85831 and previous config saved to /var/cache/conftool/dbconfig/20251127-092323-marostegui.json
- 09:21 moritzm: upgrade Envoy on cloudweb* T405808
- 09:11 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.4 refs T408274
- 09:11 moritzm: installing gdk-pixbuf security updates
- 09:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T410531)', diff saved to https://phabricator.wikimedia.org/P85830 and previous config saved to /var/cache/conftool/dbconfig/20251127-090816-marostegui.json
- 09:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2154 (T410531)', diff saved to https://phabricator.wikimedia.org/P85829 and previous config saved to /var/cache/conftool/dbconfig/20251127-090107-marostegui.json
- 09:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2154.codfw.wmnet with reason: Maintenance
- 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T410531)', diff saved to https://phabricator.wikimedia.org/P85828 and previous config saved to /var/cache/conftool/dbconfig/20251127-090043-marostegui.json
- 08:51 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 08:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P85827 and previous config saved to /var/cache/conftool/dbconfig/20251127-084535-marostegui.json
- 08:36 ryankemper: T410573 Rebooting `apifeatureusage[1,2]001*`, one host at a time
- 08:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P85826 and previous config saved to /var/cache/conftool/dbconfig/20251127-083028-marostegui.json
- 08:18 moritzm: installing perl security updates
- 08:17 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 08:17 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 08:16 dcausse: closing the UTC morning backport window
- 08:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T410531)', diff saved to https://phabricator.wikimedia.org/P85825 and previous config saved to /var/cache/conftool/dbconfig/20251127-081520-marostegui.json
- 08:14 dcausse@deploy2002: Finished scap sync-world: Backport for cirrus: enable DWIM wrong keyboard second try on all he & ru wikis (T408734) (duration: 08m 37s)
- 08:10 dcausse@deploy2002: dcausse: Continuing with sync
- 08:08 dcausse@deploy2002: dcausse: Backport for cirrus: enable DWIM wrong keyboard second try on all he & ru wikis (T408734) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2152 (T410531)', diff saved to https://phabricator.wikimedia.org/P85824 and previous config saved to /var/cache/conftool/dbconfig/20251127-080823-marostegui.json
- 08:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 08:06 dcausse@deploy2002: Started scap sync-world: Backport for cirrus: enable DWIM wrong keyboard second try on all he & ru wikis (T408734)
- 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1209.eqiad.wmnet with reason: Maintenance
- 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Fully repool db1172', diff saved to https://phabricator.wikimedia.org/P85823 and previous config saved to /var/cache/conftool/dbconfig/20251127-075541-marostegui.json
- 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P85822 and previous config saved to /var/cache/conftool/dbconfig/20251127-075259-marostegui.json
- 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P85821 and previous config saved to /var/cache/conftool/dbconfig/20251127-073751-marostegui.json
- 07:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T410531)', diff saved to https://phabricator.wikimedia.org/P85820 and previous config saved to /var/cache/conftool/dbconfig/20251127-072243-marostegui.json
- 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1172 (T410531)', diff saved to https://phabricator.wikimedia.org/P85819 and previous config saved to /var/cache/conftool/dbconfig/20251127-071538-marostegui.json
- 07:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1172.eqiad.wmnet with reason: Maintenance
- 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T410531)', diff saved to https://phabricator.wikimedia.org/P85818 and previous config saved to /var/cache/conftool/dbconfig/20251127-071004-marostegui.json
- 06:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P85817 and previous config saved to /var/cache/conftool/dbconfig/20251127-065456-marostegui.json
- 06:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P85816 and previous config saved to /var/cache/conftool/dbconfig/20251127-063949-marostegui.json
- 06:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T410531)', diff saved to https://phabricator.wikimedia.org/P85815 and previous config saved to /var/cache/conftool/dbconfig/20251127-062441-marostegui.json
- 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis tokwiki in section s5
- 06:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1167 (T410531)', diff saved to https://phabricator.wikimedia.org/P85814 and previous config saved to /var/cache/conftool/dbconfig/20251127-061733-marostegui.json
- 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 06:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2218.codfw.wmnet with reason: Maintenance
- 06:11 marostegui@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis tokwiki in section s5
- 02:28 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 02:27 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T410589)', diff saved to https://phabricator.wikimedia.org/P85812 and previous config saved to /var/cache/conftool/dbconfig/20251127-022749-ladsgroup.json
- 02:12 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P85811 and previous config saved to /var/cache/conftool/dbconfig/20251127-021241-ladsgroup.json
- 01:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P85810 and previous config saved to /var/cache/conftool/dbconfig/20251127-015733-ladsgroup.json
- 01:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T410589)', diff saved to https://phabricator.wikimedia.org/P85809 and previous config saved to /var/cache/conftool/dbconfig/20251127-014226-ladsgroup.json
- 01:36 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 01:35 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 01:34 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 01:34 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 01:33 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 01:33 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 01:26 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 01:25 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 01:25 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 01:24 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 01:23 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 01:23 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 12m 55s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:30 eileen: civicrm upgraded from 96715bf8 to 4437a5ef - deployed UI caching improvement
- 00:12 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lswtest-d8-eqiad with reason: doing network tests
- 00:12 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on sretest1006.eqiad.wmnet with reason: doing network tests
- 00:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "import lswtest-d8-eqiad - cmooney@cumin1003"
- 00:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "import lswtest-d8-eqiad - cmooney@cumin1003"
2025-11-26
- 23:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:51 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 23:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T410531)', diff saved to https://phabricator.wikimedia.org/P85808 and previous config saved to /var/cache/conftool/dbconfig/20251126-234646-marostegui.json
- 23:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P85807 and previous config saved to /var/cache/conftool/dbconfig/20251126-233138-marostegui.json
- 23:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P85806 and previous config saved to /var/cache/conftool/dbconfig/20251126-231631-marostegui.json
- 23:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T410531)', diff saved to https://phabricator.wikimedia.org/P85805 and previous config saved to /var/cache/conftool/dbconfig/20251126-230123-marostegui.json
- 22:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T410531)', diff saved to https://phabricator.wikimedia.org/P85804 and previous config saved to /var/cache/conftool/dbconfig/20251126-225556-marostegui.json
- 22:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2222.codfw.wmnet with reason: Maintenance
- 22:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T410531)', diff saved to https://phabricator.wikimedia.org/P85803 and previous config saved to /var/cache/conftool/dbconfig/20251126-225532-marostegui.json
- 22:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P85802 and previous config saved to /var/cache/conftool/dbconfig/20251126-224025-marostegui.json
- 22:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P85801 and previous config saved to /var/cache/conftool/dbconfig/20251126-222517-marostegui.json
- 22:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T410531)', diff saved to https://phabricator.wikimedia.org/P85800 and previous config saved to /var/cache/conftool/dbconfig/20251126-221010-marostegui.json
- 22:04 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T410531)', diff saved to https://phabricator.wikimedia.org/P85799 and previous config saved to /var/cache/conftool/dbconfig/20251126-220437-marostegui.json
- 22:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2221.codfw.wmnet with reason: Maintenance
- 22:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T410531)', diff saved to https://phabricator.wikimedia.org/P85798 and previous config saved to /var/cache/conftool/dbconfig/20251126-220414-marostegui.json
- 21:50 zabe@deploy2002: Finished scap sync-world: Backport for Remove fundraiseup domains from donatewiki CSP (T410737) (duration: 10m 34s)
- 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P85797 and previous config saved to /var/cache/conftool/dbconfig/20251126-214906-marostegui.json
- 21:45 zabe@deploy2002: ejegg, zabe: Continuing with sync
- 21:42 zabe@deploy2002: ejegg, zabe: Backport for Remove fundraiseup domains from donatewiki CSP (T410737) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:39 zabe@deploy2002: Started scap sync-world: Backport for Remove fundraiseup domains from donatewiki CSP (T410737)
- 21:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P85796 and previous config saved to /var/cache/conftool/dbconfig/20251126-213358-marostegui.json
- 21:32 cscott@deploy2002: Finished scap sync-world: Backport for Bump wikimedia/parsoid to 0.23.0-a7 (T204307 T373253 T410826 T410960), Bump wikimedia/parsoid to 0.23.0-a7 (T410960) (duration: 12m 38s)
- 21:28 cscott@deploy2002: cscott: Continuing with sync
- 21:22 cscott@deploy2002: cscott: Backport for Bump wikimedia/parsoid to 0.23.0-a7 (T204307 T373253 T410826 T410960), Bump wikimedia/parsoid to 0.23.0-a7 (T410960) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:19 cscott@deploy2002: Started scap sync-world: Backport for Bump wikimedia/parsoid to 0.23.0-a7 (T204307 T373253 T410826 T410960), Bump wikimedia/parsoid to 0.23.0-a7 (T410960)
- 21:19 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T410573
- 21:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T410531)', diff saved to https://phabricator.wikimedia.org/P85795 and previous config saved to /var/cache/conftool/dbconfig/20251126-211851-marostegui.json
- 21:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2220 (T410531)', diff saved to https://phabricator.wikimedia.org/P85794 and previous config saved to /var/cache/conftool/dbconfig/20251126-211326-marostegui.json
- 21:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2220.codfw.wmnet with reason: Maintenance
- 21:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T410531)', diff saved to https://phabricator.wikimedia.org/P85793 and previous config saved to /var/cache/conftool/dbconfig/20251126-211302-marostegui.json
- 20:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P85792 and previous config saved to /var/cache/conftool/dbconfig/20251126-205754-marostegui.json
- 20:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P85791 and previous config saved to /var/cache/conftool/dbconfig/20251126-204246-marostegui.json
- 20:37 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1006.eqiad.wmnet with OS trixie
- 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T410531)', diff saved to https://phabricator.wikimedia.org/P85790 and previous config saved to /var/cache/conftool/dbconfig/20251126-202739-marostegui.json
- 20:22 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T410531)', diff saved to https://phabricator.wikimedia.org/P85789 and previous config saved to /var/cache/conftool/dbconfig/20251126-202213-marostegui.json
- 20:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2208.codfw.wmnet with reason: Maintenance
- 20:19 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on sretest1006.eqiad.wmnet with reason: host reimage
- 20:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
- 20:16 zabe@deploy2002: Finished scap sync-world: Backport for RestrictionStore: Check for no up to date cascade protections (T411092) (duration: 06m 56s)
- 20:16 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet with OS trixie
- 20:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
- 20:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T410531)', diff saved to https://phabricator.wikimedia.org/P85788 and previous config saved to /var/cache/conftool/dbconfig/20251126-201315-marostegui.json
- 20:13 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1006.eqiad.wmnet with reason: host reimage
- 20:12 zabe@deploy2002: zabe: Continuing with sync
- 20:11 zabe@deploy2002: zabe: Backport for RestrictionStore: Check for no up to date cascade protections (T411092) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:09 zabe@deploy2002: Started scap sync-world: Backport for RestrictionStore: Check for no up to date cascade protections (T411092)
- 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P85787 and previous config saved to /var/cache/conftool/dbconfig/20251126-195807-marostegui.json
- 19:56 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
- 19:53 cmooney@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
- 19:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P85786 and previous config saved to /var/cache/conftool/dbconfig/20251126-194300-marostegui.json
- 19:38 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol2005-dev.codfw.wmnet with reason: host reimage
- 19:32 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol2005-dev.codfw.wmnet with reason: host reimage
- 19:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T410531)', diff saved to https://phabricator.wikimedia.org/P85785 and previous config saved to /var/cache/conftool/dbconfig/20251126-192752-marostegui.json
- 19:22 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T410531)', diff saved to https://phabricator.wikimedia.org/P85784 and previous config saved to /var/cache/conftool/dbconfig/20251126-192207-marostegui.json
- 19:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 19:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T410531)', diff saved to https://phabricator.wikimedia.org/P85783 and previous config saved to /var/cache/conftool/dbconfig/20251126-192143-marostegui.json
- 19:21 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
- 19:20 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host sretest1006.eqiad.wmnet
- 19:19 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest1006.eqiad.wmnet on all recursors
- 19:19 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache sretest1006.eqiad.wmnet on all recursors
- 19:19 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:19 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change IPs for sretest1006 - cmooney@cumin1003"
- 19:19 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change IPs for sretest1006 - cmooney@cumin1003"
- 19:15 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 19:15 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2005-dev.codfw.wmnet with OS trixie
- 19:13 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T410573
- 19:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P85782 and previous config saved to /var/cache/conftool/dbconfig/20251126-190636-marostegui.json
- 18:57 taavi@deploy2002: Finished scap sync-world: Backport for Update interwiki cache (duration: 06m 59s)
- 18:53 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
- 18:53 taavi@deploy2002: taavi: Continuing with sync
- 18:53 eevans@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
- 18:53 taavi@deploy2002: taavi: Backport for Update interwiki cache synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P85781 and previous config saved to /var/cache/conftool/dbconfig/20251126-185129-marostegui.json
- 18:50 taavi@deploy2002: Started scap sync-world: Backport for Update interwiki cache
- 18:50 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
- 18:50 eevans@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
- 18:46 taavi@deploy2002: Finished scap sync-world: Backport for Allow account creation on tokwiki (T404457) (duration: 07m 45s)
- 18:42 taavi@deploy2002: taavi: Continuing with sync
- 18:41 taavi@deploy2002: taavi: Backport for Allow account creation on tokwiki (T404457) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:39 taavi@deploy2002: Started scap sync-world: Backport for Allow account creation on tokwiki (T404457)
- 18:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T410531)', diff saved to https://phabricator.wikimedia.org/P85780 and previous config saved to /var/cache/conftool/dbconfig/20251126-183622-marostegui.json
- 18:36 taavi: attach imported tokwiki users to CentralAuth T404573
- 18:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
- 18:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T410531)', diff saved to https://phabricator.wikimedia.org/P85779 and previous config saved to /var/cache/conftool/dbconfig/20251126-183031-marostegui.json
- 18:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 18:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T410531)', diff saved to https://phabricator.wikimedia.org/P85778 and previous config saved to /var/cache/conftool/dbconfig/20251126-183007-marostegui.json
- 18:19 Dreamy_Jazz: Deployed private code change for T410300
- 18:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P85775 and previous config saved to /var/cache/conftool/dbconfig/20251126-181500-marostegui.json
- 18:01 taavi@deploy2002: mwscript-k8s job started: rebuildall.php --wiki=tokwiki # T404573
- 18:00 taavi@deploy2002: mwscript-k8s job started: initSiteStats.php --wiki=tokwiki # T404573
- 18:00 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Add SuggestedInvestigationsRevisionsPager (T410300), Add SuggestedInvestigationsRevisionsPager (T410300) (duration: 07m 03s)
- 17:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P85774 and previous config saved to /var/cache/conftool/dbconfig/20251126-175952-marostegui.json
- 17:56 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 17:55 dreamyjazz@deploy2002: dreamyjazz: Backport for Add SuggestedInvestigationsRevisionsPager (T410300), Add SuggestedInvestigationsRevisionsPager (T410300) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:53 dreamyjazz@deploy2002: Started scap sync-world: Backport for Add SuggestedInvestigationsRevisionsPager (T410300), Add SuggestedInvestigationsRevisionsPager (T410300)
- 17:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T410531)', diff saved to https://phabricator.wikimedia.org/P85773 and previous config saved to /var/cache/conftool/dbconfig/20251126-174445-marostegui.json
- 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (T410531)', diff saved to https://phabricator.wikimedia.org/P85772 and previous config saved to /var/cache/conftool/dbconfig/20251126-173857-marostegui.json
- 17:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T410531)', diff saved to https://phabricator.wikimedia.org/P85771 and previous config saved to /var/cache/conftool/dbconfig/20251126-173833-marostegui.json
- 17:34 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Log the hCaptcha token (T411096) (duration: 08m 15s)
- 17:32 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 17:30 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 17:29 kharlan@deploy2002: kharlan: Continuing with sync
- 17:28 kharlan@deploy2002: kharlan: Backport for hCaptcha: Log the hCaptcha token (T411096) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:25 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Log the hCaptcha token (T411096)
- 17:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P85770 and previous config saved to /var/cache/conftool/dbconfig/20251126-172325-marostegui.json
- 17:18 taavi: taavi@deploy2002 ~ $ mwscript importDump.php --wiki=tokwiki --no-updates --username-prefix="" < /home/taavi/tokwiki/wikipesija-2025-11-26-rewritten.xml # T404573
- 17:15 jgleeson: payments-wiki upgraded from f254a5ab to bb179e9c
- 17:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P85769 and previous config saved to /var/cache/conftool/dbconfig/20251126-170817-marostegui.json
- 17:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1197 (T410589)', diff saved to https://phabricator.wikimedia.org/P85768 and previous config saved to /var/cache/conftool/dbconfig/20251126-170054-ladsgroup.json
- 17:00 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 17:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T410589)', diff saved to https://phabricator.wikimedia.org/P85766 and previous config saved to /var/cache/conftool/dbconfig/20251126-170031-ladsgroup.json
- 16:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T410531)', diff saved to https://phabricator.wikimedia.org/P85765 and previous config saved to /var/cache/conftool/dbconfig/20251126-165309-marostegui.json
- 16:52 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
- 16:52 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet
- 16:52 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet
- 16:51 moritzm: installing Perl security updates
- 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet
- 16:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2150 (T410531)', diff saved to https://phabricator.wikimedia.org/P85764 and previous config saved to /var/cache/conftool/dbconfig/20251126-164722-marostegui.json
- 16:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 16:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1236.eqiad.wmnet with reason: Maintenance
- 16:46 ejegg: donorwiki upgraded from 2a73a08d to bb179e9c
- 16:45 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P85763 and previous config saved to /var/cache/conftool/dbconfig/20251126-164523-ladsgroup.json
- 16:45 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) lswtest-d8-eqiad.mgmt.eqiad.wmnet on all recursors
- 16:45 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache lswtest-d8-eqiad.mgmt.eqiad.wmnet on all recursors
- 16:45 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 16:45 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 16:36 taavi@deploy2002: mwscript-k8s job started: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https # T404571
- 16:36 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lswtest-d8-eqiad
- 16:36 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lswtest-d8-eqiad
- 16:35 taavi@deploy2002: Finished scap sync-world: Backport for Set up tokwiki namespaces (T404457) (duration: 08m 17s)
- 16:30 taavi@deploy2002: taavi: Continuing with sync
- 16:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P85762 and previous config saved to /var/cache/conftool/dbconfig/20251126-163015-ladsgroup.json
- 16:29 taavi@deploy2002: taavi: Backport for Set up tokwiki namespaces (T404457) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:27 taavi@deploy2002: Started scap sync-world: Backport for Set up tokwiki namespaces (T404457)
- 16:24 andrew@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on clouddumps1002.wikimedia.org with reason: T411025
- 16:24 taavi@deploy2002: Finished scap sync-world: Backport for Activate tokwiki (T404457) (duration: 07m 27s)
- 16:23 andrew@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on cloudweb1004.wikimedia.org with reason: T411025
- 16:20 taavi@deploy2002: taavi: Continuing with sync
- 16:19 taavi@deploy2002: taavi: Backport for Activate tokwiki (T404457) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:16 taavi@deploy2002: Started scap sync-world: Backport for Activate tokwiki (T404457)
- 16:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T410589)', diff saved to https://phabricator.wikimedia.org/P85761 and previous config saved to /var/cache/conftool/dbconfig/20251126-161508-ladsgroup.json
- 16:12 taavi@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/addWiki.php --wiki=tokwiki # T404457
- 16:10 taavi@deploy2002: Finished scap sync-world: Backport for Initial configuration for tokwiki (T404457) (duration: 07m 15s)
- 16:08 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:07 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 16:06 taavi@deploy2002: taavi: Continuing with sync
- 16:05 taavi@deploy2002: taavi: Backport for Initial configuration for tokwiki (T404457) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:05 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 16:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:03 taavi@deploy2002: Started scap sync-world: Backport for Initial configuration for tokwiki (T404457)
- 16:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 16:01 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 15:56 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
- 15:56 elukey@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
- 15:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 15:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 15:48 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb[1017-1020].eqiad.wmnet with reason: moving to a new switch
- 15:47 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 15:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 15:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 15:39 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 15:28 dpogorzelski@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
- 15:28 dpogorzelski@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop: sync
- 15:27 dpogorzelski@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
- 15:27 dpogorzelski@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: sync
- 15:21 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet
- 15:20 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet
- 15:20 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet
- 15:20 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet
- 15:11 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
- 15:10 elukey@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
- 15:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new link ips to lswtest - cmooney@cumin1003"
- 15:01 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new link ips to lswtest - cmooney@cumin1003"
- 14:58 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 14:53 elukey@deploy2002: Finished scap sync-world: Backport for Add a staging-specific stream for Maps tiles change (T409528) (duration: 06m 41s)
- 14:50 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:49 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new link ips to lswtest - cmooney@cumin1003"
- 14:49 elukey@deploy2002: elukey: Continuing with sync
- 14:49 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new link ips to lswtest - cmooney@cumin1003"
- 14:49 elukey@deploy2002: elukey: Backport for Add a staging-specific stream for Maps tiles change (T409528) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:47 elukey@deploy2002: Started scap sync-world: Backport for Add a staging-specific stream for Maps tiles change (T409528)
- 14:46 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 14:45 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 14:43 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 14:37 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on sretest1006.eqiad.wmnet with reason: changing host to uefi mode boot
- 14:25 sbisson@deploy2002: Finished scap sync-world: Backport for CX3 Build 1.0.0+20251126 (T384485) (duration: 08m 52s)
- 14:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 14:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T410531)', diff saved to https://phabricator.wikimedia.org/P85758 and previous config saved to /var/cache/conftool/dbconfig/20251126-142445-marostegui.json
- 14:21 sbisson@deploy2002: sbisson: Continuing with sync
- 14:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 14:18 sbisson@deploy2002: sbisson: Backport for CX3 Build 1.0.0+20251126 (T384485) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 14:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 14:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 14:16 sbisson@deploy2002: Started scap sync-world: Backport for CX3 Build 1.0.0+20251126 (T384485)
- 14:09 kharlan@deploy2002: Finished scap sync-world: Backport for MonologChannels: Add WikiEditor (T410877), Hooks: Log the status message when responseUnknown occurs (T410877) (duration: 10m 39s)
- 14:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P85757 and previous config saved to /var/cache/conftool/dbconfig/20251126-140937-marostegui.json
- 14:08 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
- 14:05 kharlan@deploy2002: kharlan: Continuing with sync
- 14:03 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
- 14:01 kharlan@deploy2002: kharlan: Backport for MonologChannels: Add WikiEditor (T410877), Hooks: Log the status message when responseUnknown occurs (T410877) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:59 kharlan@deploy2002: Started scap sync-world: Backport for MonologChannels: Add WikiEditor (T410877), Hooks: Log the status message when responseUnknown occurs (T410877)
- 13:57 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2166 gradually with 4 steps - Repooling
- 13:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P85755 and previous config saved to /var/cache/conftool/dbconfig/20251126-135429-marostegui.json
- 13:42 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Only set $wgGlobalBlockingAutoblockExemptions if GlobalBlocking used (T409915), Set $wgGlobalBlockingAutoblockExemptions (T409915), Follow-up: Set $wgGlobalBlockingAutoblockExemptions (T409915) (duration: 06m 51s)
- 13:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T410531)', diff saved to https://phabricator.wikimedia.org/P85753 and previous config saved to /var/cache/conftool/dbconfig/20251126-133922-marostegui.json
- 13:37 dreamyjazz@deploy2002: dreamyjazz, taavi: Continuing with sync
- 13:37 dreamyjazz@deploy2002: dreamyjazz, taavi: Backport for Only set $wgGlobalBlockingAutoblockExemptions if GlobalBlocking used (T409915), Set $wgGlobalBlockingAutoblockExemptions (T409915), Follow-up: Set $wgGlobalBlockingAutoblockExemptions (T409915) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:37 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1253 (T410531)', diff saved to https://phabricator.wikimedia.org/P85752 and previous config saved to /var/cache/conftool/dbconfig/20251126-133709-marostegui.json
- 13:37 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1253.eqiad.wmnet with reason: Maintenance
- 13:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T410531)', diff saved to https://phabricator.wikimedia.org/P85751 and previous config saved to /var/cache/conftool/dbconfig/20251126-133645-marostegui.json
- 13:35 dreamyjazz@deploy2002: Started scap sync-world: Backport for Only set $wgGlobalBlockingAutoblockExemptions if GlobalBlocking used (T409915), Set $wgGlobalBlockingAutoblockExemptions (T409915), Follow-up: Set $wgGlobalBlockingAutoblockExemptions (T409915)
- 13:31 dreamyjazz@deploy2002: sync-world failed: <CalledProcessError> Command '['sudo', '-u', 'mwbuilder', '-n', '--', '/usr/bin/scap', 'mwscript', '--no-local-config', '--directory', '/srv/mediawiki-staging', '--user', 'www-data', '--', 'mergeMessageFileList.php', '--wiki=aawiki', '--force-version', '1.46.0-wmf.3', '--list-file', '/srv/mediawiki-staging/wmf-config/extension-list', '--output', '/tmp/tmp.0iG7i2ezfh']' returne
- 13:31 dreamyjazz@deploy2002: Started scap sync-world: Backport for Set $wgGlobalBlockingAutoblockExemptions (T409915), Only set $wgGlobalBlockingAutoblockExemptions if GlobalBlocking used (T409915)
- 13:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 13:26 dreamyjazz@deploy2002: sync-world failed: <CalledProcessError> Command '['sudo', '-u', 'mwbuilder', '-n', '--', '/usr/bin/scap', 'mwscript', '--no-local-config', '--directory', '/srv/mediawiki-staging', '--user', 'www-data', '--', 'mergeMessageFileList.php', '--wiki=aawiki', '--force-version', '1.46.0-wmf.3', '--list-file', '/srv/mediawiki-staging/wmf-config/extension-list', '--output', '/tmp/tmp.S3QSelNe06']' returne
- 13:26 dreamyjazz@deploy2002: Started scap sync-world: Backport for Set $wgGlobalBlockingAutoblockExemptions (T409915)
- 13:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 13:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P85749 and previous config saved to /var/cache/conftool/dbconfig/20251126-132138-marostegui.json
- 13:20 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s8 codfw T411088', diff saved to https://phabricator.wikimedia.org/P85748 and previous config saved to /var/cache/conftool/dbconfig/20251126-132039-marostegui.json
- 13:20 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s7 codfw T411088', diff saved to https://phabricator.wikimedia.org/P85747 and previous config saved to /var/cache/conftool/dbconfig/20251126-132023-marostegui.json
- 13:20 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s6 codfw T411088', diff saved to https://phabricator.wikimedia.org/P85746 and previous config saved to /var/cache/conftool/dbconfig/20251126-132006-marostegui.json
- 13:19 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s5 codfw T411088', diff saved to https://phabricator.wikimedia.org/P85745 and previous config saved to /var/cache/conftool/dbconfig/20251126-131945-marostegui.json
- 13:19 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s4 codfw T411088', diff saved to https://phabricator.wikimedia.org/P85744 and previous config saved to /var/cache/conftool/dbconfig/20251126-131926-marostegui.json
- 13:18 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s2 codfw T411088', diff saved to https://phabricator.wikimedia.org/P85743 and previous config saved to /var/cache/conftool/dbconfig/20251126-131844-marostegui.json
- 13:18 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s1 codfw T411088', diff saved to https://phabricator.wikimedia.org/P85742 and previous config saved to /var/cache/conftool/dbconfig/20251126-131822-marostegui.json
- 13:18 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s3 codfw T411088', diff saved to https://phabricator.wikimedia.org/P85741 and previous config saved to /var/cache/conftool/dbconfig/20251126-131803-marostegui.json
- 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in x3 codfw T408663', diff saved to https://phabricator.wikimedia.org/P85740 and previous config saved to /var/cache/conftool/dbconfig/20251126-131606-marostegui.json
- 13:15 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s7 and s8 codfw T408663', diff saved to https://phabricator.wikimedia.org/P85739 and previous config saved to /var/cache/conftool/dbconfig/20251126-131512-marostegui.json
- 13:13 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s6 codfw T408663', diff saved to https://phabricator.wikimedia.org/P85738 and previous config saved to /var/cache/conftool/dbconfig/20251126-131304-marostegui.json
- 13:11 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s5 codfw T408663', diff saved to https://phabricator.wikimedia.org/P85736 and previous config saved to /var/cache/conftool/dbconfig/20251126-131110-marostegui.json
- 13:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2166 gradually with 4 steps - Repooling
- 13:10 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s4 codfw T408663', diff saved to https://phabricator.wikimedia.org/P85735 and previous config saved to /var/cache/conftool/dbconfig/20251126-131018-marostegui.json
- 13:08 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s2 codfw T408663', diff saved to https://phabricator.wikimedia.org/P85734 and previous config saved to /var/cache/conftool/dbconfig/20251126-130856-marostegui.json
- 13:07 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s1 codfw T408663', diff saved to https://phabricator.wikimedia.org/P85733 and previous config saved to /var/cache/conftool/dbconfig/20251126-130757-marostegui.json
- 13:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P85731 and previous config saved to /var/cache/conftool/dbconfig/20251126-130630-marostegui.json
- 13:06 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s3 codfw T408663', diff saved to https://phabricator.wikimedia.org/P85730 and previous config saved to /var/cache/conftool/dbconfig/20251126-130620-marostegui.json
- 13:02 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s8 T411088', diff saved to https://phabricator.wikimedia.org/P85729 and previous config saved to /var/cache/conftool/dbconfig/20251126-130255-marostegui.json
- 13:02 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s7 T411088', diff saved to https://phabricator.wikimedia.org/P85728 and previous config saved to /var/cache/conftool/dbconfig/20251126-130237-marostegui.json
- 13:02 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s6 T411088', diff saved to https://phabricator.wikimedia.org/P85727 and previous config saved to /var/cache/conftool/dbconfig/20251126-130220-marostegui.json
- 13:02 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s5 T411088', diff saved to https://phabricator.wikimedia.org/P85726 and previous config saved to /var/cache/conftool/dbconfig/20251126-130202-marostegui.json
- 12:53 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 12:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 12:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 12:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T410531)', diff saved to https://phabricator.wikimedia.org/P85725 and previous config saved to /var/cache/conftool/dbconfig/20251126-125049-marostegui.json
- 12:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1231 (T410531)', diff saved to https://phabricator.wikimedia.org/P85724 and previous config saved to /var/cache/conftool/dbconfig/20251126-124838-marostegui.json
- 12:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
- 12:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1231.eqiad.wmnet with reason: Maintenance
- 12:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T410531)', diff saved to https://phabricator.wikimedia.org/P85723 and previous config saved to /var/cache/conftool/dbconfig/20251126-124815-marostegui.json
- 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
- 12:46 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s4 T411088', diff saved to https://phabricator.wikimedia.org/P85722 and previous config saved to /var/cache/conftool/dbconfig/20251126-124609-marostegui.json
- 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
- 12:44 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s2 T411088', diff saved to https://phabricator.wikimedia.org/P85721 and previous config saved to /var/cache/conftool/dbconfig/20251126-124441-marostegui.json
- 12:43 root@cumin2002: DONE (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for backup2014.codfw.wmnet: Renew puppet certificate - root@cumin2002
- 12:38 root@cumin2002: DONE (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for backup2014.codfw.wmnet: Renew puppet certificate - root@cumin2002
- 12:35 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 12:35 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 12:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P85720 and previous config saved to /var/cache/conftool/dbconfig/20251126-123307-marostegui.json
- 12:31 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s1 T411088', diff saved to https://phabricator.wikimedia.org/P85719 and previous config saved to /var/cache/conftool/dbconfig/20251126-123131-marostegui.json
- 12:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
- 12:27 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 29357
- 12:27 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
- 12:27 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 29357
- 12:27 marostegui@cumin1003: dbctl commit (dc=all): 'Remove vslow/dump from s3 T411088', diff saved to https://phabricator.wikimedia.org/P85717 and previous config saved to /var/cache/conftool/dbconfig/20251126-122703-marostegui.json
- 12:22 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 12:21 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 12:20 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
- 12:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P85716 and previous config saved to /var/cache/conftool/dbconfig/20251126-121759-marostegui.json
- 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-all
- 12:12 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
- 12:10 root@cumin2002: DONE (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for backup2014.codfw.wmnet: Renew puppet certificate - root@cumin2002
- 12:09 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 12:09 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
- 12:06 claime: Starting kafka-main rebalance with 30MB/s throttle - T407185
- 12:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T410531)', diff saved to https://phabricator.wikimedia.org/P85713 and previous config saved to /var/cache/conftool/dbconfig/20251126-120252-marostegui.json
- 12:02 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-all
- 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wcqs-public
- 11:59 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wcqs-public
- 11:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T410531)', diff saved to https://phabricator.wikimedia.org/P85712 and previous config saved to /var/cache/conftool/dbconfig/20251126-115739-marostegui.json
- 11:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 11:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T410531)', diff saved to https://phabricator.wikimedia.org/P85711 and previous config saved to /var/cache/conftool/dbconfig/20251126-115726-marostegui.json
- 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe-eqiad
- 11:52 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe-eqiad
- 11:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P85710 and previous config saved to /var/cache/conftool/dbconfig/20251126-114218-marostegui.json
- 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe-codfw
- 11:37 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe-codfw
- 11:33 moritzm: installing libxslt security updates
- 11:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P85709 and previous config saved to /var/cache/conftool/dbconfig/20251126-112710-marostegui.json
- 11:26 ryankemper@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T410573
- 11:24 jynus@cumin2002: dbctl commit (dc=all): 'Depool db2166, perf issue', diff saved to https://phabricator.wikimedia.org/P85708 and previous config saved to /var/cache/conftool/dbconfig/20251126-112422-jynus.json
- 11:21 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 11:21 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 11:12 jynus@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2014.codfw.wmnet with reason: upgrade and restart
- 11:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T410531)', diff saved to https://phabricator.wikimedia.org/P85706 and previous config saved to /var/cache/conftool/dbconfig/20251126-111203-marostegui.json
- 11:10 bwojtowicz@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 11:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1202 (T410531)', diff saved to https://phabricator.wikimedia.org/P85705 and previous config saved to /var/cache/conftool/dbconfig/20251126-110951-marostegui.json
- 11:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1202.eqiad.wmnet with reason: Maintenance
- 11:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T410531)', diff saved to https://phabricator.wikimedia.org/P85704 and previous config saved to /var/cache/conftool/dbconfig/20251126-110928-marostegui.json
- 11:09 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 11:06 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 10:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P85702 and previous config saved to /var/cache/conftool/dbconfig/20251126-105420-marostegui.json
- 10:42 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
- 10:42 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: sync
- 10:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P85701 and previous config saved to /var/cache/conftool/dbconfig/20251126-103913-marostegui.json
- 10:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T410531)', diff saved to https://phabricator.wikimedia.org/P85700 and previous config saved to /var/cache/conftool/dbconfig/20251126-102405-marostegui.json
- 10:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1194 (T410531)', diff saved to https://phabricator.wikimedia.org/P85699 and previous config saved to /var/cache/conftool/dbconfig/20251126-102153-marostegui.json
- 10:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1194.eqiad.wmnet with reason: Maintenance
- 10:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T410531)', diff saved to https://phabricator.wikimedia.org/P85698 and previous config saved to /var/cache/conftool/dbconfig/20251126-102130-marostegui.json
- 10:14 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
- 10:13 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
- 10:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 10:08 eileen: civicrm upgraded from e7df633d to 96715bf8
- 10:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 10:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P85697 and previous config saved to /var/cache/conftool/dbconfig/20251126-100623-marostegui.json
- 10:05 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
- 10:04 elukey@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
- 09:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 09:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/ferretdb-growthbook-next: apply
- 09:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/ferretdb-growthbook-next: apply
- 09:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/ferretdb-growthbook-next: apply
- 09:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/ferretdb-growthbook-next: apply
- 09:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P85696 and previous config saved to /var/cache/conftool/dbconfig/20251126-095115-marostegui.json
- 09:48 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
- 09:47 phuedx@deploy2002: Finished scap sync-world: Backport for Hooks: Only add global logging context for pageviews (T409965 T411074) (duration: 13m 29s)
- 09:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 09:42 phuedx@deploy2002: phuedx: Continuing with sync
- 09:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 09:38 elukey@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
- 09:37 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
- 09:36 phuedx@deploy2002: phuedx: Backport for Hooks: Only add global logging context for pageviews (T409965 T411074) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T410531)', diff saved to https://phabricator.wikimedia.org/P85695 and previous config saved to /var/cache/conftool/dbconfig/20251126-093607-marostegui.json
- 09:34 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
- 09:34 phuedx@deploy2002: Started scap sync-world: Backport for Hooks: Only add global logging context for pageviews (T409965 T411074)
- 09:33 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1191 (T410531)', diff saved to https://phabricator.wikimedia.org/P85694 and previous config saved to /var/cache/conftool/dbconfig/20251126-093356-marostegui.json
- 09:33 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1191.eqiad.wmnet with reason: Maintenance
- 09:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T410531)', diff saved to https://phabricator.wikimedia.org/P85693 and previous config saved to /var/cache/conftool/dbconfig/20251126-093332-marostegui.json
- 09:32 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
- 09:31 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
- 09:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P85692 and previous config saved to /var/cache/conftool/dbconfig/20251126-091825-marostegui.json
- 09:18 vgutierrez: repool cp7001
- 09:13 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 09:10 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.4 refs T408274
- 09:08 vgutierrez: depool cp7001
- 09:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P85691 and previous config saved to /var/cache/conftool/dbconfig/20251126-090317-marostegui.json
- 08:57 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7001.*
- 08:57 fabfur: repooling cp7001 (T406545)
- 08:54 elukey: `elukey@cumin1003:~$ sudo cumin 'thanos-fe*' 'systemctl restart swift-proxy' -b 1 -s 30` - Restart swift proxies to pick up the new tegola_staging account
- 08:53 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s8 T408663', diff saved to https://phabricator.wikimedia.org/P85690 and previous config saved to /var/cache/conftool/dbconfig/20251126-085344-marostegui.json
- 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s7 T408663', diff saved to https://phabricator.wikimedia.org/P85689 and previous config saved to /var/cache/conftool/dbconfig/20251126-085232-marostegui.json
- 08:51 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 08:50 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
- 08:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T410531)', diff saved to https://phabricator.wikimedia.org/P85688 and previous config saved to /var/cache/conftool/dbconfig/20251126-084810-marostegui.json
- 08:48 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s6 T408663', diff saved to https://phabricator.wikimedia.org/P85687 and previous config saved to /var/cache/conftool/dbconfig/20251126-084758-marostegui.json
- 08:47 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 08:46 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s5 T408663', diff saved to https://phabricator.wikimedia.org/P85686 and previous config saved to /var/cache/conftool/dbconfig/20251126-084635-marostegui.json
- 08:46 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
- 08:45 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1181 (T410531)', diff saved to https://phabricator.wikimedia.org/P85685 and previous config saved to /var/cache/conftool/dbconfig/20251126-084510-marostegui.json
- 08:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 08:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T410531)', diff saved to https://phabricator.wikimedia.org/P85684 and previous config saved to /var/cache/conftool/dbconfig/20251126-084447-marostegui.json
- 08:41 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7001.*
- 08:41 fabfur: depooling cp7001 to test known-client feature (T406545)
- 08:35 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s4 T408663', diff saved to https://phabricator.wikimedia.org/P85683 and previous config saved to /var/cache/conftool/dbconfig/20251126-083533-marostegui.json
- 08:35 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s4 T408663', diff saved to https://phabricator.wikimedia.org/P85682 and previous config saved to /var/cache/conftool/dbconfig/20251126-083511-marostegui.json
- 08:32 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
- 08:31 jmm@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
- 08:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P85681 and previous config saved to /var/cache/conftool/dbconfig/20251126-082939-marostegui.json
- 08:15 bvibber@deploy2002: Finished scap sync-world: Backport for mediawiki.util: Add adjustThumbWidthForSteps for step sizing in JS (T411013), Respect wgThumbnailSteps when generating thumbs (T411013), mediawiki.util: Add adjustThumbWidthForSteps for step sizing in JS (T411013), Respect wgThumbnailSteps when generating thumbs (T411013) (duration: 07
- 08:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P85680 and previous config saved to /var/cache/conftool/dbconfig/20251126-081431-marostegui.json
- 08:13 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
- 08:13 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
- 08:11 bvibber@deploy2002: bvibber: Continuing with sync
- 08:10 bvibber@deploy2002: bvibber: Backport for mediawiki.util: Add adjustThumbWidthForSteps for step sizing in JS (T411013), Respect wgThumbnailSteps when generating thumbs (T411013), mediawiki.util: Add adjustThumbWidthForSteps for step sizing in JS (T411013), Respect wgThumbnailSteps when generating thumbs (T411013) synced to the testservers (see
- 08:08 bvibber@deploy2002: Started scap sync-world: Backport for mediawiki.util: Add adjustThumbWidthForSteps for step sizing in JS (T411013), Respect wgThumbnailSteps when generating thumbs (T411013), mediawiki.util: Add adjustThumbWidthForSteps for step sizing in JS (T411013), Respect wgThumbnailSteps when generating thumbs (T411013)
- 07:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T410531)', diff saved to https://phabricator.wikimedia.org/P85679 and previous config saved to /var/cache/conftool/dbconfig/20251126-075924-marostegui.json
- 07:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (T410531)', diff saved to https://phabricator.wikimedia.org/P85678 and previous config saved to /var/cache/conftool/dbconfig/20251126-075807-marostegui.json
- 07:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 07:56 brouberol@dns1004: END - running authdns-update
- 07:55 brouberol@dns1004: START - running authdns-update
- 07:55 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 07:55 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 07:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 07:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 07:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T410531)', diff saved to https://phabricator.wikimedia.org/P85677 and previous config saved to /var/cache/conftool/dbconfig/20251126-075345-marostegui.json
- 07:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 07:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P85676 and previous config saved to /var/cache/conftool/dbconfig/20251126-073837-marostegui.json
- 07:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P85675 and previous config saved to /var/cache/conftool/dbconfig/20251126-072330-marostegui.json
- 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s1 T408663', diff saved to https://phabricator.wikimedia.org/P85674 and previous config saved to /var/cache/conftool/dbconfig/20251126-072141-marostegui.json
- 07:20 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s2 T408663', diff saved to https://phabricator.wikimedia.org/P85673 and previous config saved to /var/cache/conftool/dbconfig/20251126-072038-marostegui.json
- 07:19 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in s3 T408663', diff saved to https://phabricator.wikimedia.org/P85672 and previous config saved to /var/cache/conftool/dbconfig/20251126-071947-marostegui.json
- 07:18 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in x3 T408663', diff saved to https://phabricator.wikimedia.org/P85671 and previous config saved to /var/cache/conftool/dbconfig/20251126-071857-marostegui.json
- 07:18 marostegui@cumin1003: dbctl commit (dc=all): 'Unify weights in x1 T408663', diff saved to https://phabricator.wikimedia.org/P85670 and previous config saved to /var/cache/conftool/dbconfig/20251126-071815-marostegui.json
- 07:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T410531)', diff saved to https://phabricator.wikimedia.org/P85669 and previous config saved to /var/cache/conftool/dbconfig/20251126-070822-marostegui.json
- 07:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (T410531)', diff saved to https://phabricator.wikimedia.org/P85668 and previous config saved to /var/cache/conftool/dbconfig/20251126-070243-marostegui.json
- 07:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 07:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T410531)', diff saved to https://phabricator.wikimedia.org/P85667 and previous config saved to /var/cache/conftool/dbconfig/20251126-070219-marostegui.json
- 06:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1188 (T410589)', diff saved to https://phabricator.wikimedia.org/P85666 and previous config saved to /var/cache/conftool/dbconfig/20251126-065154-ladsgroup.json
- 06:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 06:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T410589)', diff saved to https://phabricator.wikimedia.org/P85665 and previous config saved to /var/cache/conftool/dbconfig/20251126-065131-ladsgroup.json
- 06:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P85664 and previous config saved to /var/cache/conftool/dbconfig/20251126-064712-marostegui.json
- 06:42 moritzm: upgrade Envoy on puppetboard* T405808
- 06:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P85663 and previous config saved to /var/cache/conftool/dbconfig/20251126-063624-ladsgroup.json
- 06:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P85662 and previous config saved to /var/cache/conftool/dbconfig/20251126-063204-marostegui.json
- 06:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P85661 and previous config saved to /var/cache/conftool/dbconfig/20251126-062116-ladsgroup.json
- 06:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T410531)', diff saved to https://phabricator.wikimedia.org/P85660 and previous config saved to /var/cache/conftool/dbconfig/20251126-061656-marostegui.json
- 06:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (T410531)', diff saved to https://phabricator.wikimedia.org/P85659 and previous config saved to /var/cache/conftool/dbconfig/20251126-061445-marostegui.json
- 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 06:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2192.codfw.wmnet with reason: Maintenance
- 06:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T410589)', diff saved to https://phabricator.wikimedia.org/P85658 and previous config saved to /var/cache/conftool/dbconfig/20251126-060609-ladsgroup.json
- 04:26 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T410573
- 02:14 ejegg: fundraising python tools upgraded from 7efd562a to 8e900e85
- 01:22 rzl@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 01:21 rzl@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 01:21 rzl@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 01:20 rzl@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 01:19 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
- 01:19 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
- 01:17 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 01:17 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 01:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
- 01:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
- 01:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
- 01:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/toolhub: apply
- 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 12m 55s)
- 01:13 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/termbox: apply
- 01:12 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/termbox: apply
- 01:12 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
- 01:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
- 01:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 01:10 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 01:09 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
- 01:09 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
- 01:07 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 01:07 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 01:06 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
- 01:06 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
- 01:05 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 01:05 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 01:05 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 01:04 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 01:03 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
- 01:03 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
- 01:02 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
- 01:02 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
- 01:01 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
- 01:01 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
- 01:01 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
- 01:00 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 01:00 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 00:59 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
- 00:58 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
- 00:58 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
- 00:57 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 00:57 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 00:56 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
- 00:55 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
- 00:55 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 00:53 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 00:52 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
- 00:52 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
- 00:51 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
- 00:50 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mathoid: apply
- 00:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
- 00:48 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
- 00:48 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 00:47 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 00:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/image-suggestion: apply
- 00:47 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/image-suggestion: apply
- 00:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
- 00:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
- 00:45 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
- 00:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
- 00:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
- 00:43 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
- 00:42 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
- 00:42 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
- 00:41 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
- 00:40 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
- 00:40 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
- 00:39 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
- 00:39 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
- 00:38 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
- 00:38 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
- 00:37 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
- 00:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
- 00:36 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
- 00:36 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/echostore: apply
- 00:35 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/echostore: apply
- 00:34 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
- 00:34 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
- 00:33 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 00:33 denisse: Upgrading envoy on titan hosts - T405808
- 00:33 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 00:33 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
- 00:33 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
- 00:32 denisse: Upgrading envoy on 'titan1001.eqiad.wmnet' - T405808
- 00:32 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/commons-impact-analytics: apply
- 00:32 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/commons-impact-analytics: apply
- 00:31 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
- 00:31 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
- 00:30 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
- 00:30 denisse: Upgrading envoy on logstash hosts - T405808
- 00:30 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
- 00:30 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 00:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 00:29 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
- 00:28 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop: apply
- 00:28 denisse: Upgrading envoy on 'logstash1023.eqiad.wmnet' - T405808
- 00:26 denisse: Upgrading envoy on Graphite hosts - T405808
- 00:24 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
- 00:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/apertium: apply
- 00:24 denisse: Upgrading envoy on prometheus::pop hosts - T405808
- 00:23 denisse: Upgrading envoy on prometheus hosts - T405808
- 00:22 swfrench@deploy2002: Finished scap sync-world: Helmfile-only deployment to clean up migration overrides - T405955 (duration: 04m 10s)
- 00:20 denisse: Upgrading envoy on prometheus1005.eqiad.wmnet - T405808
- 00:20 denisse: Upgrading envoy on Grafana hosts - T405808
- 00:19 swfrench@deploy2002: Started scap sync-world: Helmfile-only deployment to clean up migration overrides - T405955
- 00:12 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
- 00:12 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/zotero: apply
- 00:11 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 00:10 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 00:10 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
- 00:09 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
- 00:08 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
- 00:07 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
- 00:07 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
- 00:07 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/termbox: apply
- 00:06 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
- 00:05 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
- 00:05 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
- 00:05 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
- 00:04 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
- 00:04 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
- 00:02 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 00:02 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 00:02 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
- 00:01 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
- 00:01 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
- 00:01 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
2025-11-25
- 23:59 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
- 23:58 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
- 23:57 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
- 23:57 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
- 23:57 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
- 23:56 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
- 23:54 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
- 23:54 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
- 23:53 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
- 23:53 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
- 23:53 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 23:51 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
- 23:51 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
- 23:51 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
- 23:50 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 23:50 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
- 23:49 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
- 23:49 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
- 23:48 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 23:47 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 23:45 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
- 23:45 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
- 23:44 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
- 23:44 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
- 23:43 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1005.eqiad.wmnet with OS bookworm
- 23:38 musikanimal@deploy2002: Finished scap sync-world: Backport for Revert "[metawiki] enable voting on entities with the 'Under review' status" (duration: 06m 44s)
- 23:34 musikanimal@deploy2002: musikanimal: Continuing with sync
- 23:33 musikanimal@deploy2002: musikanimal: Backport for Revert "[metawiki] enable voting on entities with the 'Under review' status" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:31 musikanimal@deploy2002: Started scap sync-world: Backport for Revert "[metawiki] enable voting on entities with the 'Under review' status"
- 23:27 musikanimal@deploy2002: Finished scap sync-world: Backport for [mediawikiwiki] Enable CommunityRequests with translations only (T405694) (duration: 08m 14s)
- 23:23 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1005.eqiad.wmnet with reason: host reimage
- 23:22 musikanimal@deploy2002: musikanimal: Continuing with sync
- 23:21 musikanimal@deploy2002: musikanimal: Backport for [mediawikiwiki] Enable CommunityRequests with translations only (T405694) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:19 musikanimal@deploy2002: Started scap sync-world: Backport for [mediawikiwiki] Enable CommunityRequests with translations only (T405694)
- 23:18 jhathaway@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1005.eqiad.wmnet with reason: host reimage
- 23:18 musikanimal@deploy2002: Sync cancelled.
- 23:06 musikanimal@deploy2002: musikanimal: Backport for [mediawikiwiki] Enable CommunityRequests with translations only (T405694), [metawiki] enable voting on entities with the 'Under review' status (T409613) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:04 musikanimal@deploy2002: Started scap sync-world: Backport for [mediawikiwiki] Enable CommunityRequests with translations only (T405694), [metawiki] enable voting on entities with the 'Under review' status (T409613)
- 22:59 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 22:58 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
- 22:51 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1005.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 22:44 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host sretest1005.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 22:42 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
- 22:41 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 22:39 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 22:38 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: apply
- 22:37 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: apply
- 22:36 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 22:36 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 22:35 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/image-suggestion: apply
- 22:35 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/image-suggestion: apply
- 22:35 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
- 22:34 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
- 22:34 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
- 22:33 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
- 22:32 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
- 22:31 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
- 22:31 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
- 22:30 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
- 22:30 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
- 22:30 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
- 22:29 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
- 22:29 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
- 22:29 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
- 22:28 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
- 22:28 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
- 22:28 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
- 22:27 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
- 22:27 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
- 22:26 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
- 22:25 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/echostore: apply
- 22:25 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
- 22:25 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
- 22:24 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 22:24 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 22:24 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
- 22:23 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
- 22:22 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/commons-impact-analytics: apply
- 22:22 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/commons-impact-analytics: apply
- 22:21 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 22:21 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 22:20 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
- 22:19 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
- 22:19 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 22:18 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 22:18 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
- 22:18 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
- 22:16 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
- 22:16 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/apertium: apply
- 22:09 eileen: civicrm upgraded from 130b07a1 to e7df633d
- 21:59 urbanecm@deploy2002: Finished scap sync-world: Backport for Fix cache expiration time for parsoid usage (T408741) (duration: 12m 58s)
- 21:54 urbanecm@deploy2002: urbanecm: Continuing with sync
- 21:48 urbanecm@deploy2002: urbanecm: Backport for Fix cache expiration time for parsoid usage (T408741) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:46 urbanecm@deploy2002: Started scap sync-world: Backport for Fix cache expiration time for parsoid usage (T408741)
- 21:33 urbanecm@deploy2002: Finished scap sync-world: Backport for Do not add IPInfo buttons when there is no mw-data-target (T410988), hCaptcha: Include AlwaysChallengeSiteKey in list of valid keys (T410863), hCaptcha: Include AlwaysChallengeSiteKey in list of valid keys (T410863), Fix cache expiration time for parsoid usage (T408741) (duration: 15m
- 21:28 urbanecm@deploy2002: kharlan, tchanders, cscott, urbanecm: Continuing with sync
- 21:19 urbanecm@deploy2002: kharlan, tchanders, cscott, urbanecm: Backport for Do not add IPInfo buttons when there is no mw-data-target (T410988), hCaptcha: Include AlwaysChallengeSiteKey in list of valid keys (T410863), hCaptcha: Include AlwaysChallengeSiteKey in list of valid keys (T410863), Fix cache expiration time for parsoid usage (T408741) sy
- 21:17 urbanecm@deploy2002: Started scap sync-world: Backport for Do not add IPInfo buttons when there is no mw-data-target (T410988), hCaptcha: Include AlwaysChallengeSiteKey in list of valid keys (T410863), hCaptcha: Include AlwaysChallengeSiteKey in list of valid keys (T410863), Fix cache expiration time for parsoid usage (T408741)
- 21:17 urbanecm@deploy2002: Finished scap sync-world: Backport for Deploy 2025 Global Readers Survey (T410696) (duration: 08m 39s)
- 21:11 urbanecm@deploy2002: dani, urbanecm: Continuing with sync
- 21:10 urbanecm@deploy2002: dani, urbanecm: Backport for Deploy 2025 Global Readers Survey (T410696) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:08 urbanecm@deploy2002: Started scap sync-world: Backport for Deploy 2025 Global Readers Survey (T410696)
- 20:38 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 20:36 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 20:26 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 20:26 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 20:19 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 20:17 jhathaway@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest1005.eqiad.wmnet']
- 20:08 jhathaway@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1005.eqiad.wmnet']
- 20:05 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['sretest1005.eqiad.wmnet']
- 19:59 jhathaway@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1005.eqiad.wmnet']
- 19:58 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest1005.eqiad.wmnet']
- 19:58 jhathaway@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1005.eqiad.wmnet']
- 19:56 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest1005.eqiad.wmnet']
- 19:53 jhathaway@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1005.eqiad.wmnet']
- 19:52 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 19:44 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 19:28 jhathaway@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1005.eqiad.wmnet with OS bookworm
- 19:22 cstone: payments-wiki upgraded from 5afaf7c8 to f254a5ab
- 19:22 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 58717
- 19:21 jgleeson: civicrm upgraded from bc6236d9 to 130b07a1
- 19:21 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 58717
- 19:07 urandom: upgrading restbase cluster to envoyproxy v1.32.12 — T405808
- 18:43 swfrench@deploy2002: Stopping before sync operations
- 18:42 swfrench@deploy2002: Started scap sync-world: Stop building PHP 8.1 images - T405955
- 18:41 urandom: upgrading envoyproxy to v1.32.12, restbase1031 & restbase2024 - T405808
- 18:30 swfrench@deploy2002: Finished scap sync-world: Switch mw-debug/pinkunicorn to PHP 8.3 - T405955 (duration: 02m 54s)
- 18:27 swfrench@deploy2002: Started scap sync-world: Switch mw-debug/pinkunicorn to PHP 8.3 - T405955
- 18:19 mutante: deploying Phabricator config change
- 18:08 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 18:07 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 18:05 swfrench@deploy2002: Stopping before sync operations
- 18:04 swfrench@deploy2002: Started scap sync-world: No-deployment scap run to switch mw-script/main to PHP 8.3 - T405955
- 18:02 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 18:01 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 17:57 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 17:57 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 17:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T410531)', diff saved to https://phabricator.wikimedia.org/P85657 and previous config saved to /var/cache/conftool/dbconfig/20251125-174209-marostegui.json
- 17:41 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 17:41 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 17:36 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 17:36 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P85656 and previous config saved to /var/cache/conftool/dbconfig/20251125-172701-marostegui.json
- 17:23 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 17:23 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 17:17 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 17:15 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P85655 and previous config saved to /var/cache/conftool/dbconfig/20251125-171154-marostegui.json
- 17:09 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 17:09 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 17:07 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 17:07 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 17:04 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 17:04 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 16:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T410531)', diff saved to https://phabricator.wikimedia.org/P85654 and previous config saved to /var/cache/conftool/dbconfig/20251125-165645-marostegui.json
- 16:54 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 16:54 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 16:54 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 16:54 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 16:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1187 gradually with 4 steps - Repooling due to T410508
- 16:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2228 (T410531)', diff saved to https://phabricator.wikimedia.org/P85652 and previous config saved to /var/cache/conftool/dbconfig/20251125-165231-marostegui.json
- 16:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2228.codfw.wmnet with reason: Maintenance
- 16:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T410531)', diff saved to https://phabricator.wikimedia.org/P85651 and previous config saved to /var/cache/conftool/dbconfig/20251125-165208-marostegui.json
- 16:50 cgoubert@deploy2002: Unlocked for deployment [MediaWiki]: Depooling wikikube-ctrl1003 (duration: 05m 20s)
- 16:50 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
- 16:50 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
- 16:50 ammarpad@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=frwiki --logwiki=metawiki 'Ask Mona' Ch2025 # T411033
- 16:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on wikikube-ctrl1003.eqiad.wmnet with reason: C/D Migration
- 16:46 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
- 16:46 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
- 16:46 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 16:46 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 16:45 cgoubert@deploy2002: Locking from deployment [MediaWiki]: Depooling wikikube-ctrl1003
- 16:44 jforrester@deploy2002: Finished scap sync-world: Backport for Select zid after highest if latest zid insertion is taken (T410895) (duration: 12m 07s)
- 16:43 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 16:42 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 16:39 jforrester@deploy2002: jforrester: Continuing with sync
- 16:37 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 16:37 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 16:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P85649 and previous config saved to /var/cache/conftool/dbconfig/20251125-163700-marostegui.json
- 16:37 jforrester@deploy2002: jforrester: Backport for Select zid after highest if latest zid insertion is taken (T410895) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:32 jforrester@deploy2002: Started scap sync-world: Backport for Select zid after highest if latest zid insertion is taken (T410895)
- 16:28 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 16:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
- 16:26 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 16:26 jnuche@deploy2002: Finished scap sync-world: Backport for Add the full set of post-processing options to the ParserOptions array (T411017) (duration: 08m 42s)
- 16:24 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 16:23 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1005.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 16:22 jnuche@deploy2002: jnuche: Continuing with sync
- 16:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P85646 and previous config saved to /var/cache/conftool/dbconfig/20251125-162152-marostegui.json
- 16:21 jnuche@deploy2002: jnuche: Backport for Add the full set of post-processing options to the ParserOptions array (T411017) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
- 16:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) maps2011.codfw.wmnet on all recursors
- 16:20 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache maps2011.codfw.wmnet on all recursors
- 16:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps2011 - ayounsi@cumin1003"
- 16:19 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps2011 - ayounsi@cumin1003"
- 16:17 jnuche@deploy2002: Started scap sync-world: Backport for Add the full set of post-processing options to the ParserOptions array (T411017)
- 16:16 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on sretest1005.eqiad.wmnet with reason: sleep test
- 16:16 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host sretest1005.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 16:16 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on kafka-main1009.eqiad.wmnet with reason: C/D Migration
- 16:15 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 16:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
- 16:12 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 16:12 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 16:10 vgutierrez: repool cp7001
- 16:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) maps2012.codfw.wmnet on all recursors
- 16:09 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache maps2012.codfw.wmnet on all recursors
- 16:09 moritzm: installing glibc security updates
- 16:08 fceratto@cumin1003: START - Cookbook sre.mysql.pool db1187 gradually with 4 steps - Repooling due to T410508
- 16:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
- 16:07 aikochou@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 16:07 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 16:07 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 16:06 claime: Evicting partition leadership from kafka-main1009 - T405950
- 16:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T410531)', diff saved to https://phabricator.wikimedia.org/P85644 and previous config saved to /var/cache/conftool/dbconfig/20251125-160646-marostegui.json
- 16:05 aikochou@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 16:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 16:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2223 (T410531)', diff saved to https://phabricator.wikimedia.org/P85643 and previous config saved to /var/cache/conftool/dbconfig/20251125-160208-marostegui.json
- 16:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2223.codfw.wmnet with reason: Maintenance
- 16:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T410531)', diff saved to https://phabricator.wikimedia.org/P85642 and previous config saved to /var/cache/conftool/dbconfig/20251125-160155-marostegui.json
- 16:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps2012 - ayounsi@cumin1003"
- 16:01 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps2012 - ayounsi@cumin1003"
- 15:57 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 15:55 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on kafka-main1008.eqiad.wmnet with reason: C/D Migration
- 15:52 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1006.eqiad.wmnet with OS trixie
- 15:51 vgutierrez: depool cp7001 (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1211060)
- 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
- 15:50 claime: Evicting partition leadership from kafka-main1008 - T405950
- 15:47 Dreamy_Jazz: Deploying private code change for T410280
- 15:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P85641 and previous config saved to /var/cache/conftool/dbconfig/20251125-154647-marostegui.json
- 15:45 moritzm: added Blake to pwstore
- 15:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
- 15:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1182 (T410589)', diff saved to https://phabricator.wikimedia.org/P85640 and previous config saved to /var/cache/conftool/dbconfig/20251125-154449-ladsgroup.json
- 15:44 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 15:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T410589)', diff saved to https://phabricator.wikimedia.org/P85639 and previous config saved to /var/cache/conftool/dbconfig/20251125-154426-ladsgroup.json
- 15:42 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) maps2013.codfw.wmnet on all recursors
- 15:42 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache maps2013.codfw.wmnet on all recursors
- 15:42 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:42 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps2013 - ayounsi@cumin1003"
- 15:42 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps2013 - ayounsi@cumin1003"
- 15:39 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 15:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) maps2014.codfw.wmnet on all recursors
- 15:36 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache maps2014.codfw.wmnet on all recursors
- 15:34 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1006.eqiad.wmnet with reason: host reimage
- 15:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
- 15:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P85638 and previous config saved to /var/cache/conftool/dbconfig/20251125-153140-marostegui.json
- 15:31 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 15:31 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1006.eqiad.wmnet with reason: host reimage
- 15:29 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 15:29 marostegui: Add clouddb1023 (s3,x3) to zarcillo T409557
- 15:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P85637 and previous config saved to /var/cache/conftool/dbconfig/20251125-152918-ladsgroup.json
- 15:26 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 15:26 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 15:26 jgiannelos@deploy2002: Finished scap sync-world: Backport for Allow proofread page to use parsoid when parsoid render is requested (T278481) (duration: 10m 09s)
- 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
- 15:23 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:23 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps2014 - ayounsi@cumin1003"
- 15:23 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps2014 - ayounsi@cumin1003"
- 15:22 jgiannelos@deploy2002: jgiannelos: Continuing with sync
- 15:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1187 (T299441)', diff saved to https://phabricator.wikimedia.org/P85636 and previous config saved to /var/cache/conftool/dbconfig/20251125-152031-fceratto.json
- 15:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 15:20 jgiannelos@deploy2002: jgiannelos: Backport for Allow proofread page to use parsoid when parsoid render is requested (T278481) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:18 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 15:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T410531)', diff saved to https://phabricator.wikimedia.org/P85634 and previous config saved to /var/cache/conftool/dbconfig/20251125-151632-marostegui.json
- 15:16 jgiannelos@deploy2002: Started scap sync-world: Backport for Allow proofread page to use parsoid when parsoid render is requested (T278481)
- 15:14 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
- 15:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P85633 and previous config saved to /var/cache/conftool/dbconfig/20251125-151411-ladsgroup.json
- 15:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest1006.eqiad.wmnet on all recursors
- 15:13 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache sretest1006.eqiad.wmnet on all recursors
- 15:13 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:13 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change sretest1006 IPs - cmooney@cumin1003"
- 15:12 vgutierrez: re-pool cp7001
- 15:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2213 (T410531)', diff saved to https://phabricator.wikimedia.org/P85632 and previous config saved to /var/cache/conftool/dbconfig/20251125-151156-marostegui.json
- 15:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2213.codfw.wmnet with reason: Maintenance
- 15:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T410531)', diff saved to https://phabricator.wikimedia.org/P85631 and previous config saved to /var/cache/conftool/dbconfig/20251125-151143-marostegui.json
- 15:10 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest1006.eqiad.wmnet on all recursors
- 15:10 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache sretest1006.eqiad.wmnet on all recursors
- 15:10 dani@deploy2002: Finished scap sync-world: Backport for Pre-deploy 2025 Global Readers Survey (T410696) (duration: 09m 41s)
- 15:10 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change sretest1006 IPs - cmooney@cumin1003"
- 15:06 dani@deploy2002: dani: Continuing with sync
- 15:06 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 15:04 dani@deploy2002: dani: Backport for Pre-deploy 2025 Global Readers Survey (T410696) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:04 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1006.eqiad.wmnet with OS trixie
- 15:03 vgutierrez: depool cp7001 (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1211059)
- 15:00 dani@deploy2002: Started scap sync-world: Backport for Pre-deploy 2025 Global Readers Survey (T410696)
- 15:00 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1187 gradually with 4 steps - Repooling due to T410508, also testing T391581
- 14:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T410589)', diff saved to https://phabricator.wikimedia.org/P85629 and previous config saved to /var/cache/conftool/dbconfig/20251125-145903-ladsgroup.json
- 14:58 dani@deploy2002: Finished scap sync-world: Backport for Revert "Pre-deploy 2025 Global Readers Survey" (duration: 08m 48s)
- 14:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P85628 and previous config saved to /var/cache/conftool/dbconfig/20251125-145636-marostegui.json
- 14:54 dani@deploy2002: dani: Continuing with sync
- 14:54 dani@deploy2002: dani: Backport for Revert "Pre-deploy 2025 Global Readers Survey" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:50 dani@deploy2002: Started scap sync-world: Backport for Revert "Pre-deploy 2025 Global Readers Survey"
- 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P85626 and previous config saved to /var/cache/conftool/dbconfig/20251125-144128-marostegui.json
- 14:36 dani@deploy2002: dani: Continuing with sync
- 14:35 dani@deploy2002: dani: Backport for Pre-deploy 2025 Global Readers Survey (T410696) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:34 moritzm: upgrade Envoy on webperf* T405808
- 14:31 dani@deploy2002: Started scap sync-world: Backport for Pre-deploy 2025 Global Readers Survey (T410696)
- 14:28 tchanders@deploy2002: Finished scap sync-world: Backport for FlowMoveBoardsToSubpages: Skip moves that throw exceptions (T402552), Drop $wgCampaignEventsCountrySchemaMigrationStage (T408932) (duration: 08m 54s)
- 14:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T410531)', diff saved to https://phabricator.wikimedia.org/P85623 and previous config saved to /var/cache/conftool/dbconfig/20251125-142621-marostegui.json
- 14:23 tchanders@deploy2002: daimona, esanders, tchanders: Continuing with sync
- 14:23 tchanders@deploy2002: daimona, esanders, tchanders: Backport for FlowMoveBoardsToSubpages: Skip moves that throw exceptions (T402552), Drop $wgCampaignEventsCountrySchemaMigrationStage (T408932) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:23 stephanebisson: Updated recommendation-api to 2025-11-20-132855-production (T410396, T410387)
- 14:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T410531)', diff saved to https://phabricator.wikimedia.org/P85621 and previous config saved to /var/cache/conftool/dbconfig/20251125-142145-marostegui.json
- 14:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2211.codfw.wmnet with reason: Maintenance
- 14:20 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 14:19 tchanders@deploy2002: Started scap sync-world: Backport for FlowMoveBoardsToSubpages: Skip moves that throw exceptions (T402552), Drop $wgCampaignEventsCountrySchemaMigrationStage (T408932)
- 14:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2201.codfw.wmnet with reason: Maintenance
- 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T410531)', diff saved to https://phabricator.wikimedia.org/P85620 and previous config saved to /var/cache/conftool/dbconfig/20251125-141836-marostegui.json
- 14:18 sbisson@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:15 tchanders@deploy2002: Finished scap sync-world: Backport for CommonSettings: Swap $wgCheckUserGroupRequirements for $wgRestrictedGroups (T409717), Assign 'ignore-restricted-groups' to steward group on metawiki (T409717) (duration: 10m 24s)
- 14:14 fceratto@cumin1003: START - Cookbook sre.mysql.pool db1187 gradually with 4 steps - Repooling due to T410508, also testing T391581
- 14:11 sbisson@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:11 tchanders@deploy2002: reedy, tchanders: Continuing with sync
- 14:09 tchanders@deploy2002: reedy, tchanders: Backport for CommonSettings: Swap $wgCheckUserGroupRequirements for $wgRestrictedGroups (T409717), Assign 'ignore-restricted-groups' to steward group on metawiki (T409717) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1187 (T299441)', diff saved to https://phabricator.wikimedia.org/P85617 and previous config saved to /var/cache/conftool/dbconfig/20251125-140809-fceratto.json
- 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 14:05 tchanders@deploy2002: Started scap sync-world: Backport for CommonSettings: Swap $wgCheckUserGroupRequirements for $wgRestrictedGroups (T409717), Assign 'ignore-restricted-groups' to steward group on metawiki (T409717)
- 14:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P85615 and previous config saved to /var/cache/conftool/dbconfig/20251125-140329-marostegui.json
- 14:02 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 14:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1018.eqiad.wmnet
- 13:57 jynus@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts backup2014.codfw.wmnet
- 13:56 jynus@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts backup2014.codfw.wmnet
- 13:56 sbisson@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1018.eqiad.wmnet
- 13:52 root@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts backup2014.codfw.wmnet
- 13:51 root@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts backup2014.codfw.wmnet
- 13:50 jynus@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2014.codfw.wmnet with reason: bios upgrade
- 13:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P85613 and previous config saved to /var/cache/conftool/dbconfig/20251125-134821-marostegui.json
- 13:43 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
- 13:33 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 13:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T410531)', diff saved to https://phabricator.wikimedia.org/P85612 and previous config saved to /var/cache/conftool/dbconfig/20251125-133314-marostegui.json
- 13:33 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 13:32 cmooney@cumin1003: START - Cookbook sre.hosts.provision for host sretest1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 13:31 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1006.eqiad.wmnet with reason: changing host to uefi mode boot
- 13:29 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2178 (T410531)', diff saved to https://phabricator.wikimedia.org/P85610 and previous config saved to /var/cache/conftool/dbconfig/20251125-132902-marostegui.json
- 13:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 13:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T410531)', diff saved to https://phabricator.wikimedia.org/P85609 and previous config saved to /var/cache/conftool/dbconfig/20251125-132849-marostegui.json
- 13:26 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1030.eqiad.wmnet with OS trixie
- 13:26 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 13:26 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1031.eqiad.wmnet with OS trixie
- 13:26 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 13:16 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 13:16 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 13:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P85606 and previous config saved to /var/cache/conftool/dbconfig/20251125-131341-marostegui.json
- 13:06 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
- 13:05 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
- 13:05 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
- 13:03 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
- 13:01 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 13:01 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 12:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P85605 and previous config saved to /var/cache/conftool/dbconfig/20251125-125834-marostegui.json
- 12:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T410531)', diff saved to https://phabricator.wikimedia.org/P85604 and previous config saved to /var/cache/conftool/dbconfig/20251125-124326-marostegui.json
- 12:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2171 (T410531)', diff saved to https://phabricator.wikimedia.org/P85603 and previous config saved to /var/cache/conftool/dbconfig/20251125-123844-marostegui.json
- 12:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 12:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T410531)', diff saved to https://phabricator.wikimedia.org/P85602 and previous config saved to /var/cache/conftool/dbconfig/20251125-123820-marostegui.json
- 12:34 ladsgroup@deploy2002: Finished scap sync-world: Backport for Revert^4 "rdbms: Dismantle concept of groups" (T405087) (duration: 10m 08s)
- 12:30 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 12:28 ladsgroup@deploy2002: ladsgroup: Backport for Revert^4 "rdbms: Dismantle concept of groups" (T405087) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:24 ladsgroup@deploy2002: Started scap sync-world: Backport for Revert^4 "rdbms: Dismantle concept of groups" (T405087)
- 12:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P85600 and previous config saved to /var/cache/conftool/dbconfig/20251125-122313-marostegui.json
- 12:19 Emperor: depool / upgrade / restart envoy / repool on Apus frontends T405808
- 12:13 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Follow-up: Support edit events in suggested investigations (T410279) (duration: 10m 39s)
- 12:09 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 12:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P85599 and previous config saved to /var/cache/conftool/dbconfig/20251125-120805-marostegui.json
- 12:07 dreamyjazz@deploy2002: dreamyjazz: Backport for Follow-up: Support edit events in suggested investigations (T410279) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:03 dreamyjazz@deploy2002: Started scap sync-world: Backport for Follow-up: Support edit events in suggested investigations (T410279)
- 11:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T410531)', diff saved to https://phabricator.wikimedia.org/P85598 and previous config saved to /var/cache/conftool/dbconfig/20251125-115258-marostegui.json
- 11:50 Emperor: depool / upgrade / restart envoy / repool on ms frontends T405808
- 11:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T410531)', diff saved to https://phabricator.wikimedia.org/P85597 and previous config saved to /var/cache/conftool/dbconfig/20251125-114819-marostegui.json
- 11:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 11:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1210.eqiad.wmnet with reason: Maintenance
- 11:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 11:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 11:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T410531)', diff saved to https://phabricator.wikimedia.org/P85596 and previous config saved to /var/cache/conftool/dbconfig/20251125-114117-marostegui.json
- 11:28 Emperor: depool / upgrade / restart envoy / repool on thanos frontends T405808
- 11:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P85595 and previous config saved to /var/cache/conftool/dbconfig/20251125-112610-marostegui.json
- 11:18 Dreamy_Jazz: Deploying private code change for T410280
- 11:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P85594 and previous config saved to /var/cache/conftool/dbconfig/20251125-111102-marostegui.json
- 10:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 10:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T410531)', diff saved to https://phabricator.wikimedia.org/P85592 and previous config saved to /var/cache/conftool/dbconfig/20251125-105554-marostegui.json
- 10:53 jynus: restarting bacula-sd on backup1012, backup2012
- 10:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1230 (T410531)', diff saved to https://phabricator.wikimedia.org/P85591 and previous config saved to /var/cache/conftool/dbconfig/20251125-105253-marostegui.json
- 10:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 10:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 10:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T410531)', diff saved to https://phabricator.wikimedia.org/P85590 and previous config saved to /var/cache/conftool/dbconfig/20251125-105112-marostegui.json
- 10:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P85589 and previous config saved to /var/cache/conftool/dbconfig/20251125-103605-marostegui.json
- 10:26 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 10:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P85588 and previous config saved to /var/cache/conftool/dbconfig/20251125-102057-marostegui.json
- 10:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T410531)', diff saved to https://phabricator.wikimedia.org/P85587 and previous config saved to /var/cache/conftool/dbconfig/20251125-100549-marostegui.json
- 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1207 (T410531)', diff saved to https://phabricator.wikimedia.org/P85586 and previous config saved to /var/cache/conftool/dbconfig/20251125-100247-marostegui.json
- 10:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T410531)', diff saved to https://phabricator.wikimedia.org/P85585 and previous config saved to /var/cache/conftool/dbconfig/20251125-100235-marostegui.json
- 09:54 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.4 refs T408274
- 09:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P85584 and previous config saved to /var/cache/conftool/dbconfig/20251125-094727-marostegui.json
- 09:43 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Define valid SiteKeys for account creation and edit triggers (T410657), hCaptcha: Allow providing a set of valid keys for site verify per action (T410657 T410863) (duration: 11m 03s)
- 09:39 kharlan@deploy2002: kharlan: Continuing with sync
- 09:36 kharlan@deploy2002: kharlan: Backport for hCaptcha: Define valid SiteKeys for account creation and edit triggers (T410657), hCaptcha: Allow providing a set of valid keys for site verify per action (T410657 T410863) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P85583 and previous config saved to /var/cache/conftool/dbconfig/20251125-093219-marostegui.json
- 09:32 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Define valid SiteKeys for account creation and edit triggers (T410657), hCaptcha: Allow providing a set of valid keys for site verify per action (T410657 T410863)
- 09:19 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Enable hCaptcha editing on frwiki in 100% passive mode (T405586) (duration: 12m 57s)
- 09:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T410531)', diff saved to https://phabricator.wikimedia.org/P85582 and previous config saved to /var/cache/conftool/dbconfig/20251125-091712-marostegui.json
- 09:15 kharlan@deploy2002: kharlan: Continuing with sync
- 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1200 (T410531)', diff saved to https://phabricator.wikimedia.org/P85581 and previous config saved to /var/cache/conftool/dbconfig/20251125-091412-marostegui.json
- 09:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T410531)', diff saved to https://phabricator.wikimedia.org/P85580 and previous config saved to /var/cache/conftool/dbconfig/20251125-091400-marostegui.json
- 09:11 kharlan@deploy2002: kharlan: Backport for hCaptcha: Enable hCaptcha editing on frwiki in 100% passive mode (T405586) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:06 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Enable hCaptcha editing on frwiki in 100% passive mode (T405586)
- 09:05 XioNoX: convert Arelion codfw transit to LACP - T401100
- 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp2044.codfw.wmnet
- 09:03 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Adjust addurl logic for 100% passive mode (T409957) (duration: 11m 01s)
- 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P85579 and previous config saved to /var/cache/conftool/dbconfig/20251125-085852-marostegui.json
- 08:58 kharlan@deploy2002: kharlan: Continuing with sync
- 08:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp2044.codfw.wmnet
- 08:58 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 08:56 kharlan@deploy2002: kharlan: Backport for hCaptcha: Adjust addurl logic for 100% passive mode (T409957) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:56 XioNoX: drain Arelion codfw transit - T401100
- 08:52 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Adjust addurl logic for 100% passive mode (T409957)
- 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P85578 and previous config saved to /var/cache/conftool/dbconfig/20251125-084344-marostegui.json
- 08:41 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Adjust addurl config for zhwiki and jawiki (T410354 T409957) (duration: 13m 49s)
- 08:35 kharlan@deploy2002: kharlan: Continuing with sync
- 08:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 08:34 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 08:32 kharlan@deploy2002: kharlan: Backport for hCaptcha: Adjust addurl config for zhwiki and jawiki (T410354 T409957) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T410531)', diff saved to https://phabricator.wikimedia.org/P85577 and previous config saved to /var/cache/conftool/dbconfig/20251125-082836-marostegui.json
- 08:28 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Adjust addurl config for zhwiki and jawiki (T410354 T409957)
- 08:25 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1185 (T410531)', diff saved to https://phabricator.wikimedia.org/P85576 and previous config saved to /var/cache/conftool/dbconfig/20251125-082537-marostegui.json
- 08:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 08:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T410531)', diff saved to https://phabricator.wikimedia.org/P85575 and previous config saved to /var/cache/conftool/dbconfig/20251125-082525-marostegui.json
- 08:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp1005.wikimedia.org
- 08:13 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 08:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp1005.wikimedia.org
- 08:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P85573 and previous config saved to /var/cache/conftool/dbconfig/20251125-081017-marostegui.json
- 08:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb2002-dev.wikimedia.org
- 08:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudweb2002-dev.wikimedia.org
- 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P85572 and previous config saved to /var/cache/conftool/dbconfig/20251125-075509-marostegui.json
- 07:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T410531)', diff saved to https://phabricator.wikimedia.org/P85571 and previous config saved to /var/cache/conftool/dbconfig/20251125-074002-marostegui.json
- 07:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (T410531)', diff saved to https://phabricator.wikimedia.org/P85570 and previous config saved to /var/cache/conftool/dbconfig/20251125-073634-marostegui.json
- 07:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 07:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 07:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T410531)', diff saved to https://phabricator.wikimedia.org/P85569 and previous config saved to /var/cache/conftool/dbconfig/20251125-073549-marostegui.json
- 07:26 moritzm: upgrade Envoy on puppet servers T405808
- 07:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P85568 and previous config saved to /var/cache/conftool/dbconfig/20251125-072041-marostegui.json
- 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P85567 and previous config saved to /var/cache/conftool/dbconfig/20251125-070534-marostegui.json
- 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T410531)', diff saved to https://phabricator.wikimedia.org/P85566 and previous config saved to /var/cache/conftool/dbconfig/20251125-065026-marostegui.json
- 06:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T410531)', diff saved to https://phabricator.wikimedia.org/P85565 and previous config saved to /var/cache/conftool/dbconfig/20251125-064658-marostegui.json
- 06:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1159.eqiad.wmnet with reason: Maintenance
- 06:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2205.codfw.wmnet with reason: Maintenance
- 06:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1162 (T410589)', diff saved to https://phabricator.wikimedia.org/P85564 and previous config saved to /var/cache/conftool/dbconfig/20251125-060708-ladsgroup.json
- 06:07 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 06:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T410589)', diff saved to https://phabricator.wikimedia.org/P85563 and previous config saved to /var/cache/conftool/dbconfig/20251125-060643-ladsgroup.json
- 05:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P85562 and previous config saved to /var/cache/conftool/dbconfig/20251125-055136-ladsgroup.json
- 05:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P85561 and previous config saved to /var/cache/conftool/dbconfig/20251125-053629-ladsgroup.json
- 05:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T410589)', diff saved to https://phabricator.wikimedia.org/P85560 and previous config saved to /var/cache/conftool/dbconfig/20251125-052121-ladsgroup.json
- 05:03 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.1 (duration: 03m 53s)
- 04:58 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.4 refs T408274 (duration: 55m 00s)
- 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.4 refs T408274
- 03:38 eileen: tools upgraded from 5649c921 to 7efd562a
- 02:36 rzl: rzl@apt1002:~$ sudo -i reprepro -C component/envoy-future include bullseye-wikimedia /home/rzl/envoyproxy_1.35.6-1_amd64.changes # T410975
- 02:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
- 02:23 rzl@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
- 02:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 02:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 02:22 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 02:21 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 02:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
- 02:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
- 02:16 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
- 02:16 rzl@deploy2002: helmfile [staging] START helmfile.d/services/toolhub: apply
- 02:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 02:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
- 02:14 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
- 02:14 rzl@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
- 02:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
- 02:13 rzl@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
- 02:12 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 02:12 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 02:10 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 02:09 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 02:09 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 02:09 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 02:08 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 02:08 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 02:07 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 02:07 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 02:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 02:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
- 02:03 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
- 02:02 rzl@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
- 02:02 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 02:02 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 02:02 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
- 02:01 rzl@deploy2002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
- 02:01 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 02:01 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 02:00 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
- 02:00 rzl@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
- 01:59 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
- 01:59 rzl@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
- 01:59 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
- 01:59 rzl@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
- 01:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 01:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 01:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 01:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 01:57 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 01:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 01:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
- 01:55 rzl@deploy2002: helmfile [staging] START helmfile.d/services/media-analytics: apply
- 01:55 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
- 01:54 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mathoid: apply
- 01:54 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
- 01:51 rzl@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
- 01:49 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
- 01:48 rzl@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
- 01:48 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 01:48 rzl@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 01:48 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
- 01:47 rzl@deploy2002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
- 01:46 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
- 01:46 rzl@deploy2002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
- 01:45 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
- 01:44 ryankemper@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T390860
- 01:44 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
- 01:43 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
- 01:42 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
- 01:41 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
- 01:41 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
- 01:41 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
- 01:40 eileen: civicrm upgraded from e4748b9f to bc6236d9
- 01:40 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
- 01:39 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
- 01:39 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
- 01:38 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
- 01:37 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
- 01:37 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
- 01:37 rzl@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
- 01:37 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
- 01:36 rzl@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
- 01:36 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/echostore: apply
- 01:35 rzl@deploy2002: helmfile [staging] START helmfile.d/services/echostore: apply
- 01:35 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
- 01:35 eileen: config revision changed from 29b04fbc to 89e8c48b
- 01:35 rzl@deploy2002: helmfile [staging] START helmfile.d/services/device-analytics: apply
- 01:34 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 01:34 rzl@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 01:34 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
- 01:34 rzl@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
- 01:33 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 01:33 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 01:33 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply
- 01:32 rzl@deploy2002: helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply
- 01:32 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 01:32 rzl@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
- 01:31 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
- 01:31 rzl@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
- 01:30 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
- 01:30 rzl@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
- 01:30 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
- 01:30 rzl@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
- 01:28 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/apertium: apply
- 01:28 rzl@deploy2002: helmfile [staging] START helmfile.d/services/apertium: apply
- 01:01 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 01m 14s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:18 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
- 00:12 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
- 00:12 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
- 00:05 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
- 00:05 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
2025-11-24
- 23:59 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
- 23:59 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
- 23:52 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
- 23:52 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
- 23:44 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
- 23:44 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
- 23:37 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
- 23:37 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
- 23:30 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS trixie
- 23:29 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1028.eqiad.wmnet with OS trixie
- 23:29 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
- 23:29 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
- 23:22 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
- 23:22 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb2003-dev.codfw.wmnet
- 23:13 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb2003-dev.codfw.wmnet
- 23:13 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
- 23:03 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
- 23:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb2002-dev.codfw.wmnet
- 22:59 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T390860
- 22:59 ryankemper@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T390860
- 22:54 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb2002-dev.codfw.wmnet
- 22:54 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
- 22:46 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
- 22:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1002-dev.eqiad.wmnet
- 22:45 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T390860
- 22:45 jgleeson: payments-wiki upgraded from f1fe8fa6 to 5afaf7c8
- 22:42 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1002-dev.eqiad.wmnet
- 22:41 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
- 22:40 dani@deploy2002: Finished scap sync-world: Backport for Revert "Deploy experiment for 2025 Global Readers Survey", Revert "Pre-deploy 2025 Global Readers Survey" (duration: 20m 42s)
- 22:40 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1001-dev.eqiad.wmnet
- 22:36 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1001-dev.eqiad.wmnet
- 22:35 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
- 22:34 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1032.eqiad.wmnet with OS trixie
- 22:31 eileen: tools upgraded from 773e8d11 to 5649c921
- 22:28 dani@deploy2002: dani: Continuing with sync
- 22:26 dani@deploy2002: dani: Backport for Revert "Deploy experiment for 2025 Global Readers Survey", Revert "Pre-deploy 2025 Global Readers Survey" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:19 dani@deploy2002: Started scap sync-world: Backport for Revert "Deploy experiment for 2025 Global Readers Survey", Revert "Pre-deploy 2025 Global Readers Survey"
- 22:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2239.codfw.wmnet with reason: Maintenance
- 22:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T410531)', diff saved to https://phabricator.wikimedia.org/P85559 and previous config saved to /var/cache/conftool/dbconfig/20251124-221207-marostegui.json
- 22:10 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
- 22:09 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1028.eqiad.wmnet with OS trixie
- 22:01 dani@deploy2002: dani, bvibber: Backport for Pre-deploy 2025 Global Readers Survey (T410696), Deploy experiment for 2025 Global Readers Survey (T410696), Show "no data" message when tooltip does not contain to show (T401990) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P85558 and previous config saved to /var/cache/conftool/dbconfig/20251124-215659-marostegui.json
- 21:56 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1031.eqiad.wmnet with OS trixie
- 21:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P85557 and previous config saved to /var/cache/conftool/dbconfig/20251124-214151-marostegui.json
- 21:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
- 21:34 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
- 21:33 dani@deploy2002: Started scap sync-world: Backport for Pre-deploy 2025 Global Readers Survey (T410696), Deploy experiment for 2025 Global Readers Survey (T410696), Show "no data" message when tooltip does not contain to show (T401990)
- 21:30 arlolra@deploy2002: Finished scap sync-world: Backport for arwiktionary: make Cite button in main VE bar (T410840) (duration: 08m 54s)
- 21:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T410531)', diff saved to https://phabricator.wikimedia.org/P85556 and previous config saved to /var/cache/conftool/dbconfig/20251124-212643-marostegui.json
- 21:26 arlolra@deploy2002: arlolra, hubaishan: Continuing with sync
- 21:26 arlolra@deploy2002: arlolra, hubaishan: Backport for arwiktionary: make Cite button in main VE bar (T410840) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:21 arlolra@deploy2002: Started scap sync-world: Backport for arwiktionary: make Cite button in main VE bar (T410840)
- 21:17 arlolra@deploy2002: Finished scap sync-world: Backport for Deploy Parsoid Read Views to 18 wikis (T410564), Mark non-wikimedia.org math APIs as deprecated in the sandbox (T409773) (duration: 11m 49s)
- 21:16 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1031.eqiad.wmnet with OS trixie
- 21:14 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS trixie
- 21:12 arlolra@deploy2002: arlolra, aaron: Continuing with sync
- 21:10 arlolra@deploy2002: arlolra, aaron: Backport for Deploy Parsoid Read Views to 18 wikis (T410564), Mark non-wikimedia.org math APIs as deprecated in the sandbox (T409773) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:05 arlolra@deploy2002: Started scap sync-world: Backport for Deploy Parsoid Read Views to 18 wikis (T410564), Mark non-wikimedia.org math APIs as deprecated in the sandbox (T409773)
- 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2227 (T410531)', diff saved to https://phabricator.wikimedia.org/P85555 and previous config saved to /var/cache/conftool/dbconfig/20251124-210326-marostegui.json
- 21:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2227.codfw.wmnet with reason: Maintenance
- 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T410531)', diff saved to https://phabricator.wikimedia.org/P85554 and previous config saved to /var/cache/conftool/dbconfig/20251124-210302-marostegui.json
- 20:56 swfrench-wmf: updated local PHP CLI installation on deploy2002 to 8.3 - T405955
- 20:55 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1030.eqiad.wmnet with OS trixie
- 20:51 swfrench-wmf: updated local PHP CLI installation on deploy1003 to 8.3 - T405955
- 20:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P85553 and previous config saved to /var/cache/conftool/dbconfig/20251124-204754-marostegui.json
- 20:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
- 20:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P85552 and previous config saved to /var/cache/conftool/dbconfig/20251124-203247-marostegui.json
- 20:32 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
- 20:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T410531)', diff saved to https://phabricator.wikimedia.org/P85551 and previous config saved to /var/cache/conftool/dbconfig/20251124-201739-marostegui.json
- 20:15 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1030.eqiad.wmnet with OS trixie
- 20:14 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1030.eqiad.wmnet with OS trixie
- 20:08 ejegg: standalone (IPN listener) SmashPig upgraded from 9c00fd50 to a25fbb28
- 20:00 urbanecm@deploy2002: Finished scap sync-world: Backport for Increase AbuseFilter's emergency disable threshold for fawiki (T302227) (duration: 09m 43s)
- 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2209 (T410531)', diff saved to https://phabricator.wikimedia.org/P85550 and previous config saved to /var/cache/conftool/dbconfig/20251124-195747-marostegui.json
- 19:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2209.codfw.wmnet with reason: Maintenance
- 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T410531)', diff saved to https://phabricator.wikimedia.org/P85549 and previous config saved to /var/cache/conftool/dbconfig/20251124-195723-marostegui.json
- 19:56 urbanecm@deploy2002: huji, urbanecm: Continuing with sync
- 19:55 urbanecm@deploy2002: huji, urbanecm: Backport for Increase AbuseFilter's emergency disable threshold for fawiki (T302227) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:50 urbanecm@deploy2002: Started scap sync-world: Backport for Increase AbuseFilter's emergency disable threshold for fawiki (T302227)
- 19:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P85548 and previous config saved to /var/cache/conftool/dbconfig/20251124-194215-marostegui.json
- 19:39 jgleeson: payments-wiki upgraded from 5c6748cc to f1fe8fa6
- 19:29 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1028.eqiad.wmnet with OS bookworm
- 19:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P85547 and previous config saved to /var/cache/conftool/dbconfig/20251124-192707-marostegui.json
- 19:27 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1030.eqiad.wmnet with OS trixie
- 19:25 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs7003*} and A:liberica
- 19:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T410531)', diff saved to https://phabricator.wikimedia.org/P85545 and previous config saved to /var/cache/conftool/dbconfig/20251124-191200-marostegui.json
- 18:58 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs7003*} and A:liberica
- 18:50 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2194 (T410531)', diff saved to https://phabricator.wikimedia.org/P85544 and previous config saved to /var/cache/conftool/dbconfig/20251124-185050-marostegui.json
- 18:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2194.codfw.wmnet with reason: Maintenance
- 18:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T410531)', diff saved to https://phabricator.wikimedia.org/P85543 and previous config saved to /var/cache/conftool/dbconfig/20251124-185026-marostegui.json
- 18:41 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 18:36 swfrench-wmf: deleted EtcdReplicationDown silence. f75c71c9-62d3-449f-860a-9b5e4570717a - T405950
- 18:36 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 18:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P85542 and previous config saved to /var/cache/conftool/dbconfig/20251124-183518-marostegui.json
- 18:34 swfrench-wmf: begin restarts of eqiad-associated confds, navtiming, requestctl - T405950
- 18:32 swfrench@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Hold deployments during etcd ToR switch migration - T405950 (duration: 08m 43s)
- 18:31 swfrench-wmf: manually transferred etcd-mirror replication source back to conf1009 - T405950
- 18:25 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on conf1009.eqiad.wmnet with reason: C/D Migration
- 18:24 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
- 18:23 swfrench@deploy2002: Locking from deployment [ALL REPOSITORIES]: Hold deployments during etcd ToR switch migration - T405950
- 18:21 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1032.eqiad.wmnet with OS trixie
- 18:21 swfrench-wmf: manually transferred etcd-mirror replication source to conf1008 - T405950
- 18:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
- 18:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P85541 and previous config saved to /var/cache/conftool/dbconfig/20251124-182011-marostegui.json
- 18:19 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
- 18:16 swfrench-wmf: silenced EtcdReplicationDown. f75c71c9-62d3-449f-860a-9b5e4570717a - T405950
- 18:11 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
- 18:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1028.eqiad.wmnet with OS bookworm
- 18:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T410531)', diff saved to https://phabricator.wikimedia.org/P85540 and previous config saved to /var/cache/conftool/dbconfig/20251124-180503-marostegui.json
- 18:05 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1031.eqiad.wmnet with OS trixie
- 17:55 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1030.eqiad.wmnet with OS trixie
- 17:51 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS bookworm
- 17:45 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T410531)', diff saved to https://phabricator.wikimedia.org/P85539 and previous config saved to /var/cache/conftool/dbconfig/20251124-174501-marostegui.json
- 17:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2190.codfw.wmnet with reason: Maintenance
- 17:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T410531)', diff saved to https://phabricator.wikimedia.org/P85538 and previous config saved to /var/cache/conftool/dbconfig/20251124-174437-marostegui.json
- 17:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P85537 and previous config saved to /var/cache/conftool/dbconfig/20251124-172929-marostegui.json
- 17:27 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 17:26 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 17:24 urbanecm@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
- 17:23 urbanecm@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
- 17:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P85536 and previous config saved to /var/cache/conftool/dbconfig/20251124-171418-marostegui.json
- 17:11 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1032.eqiad.wmnet with OS trixie
- 17:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1030.eqiad.wmnet with OS trixie
- 17:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS trixie
- 17:09 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1028.eqiad.wmnet with OS trixie
- 17:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet
- 17:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1019.eqiad.wmnet
- 17:02 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1013.eqiad.wmnet
- 17:01 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS trixie
- 17:00 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1031.eqiad.wmnet with OS trixie
- 16:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T410531)', diff saved to https://phabricator.wikimedia.org/P85535 and previous config saved to /var/cache/conftool/dbconfig/20251124-165910-marostegui.json
- 16:56 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1013.eqiad.wmnet
- 16:43 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 59s)
- 16:41 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 08m 44s)
- 16:36 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on an-master1004.eqiad.wmnet,an-redacteddb1001.eqiad.wmnet,an-test-coord1001.eqiad.wmnet with reason: Prepping for switch swap
- 16:34 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on stat1011.eqiad.wmnet with reason: Prepping for switch swap
- 16:34 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on an-test-master1002.eqiad.wmnet with reason: Prepping for switch swap
- 16:33 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T410531)', diff saved to https://phabricator.wikimedia.org/P85534 and previous config saved to /var/cache/conftool/dbconfig/20251124-163345-marostegui.json
- 16:33 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2177.codfw.wmnet with reason: Maintenance
- 16:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T410531)', diff saved to https://phabricator.wikimedia.org/P85533 and previous config saved to /var/cache/conftool/dbconfig/20251124-163320-marostegui.json
- 16:33 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1169.eqiad.wmnet with OS bookworm
- 16:32 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dse-k8s-worker[1011,1013,1019].eqiad.wmnet with reason: Prepping for switch swap
- 16:30 moritzm: installing usb.ids updates from Bookworm point release
- 16:28 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 52s)
- 16:23 urbanecm: Delete job/growthexperiments-refreshlinkrecommendations-s2-29399967 and job/growthexperiments-refreshlinkrecommendations-s3-29399607 (T407818)
- 16:23 jmm@dns1004: END - running authdns-update
- 16:22 jmm@dns1004: START - running authdns-update
- 16:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P85532 and previous config saved to /var/cache/conftool/dbconfig/20251124-161813-marostegui.json
- 16:16 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 16:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 16:14 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 16:14 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 16:14 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
- 16:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Testing latency
- 16:06 marostegui@cumin1003: dbctl commit (dc=all): 'db1184 depool for testing', diff saved to https://phabricator.wikimedia.org/P85531 and previous config saved to /var/cache/conftool/dbconfig/20251124-160601-marostegui.json
- 16:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P85530 and previous config saved to /var/cache/conftool/dbconfig/20251124-160305-marostegui.json
- 16:00 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] Enable Add Link task pool generation for 3 wikis (T407818) (duration: 12m 04s)
- 15:56 urbanecm@deploy2002: urbanecm: Continuing with sync
- 15:53 urbanecm@deploy2002: urbanecm: Backport for [Growth] Enable Add Link task pool generation for 3 wikis (T407818) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:51 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS trixie
- 15:51 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1031.eqiad.wmnet with OS trixie
- 15:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1030.eqiad.wmnet with OS trixie
- 15:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
- 15:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1028.eqiad.wmnet with OS trixie
- 15:48 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] Enable Add Link task pool generation for 3 wikis (T407818)
- 15:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T410531)', diff saved to https://phabricator.wikimedia.org/P85529 and previous config saved to /var/cache/conftool/dbconfig/20251124-154758-marostegui.json
- 15:47 urbanecm@deploy2002: Finished scap sync-world: Backport for testwiki: enable ReviseTone experiment (T407029) (duration: 12m 19s)
- 15:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1032.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:43 urbanecm@deploy2002: urbanecm, migr: Continuing with sync
- 15:40 bking@cumin2002: START - Cookbook sre.hosts.provision for host wdqs1032.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:39 urbanecm@deploy2002: urbanecm, migr: Backport for testwiki: enable ReviseTone experiment (T407029) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:39 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1031.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:38 bking@cumin2002: START - Cookbook sre.hosts.provision for host wdqs1031.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:37 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1030.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:36 bking@cumin2002: START - Cookbook sre.hosts.provision for host wdqs1030.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1029.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:34 urbanecm@deploy2002: Started scap sync-world: Backport for testwiki: enable ReviseTone experiment (T407029)
- 15:34 bking@cumin2002: START - Cookbook sre.hosts.provision for host wdqs1029.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:30 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for tests: Make data providers static methods (T410731) (duration: 09m 15s)
- 15:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T410531)', diff saved to https://phabricator.wikimedia.org/P85528 and previous config saved to /var/cache/conftool/dbconfig/20251124-152805-marostegui.json
- 15:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 15:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T410531)', diff saved to https://phabricator.wikimedia.org/P85527 and previous config saved to /var/cache/conftool/dbconfig/20251124-152741-marostegui.json
- 15:27 bking@cumin2002: START - Cookbook sre.hosts.provision for host wdqs1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:26 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, d3r1ck01: Continuing with sync
- 15:25 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, d3r1ck01: Backport for tests: Make data providers static methods (T410731) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:21 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for tests: Make data providers static methods (T410731)
- 15:19 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists blocker; drop database if exists defoundation; drop database if exists oai; drop database if exists steward; (T297297)
- 15:16 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for rowiki: Redefine AbuseFilter permission model (T407978) (duration: 13m 02s)
- 15:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P85526 and previous config saved to /var/cache/conftool/dbconfig/20251124-151233-marostegui.json
- 15:12 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, dragoniez: Continuing with sync
- 15:10 Amir1: cumin2024@db2191.codfw.wmnet[wikishared]> drop table if exists wikimedia_editor_tasks_counts; drop table if exists wikimedia_editor_tasks_edit_streak; drop table if exists wikimedia_editor_tasks_keys; (T410692)
- 15:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: Testing latency
- 15:08 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, dragoniez: Backport for rowiki: Redefine AbuseFilter permission model (T407978) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:03 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for rowiki: Redefine AbuseFilter permission model (T407978)
- 15:01 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1156 (T410589)', diff saved to https://phabricator.wikimedia.org/P85524 and previous config saved to /var/cache/conftool/dbconfig/20251124-150146-ladsgroup.json
- 15:01 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 14:58 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for jawiki: Disallow sysops from granting temporary-account-viewer (T409687) (duration: 11m 33s)
- 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P85523 and previous config saved to /var/cache/conftool/dbconfig/20251124-145726-marostegui.json
- 14:52 lucaswerkmeister-wmde@deploy2002: dragoniez, lucaswerkmeister-wmde: Continuing with sync
- 14:51 lucaswerkmeister-wmde@deploy2002: dragoniez, lucaswerkmeister-wmde: Backport for jawiki: Disallow sysops from granting temporary-account-viewer (T409687) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:46 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for jawiki: Disallow sysops from granting temporary-account-viewer (T409687)
- 14:43 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 14:42 tappof: Remove unused md2 and add its devices to vg0 on titan2002 T410152
- 14:42 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 14:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T410531)', diff saved to https://phabricator.wikimedia.org/P85521 and previous config saved to /var/cache/conftool/dbconfig/20251124-144218-marostegui.json
- 14:38 sukhe: sudo cumin "A:cp" "disable-puppet 'merging CR 1207978'": T409780
- 14:35 ladsgroup@deploy2002: Finished scap sync-world: Backport for Enable DiscussionTools visual enhancements on ruwiki & svwiki (T379264) (duration: 19m 18s)
- 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS bookworm
- 14:31 ladsgroup@deploy2002: esanders, ladsgroup: Continuing with sync
- 14:28 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 14:28 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 14:27 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 14:27 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 14:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 14:25 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 14:22 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2149 (T410531)', diff saved to https://phabricator.wikimedia.org/P85520 and previous config saved to /var/cache/conftool/dbconfig/20251124-142221-marostegui.json
- 14:22 tappof: Remove unused md2 and add its devices to vg0 on titan1002 T410152
- 14:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 14:21 ladsgroup@deploy2002: esanders, ladsgroup: Backport for Enable DiscussionTools visual enhancements on ruwiki & svwiki (T379264) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:15 ladsgroup@deploy2002: Started scap sync-world: Backport for Enable DiscussionTools visual enhancements on ruwiki & svwiki (T379264)
- 14:15 ladsgroup@deploy2002: Finished scap sync-world: Backport for Revert "tcywikisource: throttle exception" (T410507) (duration: 09m 44s)
- 14:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 14:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 14:11 ladsgroup@deploy2002: anzx, ladsgroup: Continuing with sync
- 14:10 ladsgroup@deploy2002: anzx, ladsgroup: Backport for Revert "tcywikisource: throttle exception" (T410507) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:05 ladsgroup@deploy2002: Started scap sync-world: Backport for Revert "tcywikisource: throttle exception" (T410507)
- 14:05 ladsgroup@deploy2002: Finished scap sync-world: Backport for Fix db config for offline maint scripts (T410738 T405087), Fix fix db config for offline maint scripts (T410738 T405087) (duration: 09m 07s)
- 14:00 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 14:00 ladsgroup@deploy2002: ladsgroup: Backport for Fix db config for offline maint scripts (T410738 T405087), Fix fix db config for offline maint scripts (T410738 T405087) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:55 ladsgroup@deploy2002: Started scap sync-world: Backport for Fix db config for offline maint scripts (T410738 T405087), Fix fix db config for offline maint scripts (T410738 T405087)
- 13:55 dpogorzelski@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: sync
- 13:55 dpogorzelski@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: sync
- 13:54 dpogorzelski@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
- 13:54 dpogorzelski@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop: sync
- 13:53 dpogorzelski@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
- 13:53 dpogorzelski@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: sync
- 13:43 ladsgroup@deploy2002: sync-world failed: <CalledProcessError> Command '['sudo', '-u', 'mwbuilder', '-n', '--', '/usr/bin/scap', 'mwscript', '--no-local-config', '--directory', '/srv/mediawiki-staging', '--user', 'www-data', '--', 'mergeMessageFileList.php', '--wiki=aawiki', '--force-version', '1.46.0-wmf.3', '--list-file', '/srv/mediawiki-staging/wmf-config/extension-list', '--output', '/tmp/tmp.Seyz9S1dDd']' returned
- 13:43 ladsgroup@deploy2002: Started scap sync-world: Backport for Fix db config for offline maint scripts (T410738 T405087)
- 13:42 ladsgroup@deploy2002: sync-world failed: <CalledProcessError> Command '['sudo', '-u', 'mwbuilder', '-n', '--', '/usr/bin/scap', 'mwscript', '--no-local-config', '--directory', '/srv/mediawiki-staging', '--user', 'www-data', '--', 'mergeMessageFileList.php', '--wiki=aawiki', '--force-version', '1.46.0-wmf.3', '--list-file', '/srv/mediawiki-staging/wmf-config/extension-list', '--output', '/tmp/tmp.1aRzXHW4OP']' returned
- 13:41 ladsgroup@deploy2002: Started scap sync-world: Backport for Fix db config for offline maint scripts (T410738 T405087)
- 13:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS trixie
- 13:36 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1031.eqiad.wmnet with OS trixie
- 13:33 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1028.eqiad.wmnet with OS trixie
- 13:28 Amir1: cleaning up watchlist of deceased User:JarrahTree in enwiki and commonswiki
- 13:16 gehel@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
- 13:15 gehel@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
- 13:08 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 13:07 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 13:05 aikochou@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 13:00 aikochou@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 12:55 gehel@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 12:55 gehel@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 12:54 gehel@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
- 12:54 gehel@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 12:49 gehel@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2025.codfw.wmnet
- 12:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1223.eqiad.wmnet with reason: Maintenance
- 12:45 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:42 gehel@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2025.codfw.wmnet
- 12:32 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 12:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 12:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1187 gradually with 4 steps - repool after schema change test
- 12:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 12:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T410531)', diff saved to https://phabricator.wikimedia.org/P85515 and previous config saved to /var/cache/conftool/dbconfig/20251124-121341-marostegui.json
- 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp2005.wikimedia.org
- 12:01 moritzm: installing Squid security updates
- 11:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp2005.wikimedia.org
- 11:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P85513 and previous config saved to /var/cache/conftool/dbconfig/20251124-115834-marostegui.json
- 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
- 11:56 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 11:56 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 11:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
- 11:53 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 11:52 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 11:46 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 11:46 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 11:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2005.wikimedia.org
- 11:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P85511 and previous config saved to /var/cache/conftool/dbconfig/20251124-114326-marostegui.json
- 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
- 11:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1187 gradually with 4 steps - repool after schema change test
- 11:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1187 (T299441)', diff saved to https://phabricator.wikimedia.org/P85509 and previous config saved to /var/cache/conftool/dbconfig/20251124-112850-marostegui.json
- 11:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 11:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T410531)', diff saved to https://phabricator.wikimedia.org/P85508 and previous config saved to /var/cache/conftool/dbconfig/20251124-112819-marostegui.json
- 11:26 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) db1187 gradually with 4 steps - repool after schema change test
- 11:25 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 11:25 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 11:24 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
- 11:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1187 gradually with 4 steps - repool after schema change test
- 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T410531)', diff saved to https://phabricator.wikimedia.org/P85506 and previous config saved to /var/cache/conftool/dbconfig/20251124-112306-marostegui.json
- 11:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on 6 hosts with reason: Maintenance
- 11:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T410531)', diff saved to https://phabricator.wikimedia.org/P85505 and previous config saved to /var/cache/conftool/dbconfig/20251124-112231-marostegui.json
- 11:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1187 (T299441)', diff saved to https://phabricator.wikimedia.org/P85504 and previous config saved to /var/cache/conftool/dbconfig/20251124-112111-marostegui.json
- 11:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 11:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P85503 and previous config saved to /var/cache/conftool/dbconfig/20251124-110723-marostegui.json
- 10:57 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:56 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:56 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:56 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:56 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:56 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:56 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:55 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:52 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 10:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P85502 and previous config saved to /var/cache/conftool/dbconfig/20251124-105216-marostegui.json
- 10:51 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 10:51 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 10:51 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 10:48 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:48 gehel@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
- 10:47 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:46 gehel@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
- 10:46 claime: Deploying envoy 1.32 to api-gateway
- 10:39 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
- 10:37 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 10:37 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 10:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T410531)', diff saved to https://phabricator.wikimedia.org/P85501 and previous config saved to /var/cache/conftool/dbconfig/20251124-103708-marostegui.json
- 10:36 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:36 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:34 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
- 10:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1198 (T410531)', diff saved to https://phabricator.wikimedia.org/P85500 and previous config saved to /var/cache/conftool/dbconfig/20251124-103218-marostegui.json
- 10:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 10:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T410531)', diff saved to https://phabricator.wikimedia.org/P85499 and previous config saved to /var/cache/conftool/dbconfig/20251124-103155-marostegui.json
- 10:27 gehel@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 10:26 gehel@cumin1003: START - Cookbook sre.hosts.reboot-cluster
- 10:25 gehel@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 10:25 gehel@cumin1003: START - Cookbook sre.hosts.reboot-cluster
- 10:22 gehel@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
- 10:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P85498 and previous config saved to /var/cache/conftool/dbconfig/20251124-101647-marostegui.json
- 10:14 gehel@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
- 10:14 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
- 10:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P85497 and previous config saved to /var/cache/conftool/dbconfig/20251124-100139-marostegui.json
- 10:00 gehel@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 09:59 gehel@cumin1003: START - Cookbook sre.hosts.reboot-cluster
- 09:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:56 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:55 gehel@cumin1003: START - Cookbook sre.hosts.reboot-cluster
- 09:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T410531)', diff saved to https://phabricator.wikimedia.org/P85496 and previous config saved to /var/cache/conftool/dbconfig/20251124-094632-marostegui.json
- 09:42 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 09:42 jmm@dns1004: END - running authdns-update
- 09:41 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1189 (T410531)', diff saved to https://phabricator.wikimedia.org/P85495 and previous config saved to /var/cache/conftool/dbconfig/20251124-094141-marostegui.json
- 09:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 09:41 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T410531)', diff saved to https://phabricator.wikimedia.org/P85494 and previous config saved to /var/cache/conftool/dbconfig/20251124-094117-marostegui.json
- 09:41 jmm@dns1004: START - running authdns-update
- 09:40 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 09:40 cgoubert@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 09:39 cgoubert@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 09:38 cgoubert@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 09:38 cgoubert@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P85492 and previous config saved to /var/cache/conftool/dbconfig/20251124-092609-marostegui.json
- 09:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P85491 and previous config saved to /var/cache/conftool/dbconfig/20251124-091102-marostegui.json
- 09:09 taavi: taavi@puppetserver1001 ~ $ sudo puppet node deactivate cloudidp2001-dev.wikimedia.org # leftover from move to private addresses T410294
- 09:05 gehel@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 09:03 gehel@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
- 09:03 gehel@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 08:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T410531)', diff saved to https://phabricator.wikimedia.org/P85490 and previous config saved to /var/cache/conftool/dbconfig/20251124-085554-marostegui.json
- 08:51 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T410531)', diff saved to https://phabricator.wikimedia.org/P85489 and previous config saved to /var/cache/conftool/dbconfig/20251124-085104-marostegui.json
- 08:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 08:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T410531)', diff saved to https://phabricator.wikimedia.org/P85488 and previous config saved to /var/cache/conftool/dbconfig/20251124-085042-marostegui.json
- 08:44 moritzm: installing jinja2 security updates
- 08:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P85487 and previous config saved to /var/cache/conftool/dbconfig/20251124-083535-marostegui.json
- 08:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P85485 and previous config saved to /var/cache/conftool/dbconfig/20251124-082027-marostegui.json
- 08:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
- 08:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
- 08:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T410531)', diff saved to https://phabricator.wikimedia.org/P85484 and previous config saved to /var/cache/conftool/dbconfig/20251124-080519-marostegui.json
- 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1166 (T410531)', diff saved to https://phabricator.wikimedia.org/P85483 and previous config saved to /var/cache/conftool/dbconfig/20251124-075126-marostegui.json
- 07:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T410531)', diff saved to https://phabricator.wikimedia.org/P85482 and previous config saved to /var/cache/conftool/dbconfig/20251124-075103-marostegui.json
- 07:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P85481 and previous config saved to /var/cache/conftool/dbconfig/20251124-073555-marostegui.json
- 07:35 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 07:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P85480 and previous config saved to /var/cache/conftool/dbconfig/20251124-072047-marostegui.json
- 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T410531)', diff saved to https://phabricator.wikimedia.org/P85479 and previous config saved to /var/cache/conftool/dbconfig/20251124-070539-marostegui.json
- 07:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T410531)', diff saved to https://phabricator.wikimedia.org/P85478 and previous config saved to /var/cache/conftool/dbconfig/20251124-070050-marostegui.json
- 07:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 06:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 06:37 marostegui: Deploy schema change on s6 on the master with replication T410531
- 03:31 tstarling@deploy2002: Finished scap sync-world: Backport for Revert "Authorize self for Google Search Console" (duration: 40m 16s)
- 03:18 tstarling@deploy2002: tstarling: Continuing with sync
- 03:17 tstarling@deploy2002: tstarling: Backport for Revert "Authorize self for Google Search Console" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 02:51 tstarling@deploy2002: Started scap sync-world: Backport for Revert "Authorize self for Google Search Console"
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:02 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 00:01 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T410589)', diff saved to https://phabricator.wikimedia.org/P85477 and previous config saved to /var/cache/conftool/dbconfig/20251124-000144-ladsgroup.json
2025-11-23
- 23:46 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P85476 and previous config saved to /var/cache/conftool/dbconfig/20251123-234636-ladsgroup.json
- 23:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P85475 and previous config saved to /var/cache/conftool/dbconfig/20251123-233128-ladsgroup.json
- 23:16 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T410589)', diff saved to https://phabricator.wikimedia.org/P85474 and previous config saved to /var/cache/conftool/dbconfig/20251123-231621-ladsgroup.json
- 21:00 andrew@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1041']
- 20:51 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1041']
- 20:48 andrew@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1040']
- 20:39 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1040']
- 20:36 andrew@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1039']
- 20:27 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1039']
- 20:25 andrew@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cloudcephosd1038']
- 20:25 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1038']
- 20:24 andrew@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1038']
- 20:12 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1038']
- 20:08 andrew@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1037']
- 19:59 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1037']
- 19:57 andrew@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1036']
- 19:45 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1036']
- 19:39 andrew@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1035']
- 19:29 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1035']
- 19:24 andrew@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd2004-dev']
- 19:16 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd2004-dev']
- 16:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1230 (T410589)', diff saved to https://phabricator.wikimedia.org/P85473 and previous config saved to /var/cache/conftool/dbconfig/20251123-163658-ladsgroup.json
- 16:36 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 09:07 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 09:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T410589)', diff saved to https://phabricator.wikimedia.org/P85472 and previous config saved to /var/cache/conftool/dbconfig/20251123-090710-ladsgroup.json
- 08:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P85471 and previous config saved to /var/cache/conftool/dbconfig/20251123-085202-ladsgroup.json
- 08:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P85470 and previous config saved to /var/cache/conftool/dbconfig/20251123-083655-ladsgroup.json
- 08:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T410589)', diff saved to https://phabricator.wikimedia.org/P85469 and previous config saved to /var/cache/conftool/dbconfig/20251123-082147-ladsgroup.json
- 01:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1207 (T410589)', diff saved to https://phabricator.wikimedia.org/P85468 and previous config saved to /var/cache/conftool/dbconfig/20251123-014247-ladsgroup.json
- 01:42 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 01:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T410589)', diff saved to https://phabricator.wikimedia.org/P85467 and previous config saved to /var/cache/conftool/dbconfig/20251123-014223-ladsgroup.json
- 01:27 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P85466 and previous config saved to /var/cache/conftool/dbconfig/20251123-012716-ladsgroup.json
- 01:12 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P85465 and previous config saved to /var/cache/conftool/dbconfig/20251123-011208-ladsgroup.json
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T410589)', diff saved to https://phabricator.wikimedia.org/P85464 and previous config saved to /var/cache/conftool/dbconfig/20251123-005700-ladsgroup.json
2025-11-22
- 17:19 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1200 (T410589)', diff saved to https://phabricator.wikimedia.org/P85463 and previous config saved to /var/cache/conftool/dbconfig/20251122-171858-ladsgroup.json
- 17:18 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 17:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T410589)', diff saved to https://phabricator.wikimedia.org/P85462 and previous config saved to /var/cache/conftool/dbconfig/20251122-171834-ladsgroup.json
- 17:03 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P85461 and previous config saved to /var/cache/conftool/dbconfig/20251122-170327-ladsgroup.json
- 16:48 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P85460 and previous config saved to /var/cache/conftool/dbconfig/20251122-164819-ladsgroup.json
- 16:33 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T410589)', diff saved to https://phabricator.wikimedia.org/P85459 and previous config saved to /var/cache/conftool/dbconfig/20251122-163311-ladsgroup.json
- 09:17 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1185 (T410589)', diff saved to https://phabricator.wikimedia.org/P85458 and previous config saved to /var/cache/conftool/dbconfig/20251122-091726-ladsgroup.json
- 09:17 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 09:17 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T410589)', diff saved to https://phabricator.wikimedia.org/P85457 and previous config saved to /var/cache/conftool/dbconfig/20251122-091703-ladsgroup.json
- 09:01 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P85456 and previous config saved to /var/cache/conftool/dbconfig/20251122-090155-ladsgroup.json
- 08:46 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P85455 and previous config saved to /var/cache/conftool/dbconfig/20251122-084647-ladsgroup.json
- 08:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T410589)', diff saved to https://phabricator.wikimedia.org/P85454 and previous config saved to /var/cache/conftool/dbconfig/20251122-083140-ladsgroup.json
- 01:01 ejegg: fundraising python tools rolled back from fe42b9a2 to 773e8d11
- 01:01 ejegg: fundraising civicrm rolled back from 11e95839 to e4748b9f
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:16 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 00:16 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 00:12 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 00:12 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 00:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (T410589)', diff saved to https://phabricator.wikimedia.org/P85453 and previous config saved to /var/cache/conftool/dbconfig/20251122-000026-ladsgroup.json
- 00:00 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 00:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T410589)', diff saved to https://phabricator.wikimedia.org/P85452 and previous config saved to /var/cache/conftool/dbconfig/20251122-000001-ladsgroup.json
2025-11-21
- 23:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P85451 and previous config saved to /var/cache/conftool/dbconfig/20251121-234454-ladsgroup.json
- 23:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P85450 and previous config saved to /var/cache/conftool/dbconfig/20251121-232946-ladsgroup.json
- 23:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T410589)', diff saved to https://phabricator.wikimedia.org/P85449 and previous config saved to /var/cache/conftool/dbconfig/20251121-231439-ladsgroup.json
- 22:53 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/apertium: apply
- 22:53 rzl@deploy2002: helmfile [staging] START helmfile.d/services/apertium: apply
- 22:49 inflatador: bking@wdqs2007 roll-restart wdqs CODFW for high lag https://w.wiki/GDad
- 22:24 inflatador: bking@wdqs1011 `systemctl restart wdqs-blazegraph.service` (responding to ProbeDown)
- 22:19 ejegg: fundraising python tools upgraded from 773e8d11 to fe42b9a2
- 22:16 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
- 22:14 ejegg: civicrm upgraded from e4748b9f to 11e95839
- 22:04 bking@cumin2002: START - Cookbook sre.wdqs.restart
- 22:03 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.restart (exit_code=97)
- 22:03 bking@cumin2002: START - Cookbook sre.wdqs.restart
- 21:03 andrew@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host cloudidp2001-dev.codfw.wmnet
- 21:03 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudidp2001-dev.codfw.wmnet with OS trixie
- 20:45 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1002.eqiad.wmnet
- 20:45 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
- 20:45 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudidp2001-dev.codfw.wmnet with OS trixie
- 20:45 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM cloudidp2001-dev.codfw.wmnet - andrew@cumin2002"
- 20:44 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM cloudidp2001-dev.codfw.wmnet - andrew@cumin2002"
- 20:44 andrew@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudidp2001-dev.codfw.wmnet on all recursors
- 20:44 andrew@cumin2002: START - Cookbook sre.dns.wipe-cache cloudidp2001-dev.codfw.wmnet on all recursors
- 20:44 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:44 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM cloudidp2001-dev.codfw.wmnet - andrew@cumin2002"
- 20:42 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM cloudidp2001-dev.codfw.wmnet - andrew@cumin2002"
- 20:40 mutante: zuul2002 - rm /lib/systemd/system/zuul* ; systemctl daemon-reload ; systemctl reset-failed - fixes T410756
- 20:35 andrew@cumin2002: START - Cookbook sre.dns.netbox
- 20:35 andrew@cumin2002: START - Cookbook sre.ganeti.makevm for new host cloudidp2001-dev.codfw.wmnet
- 18:38 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudidp2001-dev.codfw.wmnet with OS trixie
- 18:27 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host relforge1010.eqiad.wmnet with OS bookworm
- 18:24 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudidp2001-dev.codfw.wmnet with OS trixie
- 18:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host relforge1009.eqiad.wmnet with OS bookworm
- 18:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
- 18:12 bking@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host relforge1008.eqiad.wmnet with OS bookworm
- 18:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
- 18:08 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudidp2001-dev.codfw.wmnet with OS trixie
- 18:04 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
- 18:01 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
- 17:57 bking@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
- 17:54 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudidp2001-dev.codfw.wmnet with OS trixie
- 17:53 andrew@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host cloudidp2001-dev.codfw.wmnet
- 17:53 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudidp2001-dev.codfw.wmnet with OS trixie
- 17:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS bookworm
- 17:52 bking@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
- 17:51 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS bookworm
- 17:42 bking@cumin1003: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS bookworm
- 17:37 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudidp2001-dev.codfw.wmnet with OS trixie
- 17:37 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM cloudidp2001-dev.codfw.wmnet - andrew@cumin2002"
- 17:37 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM cloudidp2001-dev.codfw.wmnet - andrew@cumin2002"
- 17:36 andrew@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudidp2001-dev.codfw.wmnet on all recursors
- 17:36 andrew@cumin2002: START - Cookbook sre.dns.wipe-cache cloudidp2001-dev.codfw.wmnet on all recursors
- 17:36 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:36 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM cloudidp2001-dev.codfw.wmnet - andrew@cumin2002"
- 17:36 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM cloudidp2001-dev.codfw.wmnet - andrew@cumin2002"
- 17:33 sukhe: homer "as*" commit "remove IPv6 for hcaptcha-proxy group": T409780
- 17:32 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host relforge1010.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:32 andrew@cumin2002: START - Cookbook sre.dns.netbox
- 17:32 andrew@cumin2002: START - Cookbook sre.ganeti.makevm for new host cloudidp2001-dev.codfw.wmnet
- 17:27 sukhe: homer "cr*" commit "remove IPv6 for hcaptcha-proxy group": T409780
- 17:27 cmooney@cumin1003: START - Cookbook sre.hosts.provision for host relforge1010.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:25 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host relforge1010.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:24 cmooney@cumin1003: START - Cookbook sre.hosts.provision for host relforge1010.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:20 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host relforge1010.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:15 cmooney@cumin1003: START - Cookbook sre.hosts.provision for host relforge1010.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:03 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Homer release v0.11.0 minor update - cmooney@cumin1003
- 17:01 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Homer release v0.11.0 minor update - cmooney@cumin1003
- 16:54 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host relforge1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:48 cmooney@cumin1003: START - Cookbook sre.hosts.provision for host relforge1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:16 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host relforge1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:11 cmooney@cumin1003: START - Cookbook sre.hosts.provision for host relforge1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 15:40 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host relforge1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:40 cmooney@cumin1003: START - Cookbook sre.hosts.provision for host relforge1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:37 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host relforge1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 15:32 cmooney@cumin1003: START - Cookbook sre.hosts.provision for host relforge1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 14:48 sukhe: homer "asw*drmrs*" commit "bring up hcaptcha-proxy600[12]": T409780
- 14:48 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS bookworm
- 14:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T410589)', diff saved to https://phabricator.wikimedia.org/P85441 and previous config saved to /var/cache/conftool/dbconfig/20251121-144238-ladsgroup.json
- 14:42 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
- 14:35 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS bookworm
- 14:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS bookworm
- 14:27 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS bookworm
- 14:26 jmm@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging West1 out of all services on: 2410 hosts
- 14:25 sukhe: homer "cr*eqsin*" commit "bring up hcaptcha-proxy500[12]": T409780
- 14:25 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS bookworm
- 14:25 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc8 (T405942)', diff saved to https://phabricator.wikimedia.org/P85440 and previous config saved to /var/cache/conftool/dbconfig/20251121-142500-ladsgroup.json
- 14:24 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS bookworm
- 14:23 sukhe: homer "cr*ulsfo*" commit "bring up hcaptcha-proxy400[12]": T409780
- 14:21 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1018.eqiad.wmnet with reason: Maint
- 14:21 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2018.codfw.wmnet with reason: Maint
- 14:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc8 (T405942)', diff saved to https://phabricator.wikimedia.org/P85439 and previous config saved to /var/cache/conftool/dbconfig/20251121-142059-ladsgroup.json
- 14:18 sukhe: homer "cr*codfw*" commit "bring up hcaptcha-proxy200[12]": T409780
- 14:17 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc7 (T405942)', diff saved to https://phabricator.wikimedia.org/P85438 and previous config saved to /var/cache/conftool/dbconfig/20251121-141747-ladsgroup.json
- 14:14 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on pc2017.codfw.wmnet with reason: Maint
- 14:14 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on pc1017.eqiad.wmnet with reason: Maint
- 14:13 sukhe: homer "cr*eqiad*" commit "bring up hcaptcha-proxy100[12]": T409780
- 14:13 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc7 (T405942)', diff saved to https://phabricator.wikimedia.org/P85437 and previous config saved to /var/cache/conftool/dbconfig/20251121-141345-ladsgroup.json
- 14:09 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc6 (T405942)', diff saved to https://phabricator.wikimedia.org/P85436 and previous config saved to /var/cache/conftool/dbconfig/20251121-140903-ladsgroup.json
- 14:05 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on pc2016.codfw.wmnet with reason: Maint
- 14:05 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on pc1016.eqiad.wmnet with reason: Maint
- 14:03 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc6 (T405942)', diff saved to https://phabricator.wikimedia.org/P85435 and previous config saved to /var/cache/conftool/dbconfig/20251121-140327-ladsgroup.json
- 13:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 13:52 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 13:24 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 13:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1007.eqiad.wmnet
- 13:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1007.eqiad.wmnet
- 12:26 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Homer release v0.11.0 minor update - cmooney@cumin1003
- 12:24 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Homer release v0.11.0 minor update - cmooney@cumin1003
- 10:42 ayounsi@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1005.eqiad.wmnet
- 10:19 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@a809ec3] (releasing): T410680 (duration: 02m 13s)
- 10:16 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@a809ec3] (releasing): T410680
- 09:45 dpogorzelski@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: sync
- 09:45 dpogorzelski@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: sync
- 09:37 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 09:34 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 09:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cumin1002.eqiad.wmnet
- 09:16 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:16 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cumin1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cumin1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:46 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@f3216ec] (releasing): testing issue with instance (duration: 01m 48s)
- 08:44 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@f3216ec] (releasing): testing issue with instance
- 08:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:36 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts cumin1002.eqiad.wmnet
- 07:58 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 07:51 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 07:50 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 04:18 ryankemper@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T390860
- 02:09 ejegg: payments-wiki upgraded from 40f6f252 to 2a73a08d
- 02:08 ejegg: donorwiki upgraded from 40f6f252 to 2a73a08d
- 01:54 cstone: SmashPig upgraded from 21d54736 to 9c00fd50
- 01:47 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 01:47 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T410589)', diff saved to https://phabricator.wikimedia.org/P85431 and previous config saved to /var/cache/conftool/dbconfig/20251121-014701-ladsgroup.json
- 01:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P85430 and previous config saved to /var/cache/conftool/dbconfig/20251121-013153-ladsgroup.json
- 01:16 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P85429 and previous config saved to /var/cache/conftool/dbconfig/20251121-011646-ladsgroup.json
- 01:01 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T410589)', diff saved to https://phabricator.wikimedia.org/P85428 and previous config saved to /var/cache/conftool/dbconfig/20251121-010138-ladsgroup.json
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:19 bd808@deploy2002: Finished scap sync-world: Backport for wikitech: Put indicators in title with vector-2022, wikitech: Enable page protection indicators (T409785) (duration: 09m 50s)
- 00:15 bd808@deploy2002: bd808: Continuing with sync
- 00:13 bd808@deploy2002: bd808: Backport for wikitech: Put indicators in title with vector-2022, wikitech: Enable page protection indicators (T409785) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:09 bd808@deploy2002: Started scap sync-world: Backport for wikitech: Put indicators in title with vector-2022, wikitech: Enable page protection indicators (T409785)
- 00:04 musikanimal@deploy2002: Finished scap sync-world: Backport for ChangesListHooks: show entity titles in recent changes and watchlists (T406957) (duration: 10m 58s)
2025-11-20
- 23:58 musikanimal@deploy2002: musikanimal: Continuing with sync
- 23:57 musikanimal@deploy2002: musikanimal: Backport for ChangesListHooks: show entity titles in recent changes and watchlists (T406957) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:53 musikanimal@deploy2002: Started scap sync-world: Backport for ChangesListHooks: show entity titles in recent changes and watchlists (T406957)
- 23:40 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS bookworm
- 23:39 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host relforge1008.eqiad.wmnet with OS bookworm
- 23:39 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS bookworm
- 23:38 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS bookworm
- 23:32 sbassett@deploy2002: Finished scap sync-world: Backport for ActionApi: Remove the xslt option (T401987 T401995) (duration: 49m 18s)
- 23:25 tappof: /srv/thanos-store cleanup on titan2001 (end)
- 23:20 sbassett@deploy2002: sbassett: Continuing with sync
- 23:19 sbassett@deploy2002: sbassett: Backport for ActionApi: Remove the xslt option (T401987 T401995) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host relforge1010.eqiad.wmnet with OS bookworm
- 23:10 swfrench-wmf: restarted postfix on mx-in1001, mx-in2001 at ~ 23:00 UTC for config change
- 23:09 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS bookworm
- 23:07 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS bookworm
- 23:07 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS bookworm
- 23:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
- 22:59 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
- 22:58 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS bookworm
- 22:57 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS bookworm
- 22:57 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS bookworm
- 22:50 inflatador: bking@apt1002 reprepro --component thirdparty/opensearch2 update bookworm-wikimedia
- 22:49 inflatador: bking@apt1002 reprepro --component thirdparty/opensearch3 update bookworm-wikimedia
- 22:48 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS bookworm
- 22:48 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS bookworm
- 22:47 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS bookworm
- 22:46 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host relforge1008.eqiad.wmnet with OS trixie
- 22:46 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host relforge1009.eqiad.wmnet with OS trixie
- 22:46 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host relforge1010.eqiad.wmnet with OS trixie
- 22:45 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T390860
- 22:43 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
- 22:43 sbassett@deploy2002: Started scap sync-world: Backport for ActionApi: Remove the xslt option (T401987 T401995)
- 22:43 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
- 22:42 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
- 22:39 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge[1008-1010].eqiad.wmnet with reason: T410681
- 22:24 logmsgbot: mstyles Deployed security patch for T407157
- 22:00 reedy@deploy2002: Finished scap sync-world: Backport for Fix wgMediaViewerThumbnailBucketSizes on prod (T372165), AccountRecovery: Allow temp users to access Special:AccountRecovery (duration: 12m 06s)
- 21:54 reedy@deploy2002: bvibber, reedy: Continuing with sync
- 21:54 reedy@deploy2002: bvibber, reedy: Backport for Fix wgMediaViewerThumbnailBucketSizes on prod (T372165), AccountRecovery: Allow temp users to access Special:AccountRecovery synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:48 reedy@deploy2002: Started scap sync-world: Backport for Fix wgMediaViewerThumbnailBucketSizes on prod (T372165), AccountRecovery: Allow temp users to access Special:AccountRecovery
- 21:42 bvibber@deploy2002: Finished scap sync-world: Backport for Fix wgMediaViewerThumbnailBucketSizes to match wgThumbnailSteps (T372165), Undeploy the WikimediaEditorTasks extension (T376954) (duration: 39m 55s)
- 21:37 tappof: /srv/thanos-store cleanup on titan2001 (start)
- 21:30 bvibber@deploy2002: bvibber, jforrester: Continuing with sync
- 21:29 bvibber@deploy2002: bvibber, jforrester: Backport for Fix wgMediaViewerThumbnailBucketSizes to match wgThumbnailSteps (T372165), Undeploy the WikimediaEditorTasks extension (T376954) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:03 bvibber@deploy2002: Started scap sync-world: Backport for Fix wgMediaViewerThumbnailBucketSizes to match wgThumbnailSteps (T372165), Undeploy the WikimediaEditorTasks extension (T376954)
- 20:48 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1187 (T410589)', diff saved to https://phabricator.wikimedia.org/P85426 and previous config saved to /var/cache/conftool/dbconfig/20251120-204852-ladsgroup.json
- 20:48 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 20:48 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T410589)', diff saved to https://phabricator.wikimedia.org/P85425 and previous config saved to /var/cache/conftool/dbconfig/20251120-204827-ladsgroup.json
- 20:33 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P85424 and previous config saved to /var/cache/conftool/dbconfig/20251120-203320-ladsgroup.json
- 20:24 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy6002.wikimedia.org with OS bookworm
- 20:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P85423 and previous config saved to /var/cache/conftool/dbconfig/20251120-201812-ladsgroup.json
- 20:15 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy7001.wikimedia.org with OS bookworm
- 20:12 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy7002.wikimedia.org with OS bookworm
- 20:11 reedy@deploy2002: Finished scap sync-world: Backport for AccountRecovery: Log more data for account recovery submissions (duration: 09m 42s)
- 20:07 reedy@deploy2002: reedy: Continuing with sync
- 20:07 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy6001.wikimedia.org with OS bookworm
- 20:06 reedy@deploy2002: reedy: Backport for AccountRecovery: Log more data for account recovery submissions synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:04 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy6002.wikimedia.org with reason: host reimage
- 20:03 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T410589)', diff saved to https://phabricator.wikimedia.org/P85422 and previous config saved to /var/cache/conftool/dbconfig/20251120-200304-ladsgroup.json
- 20:01 reedy@deploy2002: Started scap sync-world: Backport for AccountRecovery: Log more data for account recovery submissions
- 19:57 swfrench@deploy2002: Finished scap sync-world: Backport for De-configure cookie-based enrollment in PHP 8.3 (T405955) (duration: 10m 03s)
- 19:56 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy7001.wikimedia.org with reason: host reimage
- 19:55 wfan: donorwiki upgraded from 6388fb1f to 40f6f252
- 19:53 swfrench@deploy2002: swfrench: Continuing with sync
- 19:52 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy7002.wikimedia.org with reason: host reimage
- 19:52 swfrench@deploy2002: swfrench: Backport for De-configure cookie-based enrollment in PHP 8.3 (T405955) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:48 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy6001.wikimedia.org with reason: host reimage
- 19:47 swfrench@deploy2002: Started scap sync-world: Backport for De-configure cookie-based enrollment in PHP 8.3 (T405955)
- 19:44 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy6002.wikimedia.org with reason: host reimage
- 19:44 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy7001.wikimedia.org with reason: host reimage
- 19:43 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy7002.wikimedia.org with reason: host reimage
- 19:43 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy6001.wikimedia.org with reason: host reimage
- 19:37 sukhe: sudo cumin 'A:lvs-eqiad or A:lvs-codfw' 'run-puppet-agent --enable "set druid-coordinator to state lvs_setup"'
- 19:35 sukhe: sukhe@lvs1020:~$ sudo systemctl restart pybal.service
- 19:29 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy5002.wikimedia.org with OS bookworm
- 19:24 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy5001.wikimedia.org with OS bookworm
- 19:16 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy7002.wikimedia.org with OS bookworm
- 19:16 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy7001.wikimedia.org with OS bookworm
- 19:16 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.3 refs T408273
- 19:16 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy6002.wikimedia.org with OS bookworm
- 19:16 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy6001.wikimedia.org with OS bookworm
- 19:08 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4002.wikimedia.org with OS bookworm
- 19:07 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy5002.wikimedia.org with reason: host reimage
- 19:04 brennen: 1.46.0-wmf.3 train status (T408273): no current blockers, rolling to all wikis
- 19:04 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy3001.wikimedia.org with OS bookworm
- 19:03 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy5001.wikimedia.org with reason: host reimage
- 19:02 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy3002.wikimedia.org with OS bookworm
- 19:00 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy5002.wikimedia.org with reason: host reimage
- 19:00 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy5001.wikimedia.org with reason: host reimage
- 18:59 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4001.wikimedia.org with OS bookworm
- 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4002.wikimedia.org with reason: host reimage
- 18:46 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy3001.wikimedia.org with reason: host reimage
- 18:44 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy3002.wikimedia.org with reason: host reimage
- 18:39 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4001.wikimedia.org with reason: host reimage
- 18:34 swfrench@deploy2002: Finished scap sync-world: Normal scap run after switching scap mwscript to PHP 8.3 - T405955 (duration: 05m 34s)
- 18:34 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy3001.wikimedia.org with reason: host reimage
- 18:33 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4002.wikimedia.org with reason: host reimage
- 18:33 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy3002.wikimedia.org with reason: host reimage
- 18:33 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4001.wikimedia.org with reason: host reimage
- 18:28 swfrench@deploy2002: Started scap sync-world: Normal scap run after switching scap mwscript to PHP 8.3 - T405955
- 18:27 swfrench@deploy2002: Stopping before sync operations
- 18:27 sukhe: sukhe@lvs1020:~$ sudo systemctl restart pybal.service: T406222
- 18:26 swfrench@deploy2002: Started scap sync-world: No-deployment scap run after switching scap mwscript to PHP 8.3 - T405955
- 18:24 swfrench@deploy2002: Installation of scap version "4.228.0" completed for 2 hosts
- 18:22 swfrench@deploy2002: Installing scap version "4.228.0" for 2 host(s)
- 18:21 sukhe: sudo cumin 'A:lvs-eqiad or A:lvs-codfw' 'disable-puppet "set druid-coordinator to state lvs_setup"'
- 18:18 urbanecm@deploy2002: Finished scap sync-world: Backport for hotfix: Disable Urdu alias for Special:Homepage (T410666), hotfix: Disable Urdu alias for Special:Homepage (T410666) (duration: 41m 57s)
- 18:14 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy5002.wikimedia.org with OS bookworm
- 18:13 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy5001.wikimedia.org with OS bookworm
- 18:11 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4002.wikimedia.org with OS bookworm
- 18:09 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4001.wikimedia.org with OS bookworm
- 18:09 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy3002.wikimedia.org with OS bookworm
- 18:08 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy3001.wikimedia.org with OS bookworm
- 18:05 urbanecm@deploy2002: urbanecm: Continuing with sync
- 18:04 urbanecm@deploy2002: urbanecm: Backport for hotfix: Disable Urdu alias for Special:Homepage (T410666), hotfix: Disable Urdu alias for Special:Homepage (T410666) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:59 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy2001.wikimedia.org with OS bookworm
- 17:58 robh@cumin2002: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
- 17:58 robh@cumin2002: START - Cookbook sre.mysql.parsercache
- 17:54 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on aux-k8s-worker1006.eqiad.wmnet with reason: C/D Migration
- 17:53 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy2002.wikimedia.org with OS bookworm
- 17:53 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on aux-k8s-worker1007.eqiad.wmnet with reason: C/D Migration
- 17:51 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1004,1019-1020,1037,1067-1071,1096-1097].eqiad.wmnet
- 17:50 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1107-1110,1164-1165,1167-1168].eqiad.wmnet
- 17:50 robh: wikikube migrations in eqiad complete, repooling d8
- 17:50 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1004,1019-1020,1037,1067-1071,1096-1097].eqiad.wmnet
- 17:50 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1107-1110,1164-1165,1167-1168].eqiad.wmnet
- 17:49 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy1002.wikimedia.org with OS bookworm
- 17:45 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1071.eqiad.wmnet with reason: C/D Migration
- 17:45 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1070.eqiad.wmnet with reason: C/D Migration
- 17:44 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1069.eqiad.wmnet with reason: C/D Migration
- 17:44 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1068.eqiad.wmnet with reason: C/D Migration
- 17:42 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1067.eqiad.wmnet with reason: C/D Migration
- 17:42 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1165.eqiad.wmnet with reason: C/D Migration
- 17:41 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy2001.wikimedia.org with reason: host reimage
- 17:39 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1164.eqiad.wmnet with reason: C/D Migration
- 17:38 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1037.eqiad.wmnet with reason: C/D Migration
- 17:37 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy2002.wikimedia.org with reason: host reimage
- 17:36 urbanecm@deploy2002: Started scap sync-world: Backport for hotfix: Disable Urdu alias for Special:Homepage (T410666), hotfix: Disable Urdu alias for Special:Homepage (T410666)
- 17:36 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1020.eqiad.wmnet with reason: C/D Migration
- 17:34 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1097.eqiad.wmnet with reason: C/D Migration
- 17:33 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy2001.wikimedia.org with reason: host reimage
- 17:33 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy1002.wikimedia.org with reason: host reimage
- 17:33 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy2002.wikimedia.org with reason: host reimage
- 17:32 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1096.eqiad.wmnet with reason: C/D Migration
- 17:31 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1019.eqiad.wmnet with reason: C/D Migration
- 17:30 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1004.eqiad.wmnet with reason: C/D Migration
- 17:29 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1110.eqiad.wmnet with reason: C/D Migration
- 17:28 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1109.eqiad.wmnet with reason: C/D Migration
- 17:27 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy1002.wikimedia.org with reason: host reimage
- 17:26 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1108.eqiad.wmnet with reason: C/D Migration
- 17:26 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1107.eqiad.wmnet with reason: C/D Migration
- 17:25 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1107.eqiad.wmnet with reason: C/D Migration
- 17:24 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1168.eqiad.wmnet with reason: C/D Migration
- 17:23 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1167.eqiad.wmnet with reason: C/D Migration
- 17:22 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1004,1019-1020,1037,1067-1071,1096-1097].eqiad.wmnet
- 17:20 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1107-1110,1164-1165,1167-1168].eqiad.wmnet
- 17:16 robh: eqiad wikikube d3 repooled, depooling d8 wikikube hosts
- 17:15 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1107-1110,1164-1165,1167-1168].eqiad.wmnet
- 17:15 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1004,1019-1020,1037,1067-1071,1096-1097].eqiad.wmnet
- 17:14 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy2002.wikimedia.org with OS bookworm
- 17:14 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy2001.wikimedia.org with OS bookworm
- 17:13 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy1002.wikimedia.org with OS bookworm
- 17:12 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1159,1162-1163].eqiad.wmnet
- 17:12 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1159,1162-1163].eqiad.wmnet
- 17:12 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1034.eqiad.wmnet
- 17:12 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1034.eqiad.wmnet
- 17:10 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1034.eqiad.wmnet with reason: C/D Migration
- 17:10 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1163.eqiad.wmnet with reason: C/D Migration
- 17:10 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1162.eqiad.wmnet with reason: C/D Migration
- 17:03 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1159.eqiad.wmnet with reason: C/D Migration
- 17:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1006.eqiad.wmnet
- 16:58 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1159,1162-1163].eqiad.wmnet
- 16:56 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1034.eqiad.wmnet
- 16:56 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1159,1162-1163].eqiad.wmnet
- 16:55 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1034.eqiad.wmnet
- 16:54 robh: draining eqiad d3 wikikube hosts for network migration
- 16:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1006.eqiad.wmnet
- 16:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1005.eqiad.wmnet
- 16:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1005.eqiad.wmnet
- 16:16 papaul: rebooting sretest1005 to check LLDP settings
- 16:12 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 16:05 mforns@deploy2002: Finished deploy [analytics/refinery@4df475f] (thin): Regular analytics weekly train THIN [analytics/refinery@4df475f3] (duration: 01m 16s)
- 16:04 mforns@deploy2002: Started deploy [analytics/refinery@4df475f] (thin): Regular analytics weekly train THIN [analytics/refinery@4df475f3]
- 16:00 dcausse@deploy2002: mwscript-k8s job started: extensions/CirrusSearch/maintenance/UpdateSuggesterIndex.php hewiki --masterTimeout=10m --replicationTimeout=5400 --indexChunkSize=3000 --cluster=eqiad --optimize # T410602 reindexing search suggestions on hewiki
- 15:59 mforns@deploy2002: Finished deploy [analytics/refinery@4df475f]: Regular analytics weekly train [analytics/refinery@4df475f3] (duration: 02m 25s)
- 15:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 15:57 mforns@deploy2002: Started deploy [analytics/refinery@4df475f]: Regular analytics weekly train [analytics/refinery@4df475f3]
- 15:56 mforns@deploy2002: Finished deploy [analytics/refinery@4df475f] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4df475f3] (duration: 01m 01s)
- 15:55 mforns@deploy2002: Started deploy [analytics/refinery@4df475f] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4df475f3]
- 15:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 15:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 15:45 ladsgroup@deploy2002: Finished scap sync-world: Backport for Revert "Revert^2 "rdbms: Dismantle concept of groups"" (duration: 10m 09s)
- 15:40 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1180 (T410589)', diff saved to https://phabricator.wikimedia.org/P85419 and previous config saved to /var/cache/conftool/dbconfig/20251120-154014-ladsgroup.json
- 15:40 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 15:40 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T410589)', diff saved to https://phabricator.wikimedia.org/P85418 and previous config saved to /var/cache/conftool/dbconfig/20251120-154002-ladsgroup.json
- 15:40 ladsgroup@deploy2002: ladsgroup, trainbranchbot: Continuing with sync
- 15:39 ladsgroup@deploy2002: ladsgroup, trainbranchbot: Backport for Revert "Revert^2 "rdbms: Dismantle concept of groups"" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:39 dcausse@deploy2002: mwscript-k8s job started: extensions/CirrusSearch/maintenance/UpdateSuggesterIndex.php frwiki --masterTimeout=10m --replicationTimeout=5400 --indexChunkSize=3000 --cluster=eqiad --optimize # T410602 reindexing search suggestions on frwiki
- 15:35 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 15:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 15:35 ladsgroup@deploy2002: Started scap sync-world: Backport for Revert "Revert^2 "rdbms: Dismantle concept of groups""
- 15:34 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 15:34 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 15:32 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 15:31 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 15:30 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 15:26 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 15:26 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 15:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P85417 and previous config saved to /var/cache/conftool/dbconfig/20251120-152454-ladsgroup.json
- 15:24 ejegg: donorwiki upgraded from 36d362c6 to 6388fb1f
- 15:19 ladsgroup@deploy2002: sync-world failed: <CalledProcessError> Command '['sudo', '-u', 'mwbuilder', '-n', '--', '/usr/bin/scap', 'mwscript', '--no-local-config', '--directory', '/srv/mediawiki-staging', '--user', 'www-data', '--', 'mergeMessageFileList.php', '--wiki=aawiki', '--force-version', '1.46.0-wmf.3', '--list-file', '/srv/mediawiki-staging/wmf-config/extension-list', '--output', '/tmp/tmp.RV8eoygq6j']' returned
- 15:19 ladsgroup@deploy2002: Started scap sync-world: Backport for Revert^2 "rdbms: Dismantle concept of groups"
- 15:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 15:14 dcausse@deploy2002: mwscript-k8s job started: extensions/CirrusSearch/maintenance/UpdateSuggesterIndex.php hewiki --masterTimeout=10m --replicationTimeout=5400 --indexChunkSize=3000 --cluster=eqiad --optimize # T410602 reindexing search suggestions on hewiki
- 15:13 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 15:09 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P85416 and previous config saved to /var/cache/conftool/dbconfig/20251120-150946-ladsgroup.json
- 14:55 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy1001.wikimedia.org with OS bookworm
- 14:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T410589)', diff saved to https://phabricator.wikimedia.org/P85415 and previous config saved to /var/cache/conftool/dbconfig/20251120-145439-ladsgroup.json
- 14:53 dcausse@deploy2002: mwscript-k8s job started: extensions/CirrusSearch/maintenance/UpdateSuggesterIndex.php frwiki --masterTimeout=10m --replicationTimeout=5400 --indexChunkSize=3000 --cluster=eqiad --optimize # T410602 reindexing search suggestions on frwiki
- 14:49 dcausse@deploy2002: mwscript-k8s job started: extensions/CirrusSearch/maintenance/UpdateSuggesterIndex.php enwiki --masterTimeout=10m --replicationTimeout=5400 --indexChunkSize=3000 --cluster=eqiad --optimize # T410602 reindexing search suggestions on enwiki
- 14:49 dcausse@deploy2002: mwscript-k8s job started: extensions/CirrusSearch/maintenance/UpdateSuggesterIndex.php enwiki --masterTimeout=10m --replicationTimeout=5400 --indexChunkSize=3000 --cluster=eqiad --optimize # T410602 reindexing search suggestions on enwiki
- 14:45 ladsgroup@deploy2002: Finished scap sync-world: Backport for Fix filtering of relevant default sort suggestions (T410602), Fix filtering of relevant default sort suggestions (T410602) (duration: 10m 25s)
- 14:41 ladsgroup@deploy2002: ladsgroup, dcausse: Continuing with sync
- 14:40 ladsgroup@deploy2002: ladsgroup, dcausse: Backport for Fix filtering of relevant default sort suggestions (T410602), Fix filtering of relevant default sort suggestions (T410602) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:36 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Robert Timm out of all services on: 2413 hosts
- 14:34 ladsgroup@deploy2002: Started scap sync-world: Backport for Fix filtering of relevant default sort suggestions (T410602), Fix filtering of relevant default sort suggestions (T410602)
- 14:28 ladsgroup@deploy2002: Finished scap sync-world: Backport for Enable $wgCampaignEventsEnableContributionTracking in production (T404904) (duration: 18m 53s)
- 14:24 ladsgroup@deploy2002: daimona, ladsgroup: Continuing with sync
- 14:23 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy1001.wikimedia.org with reason: host reimage
- 14:17 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy1001.wikimedia.org with reason: host reimage
- 14:14 ladsgroup@deploy2002: daimona, ladsgroup: Backport for Enable $wgCampaignEventsEnableContributionTracking in production (T404904) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:09 ladsgroup@deploy2002: Started scap sync-world: Backport for Enable $wgCampaignEventsEnableContributionTracking in production (T404904)
- 14:05 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy1001.wikimedia.org with OS bookworm
- 13:35 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
- 13:34 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
- 13:34 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
- 13:33 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
- 13:33 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 13:33 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 12:32 jayme@cumin2002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "[not really into teleological thinking] - jayme@cumin2002 - T402014"
- 12:31 jayme@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: [not really into teleological thinking] - jayme@cumin2002 - T402014
- 12:31 jayme@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: [not really into teleological thinking] - jayme@cumin2002 - T402014
- 12:31 jayme@cumin2002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "[not really into teleological thinking] - jayme@cumin2002 - T402014"
- 12:22 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: sync
- 12:21 claime: roll-restart of mobileapps codfw - T410296
- 12:21 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: sync
- 12:14 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for tcywikisource: Migrate $wgAccountCreationThrottle to throttle.php (T410507) (duration: 08m 57s)
- 12:13 ryankemper@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T390860
- 12:10 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Continuing with sync
- 12:10 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for tcywikisource: Migrate $wgAccountCreationThrottle to throttle.php (T410507) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:05 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for tcywikisource: Migrate $wgAccountCreationThrottle to throttle.php (T410507)
- 11:12 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 11:09 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:50 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:50 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:47 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:47 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1173 (T410589)', diff saved to https://phabricator.wikimedia.org/P85412 and previous config saved to /var/cache/conftool/dbconfig/20251120-104142-ladsgroup.json
- 10:41 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
- 10:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T410589)', diff saved to https://phabricator.wikimedia.org/P85411 and previous config saved to /var/cache/conftool/dbconfig/20251120-104129-ladsgroup.json
- 10:26 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P85410 and previous config saved to /var/cache/conftool/dbconfig/20251120-102622-ladsgroup.json
- 10:11 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P85409 and previous config saved to /var/cache/conftool/dbconfig/20251120-101114-ladsgroup.json
- 09:56 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T410589)', diff saved to https://phabricator.wikimedia.org/P85408 and previous config saved to /var/cache/conftool/dbconfig/20251120-095606-ladsgroup.json
- 08:47 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Log the risk score for null edits differently (T410550) (duration: 10m 51s)
- 08:45 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@f3216ec] (releasing): testing deploy to failover host (duration: 00m 30s)
- 08:45 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@f3216ec] (releasing): testing deploy to failover host
- 08:43 kharlan@deploy2002: kharlan: Continuing with sync
- 08:40 kharlan@deploy2002: kharlan: Backport for hCaptcha: Log the risk score for null edits differently (T410550) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:36 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Log the risk score for null edits differently (T410550)
- 08:17 dcausse@deploy2002: Finished scap sync-world: Backport for Revert "cirrus: start A/B test on completion with default_sort" (T404858) (duration: 12m 54s)
- 08:13 moritzm: installing squid security updates
- 08:12 dcausse@deploy2002: dcausse: Continuing with sync
- 08:09 dcausse@deploy2002: dcausse: Backport for Revert "cirrus: start A/B test on completion with default_sort" (T404858) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:04 dcausse@deploy2002: Started scap sync-world: Backport for Revert "cirrus: start A/B test on completion with default_sort" (T404858)
- 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms2 T410480', diff saved to https://phabricator.wikimedia.org/P85407 and previous config saved to /var/cache/conftool/dbconfig/20251120-072110-marostegui.json
- 06:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1168 (T410589)', diff saved to https://phabricator.wikimedia.org/P85406 and previous config saved to /var/cache/conftool/dbconfig/20251120-060041-ladsgroup.json
- 06:00 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 06:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T410589)', diff saved to https://phabricator.wikimedia.org/P85405 and previous config saved to /var/cache/conftool/dbconfig/20251120-060017-ladsgroup.json
- 05:45 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P85404 and previous config saved to /var/cache/conftool/dbconfig/20251120-054509-ladsgroup.json
- 05:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P85403 and previous config saved to /var/cache/conftool/dbconfig/20251120-053002-ladsgroup.json
- 05:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T410589)', diff saved to https://phabricator.wikimedia.org/P85402 and previous config saved to /var/cache/conftool/dbconfig/20251120-051454-ladsgroup.json
- 01:45 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1185* gradually with 4 steps - Work done
- 01:45 wfan: payments-wiki upgraded from d72930e6 to 36d362c6
- 01:40 ladsgroup@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
- 01:39 ladsgroup@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
- 01:34 wfan: civicrm upgraded from f471a3ec to e4748b9f
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 32s)
- 01:03 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1165 (T410589)', diff saved to https://phabricator.wikimedia.org/P85397 and previous config saved to /var/cache/conftool/dbconfig/20251120-010322-ladsgroup.json
- 01:03 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:59 ladsgroup@cumin1003: START - Cookbook sre.mysql.pool db1185* gradually with 4 steps - Work done
- 00:59 ladsgroup@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 3 days, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
- 00:41 brett@dns1006: END - running authdns-update
- 00:40 brett@dns1006: START - running authdns-update
2025-11-19
- 22:49 ryankemper@cumin2002: END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99)
- 22:48 aaron@deploy2002: Finished scap sync-world: Backport for Sandbox cleanup for the Wikimedia REST APIs (T409776 T402426) (duration: 13m 43s)
- 22:44 aaron@deploy2002: aaron: Continuing with sync
- 22:43 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T410563, transfer main graph to lagged host) xfer wikidata_main from wdqs1015.eqiad.wmnet -> wdqs1011.eqiad.wmnet, repooling both afterwards
- 22:39 aaron@deploy2002: aaron: Backport for Sandbox cleanup for the Wikimedia REST APIs (T409776 T402426) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:34 aaron@deploy2002: Started scap sync-world: Backport for Sandbox cleanup for the Wikimedia REST APIs (T409776 T402426)
- 22:28 ryankemper@cumin2002: START - Cookbook sre.wdqs.reboot
- 22:22 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T390860
- 22:16 kemayo@deploy2002: Finished scap sync-world: Backport for TextMatchEditCheck: undo duplicate sub-type logging (T407286), Remove action_context from page_load events in ReadingList A/B test (T410535), Remove action_context from page_load events in ReadingList A/B test (T410535), README: remove outdated advice about dblists (duration: 10m 57s)
- 22:16 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T390860
- 22:16 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T390860
- 22:12 kemayo@deploy2002: aude, kemayo, novemlinguae: Continuing with sync
- 22:10 kemayo@deploy2002: aude, kemayo, novemlinguae: Backport for TextMatchEditCheck: undo duplicate sub-type logging (T407286), Remove action_context from page_load events in ReadingList A/B test (T410535), Remove action_context from page_load events in ReadingList A/B test (T410535), README: remove outdated advice about dblists synced to the tests
- 22:09 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 22:08 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 22:05 kemayo@deploy2002: Started scap sync-world: Backport for TextMatchEditCheck: undo duplicate sub-type logging (T407286), Remove action_context from page_load events in ReadingList A/B test (T410535), Remove action_context from page_load events in ReadingList A/B test (T410535), README: remove outdated advice about dblists
- 22:03 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 22:03 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 22:02 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 22:02 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 22:00 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 22:00 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 21:59 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 21:59 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 21:59 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:59 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:53 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Enable A/B edit test on zhwiki and jawiki (T410354) (duration: 10m 55s)
- 21:49 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T410563, transfer main graph to lagged host) xfer wikidata_main from wdqs1015.eqiad.wmnet -> wdqs1011.eqiad.wmnet, repooling both afterwards
- 21:49 kharlan@deploy2002: kharlan: Continuing with sync
- 21:47 kharlan@deploy2002: kharlan: Backport for hCaptcha: Enable A/B edit test on zhwiki and jawiki (T410354) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:42 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Enable A/B edit test on zhwiki and jawiki (T410354)
- 21:24 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Record A/B test experiment group (T410354), hCaptcha: Record A/B test experiment group (T410354) (duration: 11m 16s)
- 21:20 kharlan@deploy2002: kharlan: Continuing with sync
- 21:17 kharlan@deploy2002: kharlan: Backport for hCaptcha: Record A/B test experiment group (T410354), hCaptcha: Record A/B test experiment group (T410354) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:13 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Record A/B test experiment group (T410354), hCaptcha: Record A/B test experiment group (T410354)
- 20:33 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1070.eqiad.wmnet with OS trixie
- 20:07 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1070.eqiad.wmnet with reason: host reimage
- 20:07 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 20:07 ammarpad@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=loginwiki --logwiki=metawiki Manueldinardo08 'Renamed user 7fd4cfd08628d295620b39574c59750f' # T410545
- 20:01 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1070.eqiad.wmnet with reason: host reimage
- 19:52 denisse@deploy2002: Finished deploy [librenms/librenms@d152b36]: Upgrade LibreNMS to 25.11.0 - T410519 (duration: 00m 16s)
- 19:52 denisse@deploy2002: Started deploy [librenms/librenms@d152b36]: Upgrade LibreNMS to 25.11.0 - T410519
- 19:50 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 19:50 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 19:50 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 19:49 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 19:46 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1070.eqiad.wmnet with OS trixie
- 19:27 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol2010-dev.codfw.wmnet with reason: host reimage
- 19:24 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.3 refs T408273
- 19:23 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol2010-dev.codfw.wmnet with reason: host reimage
- 19:21 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1270-1275].eqiad.wmnet
- 19:21 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1270-1275].eqiad.wmnet
- 19:21 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1140-1141,1160-1161].eqiad.wmnet
- 19:21 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1140-1141,1160-1161].eqiad.wmnet
- 19:18 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1161.eqiad.wmnet with reason: C/D Migration
- 19:16 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1160.eqiad.wmnet with reason: C/D Migration
- 19:13 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1275.eqiad.wmnet with reason: C/D Migration
- 19:13 brennen: 1.46.0-wmf.3 train status (T408273): no current blockers, logs clean, rolling to group1
- 19:10 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1274.eqiad.wmnet with reason: C/D Migration
- 19:07 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1273.eqiad.wmnet with reason: C/D Migration
- 19:04 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1272.eqiad.wmnet with reason: C/D Migration
- 19:03 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 19:03 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1271.eqiad.wmnet with reason: C/D Migration
- 19:00 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1270.eqiad.wmnet with reason: C/D Migration
- 18:58 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1141.eqiad.wmnet with reason: C/D Migration
- 18:57 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 18:52 brett@dns1006: END - running authdns-update
- 18:51 brett@dns1006: START - running authdns-update
- 18:50 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1270-1275].eqiad.wmnet
- 18:49 brett: import purged 0.24+deb13u1 into trixie-wikimedia - T401832
- 18:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs7003*} and A:liberica
- 18:46 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1270-1275].eqiad.wmnet
- 18:45 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1140.eqiad.wmnet with reason: C/D Migration
- 18:43 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1140-1141,1160-1161].eqiad.wmnet
- 18:40 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1140-1141,1160-1161].eqiad.wmnet
- 18:27 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol2010-dev.codfw.wmnet with reason: host reimage
- 18:23 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol2010-dev.codfw.wmnet with reason: host reimage
- 18:23 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1036,1051-1052,1054-1055,1083].eqiad.wmnet
- 18:23 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1260-1269].eqiad.wmnet
- 18:23 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1036,1051-1052,1054-1055,1083].eqiad.wmnet
- 18:23 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1260-1269].eqiad.wmnet
- 18:21 brett: import prometheus-rdkafka-exporter 0.4~deb13u1 into trixie-wikimedia - T401832
- 18:21 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1036.eqiad.wmnet with reason: C/D Migration
- 18:14 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1269.eqiad.wmnet with reason: C/D Migration
- 18:12 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1268.eqiad.wmnet with reason: C/D Migration
- 18:12 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs7003*} and A:liberica
- 18:10 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1083.eqiad.wmnet with reason: C/D Migration
- 18:09 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases1003.eqiad.wmnet with OS bookworm
- 18:09 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1055.eqiad.wmnet with reason: C/D Migration
- 18:07 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1054.eqiad.wmnet with reason: C/D Migration
- 18:05 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1053.eqiad.wmnet with reason: C/D Migration
- 18:04 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1052.eqiad.wmnet with reason: C/D Migration
- 18:03 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 18:01 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1051.eqiad.wmnet with reason: C/D Migration
- 18:00 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1267.eqiad.wmnet with reason: C/D Migration
- 17:58 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1266.eqiad.wmnet with reason: C/D Migration
- 17:56 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1265.eqiad.wmnet with reason: C/D Migration
- 17:55 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1264.eqiad.wmnet with reason: C/D Migration
- 17:53 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1263.eqiad.wmnet with reason: C/D Migration
- 17:50 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1262.eqiad.wmnet with reason: C/D Migration
- 17:48 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1261.eqiad.wmnet with reason: C/D Migration
- 17:43 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1260.eqiad.wmnet with reason: C/D Migration
- 17:43 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
- 17:42 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1036,1051-1052,1054-1055,1083].eqiad.wmnet
- 17:39 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1036,1051-1052,1054-1055,1083].eqiad.wmnet
- 17:38 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1260-1269].eqiad.wmnet
- 17:38 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
- 17:33 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1260-1269].eqiad.wmnet
- 17:32 robh: wikikube c6 hosts depooling for migration
- 17:22 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on moss-be1002.eqiad.wmnet with reason: C/D Migration
- 17:21 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases1003.eqiad.wmnet with OS bookworm
- 17:16 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet with OS bookworm
- 17:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 T405942', diff saved to https://phabricator.wikimedia.org/P85395 and previous config saved to /var/cache/conftool/dbconfig/20251119-171622-marostegui.json
- 17:14 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: sync
- 17:14 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on dbstore1007.eqiad.wmnet with reason: C/D Migration
- 17:13 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: sync
- 17:12 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 17:12 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 17:10 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
- 17:10 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
- 17:10 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 17:09 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 17:08 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 17:08 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Testing all optimize (T410401)', diff saved to https://phabricator.wikimedia.org/P85394 and previous config saved to /var/cache/conftool/dbconfig/20251119-170814-ladsgroup.json
- 17:05 filippo@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 17:03 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on pc1014.eqiad.wmnet with reason: C/D Migration
- 17:03 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 39 hosts
- 17:03 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 39 hosts
- 17:01 kamila@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool site drmrs [reason: no reason specified, ]
- 17:00 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol2010-dev.codfw.wmnet with reason: host reimage
- 17:00 kamila@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool site drmrs [reason: no reason specified, ]
- 17:00 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 17:00 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr[1-2]-drmrs IPv6,cr[1-2]-drmrs.mgmt
- 17:00 pt1979@cumin2002: START - Cookbook sre.hosts.remove-downtime for cr[1-2]-drmrs IPv6,cr[1-2]-drmrs.mgmt
- 16:59 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 16:58 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 16:58 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 16:57 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1008-dev.eqiad.wmnet with reason: host reimage
- 16:56 filippo@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol2010-dev.codfw.wmnet with reason: host reimage
- 16:52 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1008-dev.eqiad.wmnet with reason: host reimage
- 16:39 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 39 hosts with reason: site depool
- 16:36 filippo@cumin1003: START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 16:35 filippo@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 16:34 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS bookworm
- 16:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
- 16:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
- 16:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 16:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 16:27 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 16:27 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 16:27 bking@deploy2002: helmfile [default] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 16:27 bking@deploy2002: helmfile [default] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 16:06 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: router upgrade
- 16:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
- 16:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
- 16:03 moritzm: installing libvirt bugfix updates on trixie hosts
- 15:59 moritzm: installing brltty bugfix updates on trixie hosts
- 15:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 15:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 15:52 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-b[12-13]-drmrs,cr[1-2]-drmrs,mr1-drmrs with reason: router upgrade
- 15:48 ladsgroup@deploy2002: Finished scap sync-world: Backport for Revert "rdbms: Dismantle concept of groups" (duration: 09m 14s)
- 15:44 ladsgroup@deploy2002: trainbranchbot, ladsgroup: Continuing with sync
- 15:43 ladsgroup@deploy2002: trainbranchbot, ladsgroup: Backport for Revert "rdbms: Dismantle concept of groups" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:42 daphnesmit@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:41 daphnesmit@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:41 daphnesmit@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:40 daphnesmit@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:39 daphnesmit@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:39 ladsgroup@deploy2002: Started scap sync-world: Backport for Revert "rdbms: Dismantle concept of groups"
- 15:39 daphnesmit@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:34 daphnesmit@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:33 daphnesmit@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:33 daphnesmit@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:33 moritzm: installing console-setup bugfix updates on trixie hosts
- 15:33 daphnesmit@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:32 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool site drmrs [reason: no reason specified, T390813]
- 15:32 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool site drmrs [reason: no reason specified, T390813]
- 15:31 daphnesmit@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:31 daphnesmit@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:31 moritzm: installing wtmpdb bugfix updates on trixie hosts
- 15:30 papaul: rebooting sretest2004 to check LLDP settings
- 15:22 ladsgroup@deploy2002: sync-world failed: <CalledProcessError> Command '['sudo', '-u', 'mwbuilder', '-n', '--', '/usr/bin/scap', 'mwscript', '--no-local-config', '--directory', '/srv/mediawiki-staging', '--user', 'www-data', '--', 'mergeMessageFileList.php', '--wiki=aawiki', '--force-version', '1.46.0-wmf.3', '--list-file', '/srv/mediawiki-staging/wmf-config/extension-list', '--output', '/tmp/tmp.IxZM23pYxK']' returned
- 15:22 ladsgroup@deploy2002: Started scap sync-world: Backport for rdbms: Dismantle concept of groups (T405087)
- 15:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
- 15:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
- 15:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
- 15:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
- 15:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 15:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 15:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
- 15:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
- 15:15 tgr@deploy2002: Finished scap sync-world: Backport for Use prefixed 'sub' field in OAuth 2 access tokens (T399199) (duration: 16m 43s)
- 15:13 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:13 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: push changes - cmooney@cumin1003"
- 15:12 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: push changes - cmooney@cumin1003"
- 15:11 tgr@deploy2002: tgr: Continuing with sync
- 15:08 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 15:03 tgr@deploy2002: tgr: Backport for Use prefixed 'sub' field in OAuth 2 access tokens (T399199) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:58 tgr@deploy2002: Started scap sync-world: Backport for Use prefixed 'sub' field in OAuth 2 access tokens (T399199)
- 14:55 Lucas_WMDE: (T410507 maintenance script failed, --ip is required and we don’t have it. oh well)
- 14:55 lucaswerkmeister-wmde@deploy2002: mwscript-k8s job started: resetAuthenticationThrottle tcywikisource --signup # T410507
- 14:52 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for tcywikisource: Temporary increase of AccountCreationThrottle (T410507) (duration: 09m 32s)
- 14:48 lucaswerkmeister-wmde@deploy2002: anzx, lucaswerkmeister-wmde: Continuing with sync
- 14:48 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe-eqiad
- 14:47 lucaswerkmeister-wmde@deploy2002: anzx, lucaswerkmeister-wmde: Backport for tcywikisource: Temporary increase of AccountCreationThrottle (T410507) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:47 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 14:47 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 14:43 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for tcywikisource: Temporary increase of AccountCreationThrottle (T410507)
- 14:40 sgimeno@deploy2002: Finished scap sync-world: Backport for fix(ReviseToneExperimentInteractionLogger): prevent breaking homepage for unsampled users (T405177), fix(MigrateMentorStatusAway): ensure migration respects date format (T409170) (duration: 09m 09s)
- 14:35 sgimeno@deploy2002: sgimeno: Continuing with sync
- 14:35 sgimeno@deploy2002: sgimeno: Backport for fix(ReviseToneExperimentInteractionLogger): prevent breaking homepage for unsampled users (T405177), fix(MigrateMentorStatusAway): ensure migration respects date format (T409170) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:30 sgimeno@deploy2002: Started scap sync-world: Backport for fix(ReviseToneExperimentInteractionLogger): prevent breaking homepage for unsampled users (T405177), fix(MigrateMentorStatusAway): ensure migration respects date format (T409170)
- 14:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1187 gradually with 4 steps - repool after schema change test
- 14:16 esanders@deploy2002: Finished scap sync-world: Backport for Freeze LiquidThreads on ptwikibooks (T402532) (duration: 12m 13s)
- 14:13 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists tlhwiki; drop database if exists tlhwiktionary; drop database if exists ukwikimedia; drop database if exists zerowiki; drop database if exists zh_cnwiki; drop database if exists zh_twwiki; (T297297)
- 14:12 esanders@deploy2002: esanders: Continuing with sync
- 14:08 esanders@deploy2002: esanders: Backport for Freeze LiquidThreads on ptwikibooks (T402532) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:04 esanders@deploy2002: Started scap sync-world: Backport for Freeze LiquidThreads on ptwikibooks (T402532)
- 13:59 moritzm: installing monitoring-plugins bugfix updates on trixie hosts
- 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy7002.wikimedia.org
- 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy7002.wikimedia.org with OS trixie
- 13:50 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists noboardwiki; drop database if exists ru_sibwiki; drop database if exists sep11wiki; drop database if exists strategyappswiki; (T297297)
- 13:43 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1187 gradually with 4 steps - repool after schema change test
- 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy7002.wikimedia.org with reason: host reimage
- 13:37 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 13:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1187 (T299441)', diff saved to https://phabricator.wikimedia.org/P85388 and previous config saved to /var/cache/conftool/dbconfig/20251119-133358-marostegui.json
- 13:33 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy7002.wikimedia.org with reason: host reimage
- 13:33 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 13:33 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 13:33 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe-eqiad
- 13:14 moritzm: installing systemd bugfix updates on trixie hosts
- 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy7002.wikimedia.org with OS trixie
- 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy7002.wikimedia.org - jmm@cumin2002"
- 12:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy7002.wikimedia.org - jmm@cumin2002"
- 12:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy7002.wikimedia.org on all recursors
- 12:58 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy7002.wikimedia.org on all recursors
- 12:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7002.wikimedia.org - jmm@cumin2002"
- 12:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7002.wikimedia.org - jmm@cumin2002"
- 12:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 12:54 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy7002.wikimedia.org
- 12:53 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy7002.wikimedia.org
- 12:53 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy7002.wikimedia.org on all recursors
- 12:53 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy7002.wikimedia.org on all recursors
- 12:53 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy7002.wikimedia.org - jmm@cumin2002"
- 12:53 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy7002.wikimedia.org - jmm@cumin2002"
- 12:49 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy7002.wikimedia.org on all recursors
- 12:49 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy7002.wikimedia.org on all recursors
- 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7002.wikimedia.org - jmm@cumin2002"
- 12:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7002.wikimedia.org - jmm@cumin2002"
- 12:45 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy7002.wikimedia.org
- 12:43 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy7002.wikimedia.org
- 12:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy7002.wikimedia.org on all recursors
- 12:43 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy7002.wikimedia.org on all recursors
- 12:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy7002.wikimedia.org - jmm@cumin2002"
- 12:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy7002.wikimedia.org - jmm@cumin2002"
- 12:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy7002.wikimedia.org on all recursors
- 12:39 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy7002.wikimedia.org on all recursors
- 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7002.wikimedia.org - jmm@cumin2002"
- 12:37 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7002.wikimedia.org - jmm@cumin2002"
- 12:35 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1169.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 12:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 12:33 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy7002.wikimedia.org
- 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy6002.wikimedia.org
- 12:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy6002.wikimedia.org
- 12:25 cmooney@cumin1003: START - Cookbook sre.hosts.provision for host db1169.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy7001.wikimedia.org
- 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy7001.wikimedia.org with OS trixie
- 12:22 filippo@cumin1003: START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy6001.wikimedia.org
- 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy6001.wikimedia.org
- 12:10 kevinbazira@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
- 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy5002.wikimedia.org
- 12:07 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
- 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy7001.wikimedia.org with reason: host reimage
- 12:07 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
- 12:07 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
- 12:06 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
- 12:05 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 12:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy5002.wikimedia.org
- 12:05 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 12:04 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
- 12:03 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy7001.wikimedia.org with reason: host reimage
- 11:57 hnowlan: routing /api/rest_v1/page/lint/ via the rest-gateway for group1
- 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy5001.wikimedia.org
- 11:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy5001.wikimedia.org
- 11:32 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: sync
- 11:30 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: sync
- 11:30 claime: Roll restarting mobileapps in codfw - unavailable replicas - T410296
- 11:30 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy7001.wikimedia.org with OS trixie
- 11:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:24 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
- 11:19 vgutierrez: restarting purged on cp1115
- 11:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:10 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:10 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:03 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
- 11:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 10:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy4001.wikimedia.org
- 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy4001.wikimedia.org
- 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy7001.wikimedia.org - jmm@cumin2002"
- 10:53 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy7001.wikimedia.org - jmm@cumin2002"
- 10:52 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy7001.wikimedia.org on all recursors
- 10:52 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy7001.wikimedia.org on all recursors
- 10:52 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7001.wikimedia.org - jmm@cumin2002"
- 10:52 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7001.wikimedia.org - jmm@cumin2002"
- 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy3001.wikimedia.org
- 10:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy3001.wikimedia.org
- 10:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 10:32 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy7001.wikimedia.org
- 10:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 10:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 10:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on clouddb[1024-1025].eqiad.wmnet with reason: cloning
- 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy7001.wikimedia.org
- 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy7001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:16 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy7001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 10:04 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy7001.wikimedia.org
- 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy2002.wikimedia.org
- 09:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy2002.wikimedia.org
- 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy2001.wikimedia.org
- 09:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy2001.wikimedia.org
- 09:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=93) for new host hcaptcha-proxy7001.wikimedia.org
- 09:44 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 09:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy7001.wikimedia.org on all recursors
- 09:36 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy7001.wikimedia.org on all recursors
- 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7001.wikimedia.org - jmm@cumin2002"
- 09:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy1002.wikimedia.org
- 09:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy1002.wikimedia.org
- 09:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7001.wikimedia.org - jmm@cumin2002"
- 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy1001.wikimedia.org
- 09:20 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:20 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy7001.wikimedia.org
- 09:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy1001.wikimedia.org
- 09:04 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Validate sitekey of /siteverify API call (T410024) (duration: 10m 32s)
- 09:00 kharlan@deploy2002: kharlan: Continuing with sync
- 08:58 kharlan@deploy2002: kharlan: Backport for hCaptcha: Validate sitekey of /siteverify API call (T410024) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:58 filippo@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 08:54 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Validate sitekey of /siteverify API call (T410024)
- 08:35 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup[1006-1007].eqiad.wmnet,ms-backup[1001-1002].eqiad.wmnet
- 08:35 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup[1006-1007].eqiad.wmnet,ms-backup[1001-1002].eqiad.wmnet
- 08:17 dcausse@deploy2002: Finished scap sync-world: Backport for cirrus: index field to sort on title (T40403) (duration: 13m 42s)
- 08:13 filippo@cumin1003: START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 08:12 dcausse@deploy2002: dcausse: Continuing with sync
- 08:12 filippo@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 08:09 dcausse@deploy2002: dcausse: Backport for cirrus: index field to sort on title (T40403) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:09 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1169.eqiad.wmnet']
- 08:08 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1169.eqiad.wmnet']
- 08:04 dcausse@deploy2002: Started scap sync-world: Backport for cirrus: index field to sort on title (T40403)
- 07:59 moritzm: started OSM import on maps-test2001 T409528
- 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1189 gradually with 4 steps - Repooling after switchover
- 07:06 marostegui@cumin1003: dbctl commit (dc=all): 'Depool pc4', diff saved to https://phabricator.wikimedia.org/P85380 and previous config saved to /var/cache/conftool/dbconfig/20251119-070656-marostegui.json
- 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet,pc2014.codfw.wmnet,pc1014.eqiad.wmnet with reason: network maintenance
- 06:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1189 gradually with 4 steps - Repooling after switchover
- 06:52 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) db1189 gradually with 4 steps - Repooling after switchover
- 06:48 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1189 gradually with 4 steps - Repooling after switchover
- 06:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1189 T410283', diff saved to https://phabricator.wikimedia.org/P85378 and previous config saved to /var/cache/conftool/dbconfig/20251119-064838-marostegui.json
- 06:47 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1223 to s3 primary T410283', diff saved to https://phabricator.wikimedia.org/P85377 and previous config saved to /var/cache/conftool/dbconfig/20251119-064755-marostegui.json
- 06:47 marostegui: Starting s3 eqiad failover from db1189 to db1223 - T410283
- 06:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 T410283
- 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1223 with weight 0 T410283', diff saved to https://phabricator.wikimedia.org/P85376 and previous config saved to /var/cache/conftool/dbconfig/20251119-064055-marostegui.json
- 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc1 after network maint', diff saved to https://phabricator.wikimedia.org/P85375 and previous config saved to /var/cache/conftool/dbconfig/20251119-063522-marostegui.json
- 06:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: db2144 went down
- 06:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depool ms2', diff saved to https://phabricator.wikimedia.org/P85374 and previous config saved to /var/cache/conftool/dbconfig/20251119-062728-marostegui.json
- 06:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms3 T405942', diff saved to https://phabricator.wikimedia.org/P85373 and previous config saved to /var/cache/conftool/dbconfig/20251119-062634-marostegui.json
- 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms3 T405942', diff saved to https://phabricator.wikimedia.org/P85372 and previous config saved to /var/cache/conftool/dbconfig/20251119-062509-marostegui.json
- 03:09 eileen: civicrm upgraded from bc100d63 to f471a3ec
- 02:59 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1074.eqiad.wmnet with OS trixie
- 02:46 eileen: config revision changed from c3e95b76 to 8b1a290c
- 01:53 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1074.eqiad.wmnet with reason: host reimage
- 01:50 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1074.eqiad.wmnet with reason: host reimage
- 01:35 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1074.eqiad.wmnet with OS trixie
- 01:23 andrew@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1074.eqiad.wmnet']
- 01:23 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1074.eqiad.wmnet']
- 01:18 andrew@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1074.eqiad.wmnet']
- 01:18 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1074.eqiad.wmnet']
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 18s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:48 andrew@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['cloudvirt1074.eqiad.wmnet']
- 00:48 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1074.eqiad.wmnet']
2025-11-18
- 23:52 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1074.eqiad.wmnet with OS trixie
- 23:42 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1076.eqiad.wmnet with OS trixie
- 22:57 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1075.eqiad.wmnet with OS trixie
- 22:56 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
- 22:55 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
- 22:29 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1076.eqiad.wmnet with reason: host reimage
- 22:22 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1076.eqiad.wmnet with reason: host reimage
- 22:21 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 22:21 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 22:09 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 22:08 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1075.eqiad.wmnet with reason: host reimage
- 22:08 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 22:07 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1074.eqiad.wmnet with OS trixie
- 22:07 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1074.eqiad.wmnet with OS trixie
- 22:07 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1076.eqiad.wmnet with OS trixie
- 22:05 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1075.eqiad.wmnet with reason: host reimage
- 21:58 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1073.eqiad.wmnet with OS trixie
- 21:50 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1075.eqiad.wmnet with OS trixie
- 21:44 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1074.eqiad.wmnet with OS trixie
- 21:43 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1072.eqiad.wmnet with OS trixie
- 21:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1073.eqiad.wmnet with reason: host reimage
- 21:30 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1073.eqiad.wmnet with reason: host reimage
- 21:27 brett: import trafficserver 9.2.11-1wm1 into trixie-wikimedia - T401832
- 21:24 dzahn@dns1004: END - running authdns-update
- 21:23 mutante: switching backend of releases.wikimedia.org to codfw
- 21:23 dzahn@dns1004: START - running authdns-update
- 21:19 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1072.eqiad.wmnet with reason: host reimage
- 21:15 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1072.eqiad.wmnet with reason: host reimage
- 21:15 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1073.eqiad.wmnet with OS trixie
- 21:13 bvibber@deploy2002: Finished scap sync-world: Backport for MediaViewer buckets reduction to all groups (T372165) (duration: 09m 17s)
- 21:09 bvibber@deploy2002: bvibber: Continuing with sync
- 21:08 bvibber@deploy2002: bvibber: Backport for MediaViewer buckets reduction to all groups (T372165) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:06 brett: import pcre3 8.45-1~deb13+wmf1 into trixie-wikimedia - T401832
- 21:04 bvibber@deploy2002: Started scap sync-world: Backport for MediaViewer buckets reduction to all groups (T372165)
- 21:00 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1069.eqiad.wmnet with OS trixie
- 20:59 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1072.eqiad.wmnet with OS trixie
- 20:57 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1071.eqiad.wmnet with OS trixie
- 20:16 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1069.eqiad.wmnet with reason: host reimage
- 20:14 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1071.eqiad.wmnet with reason: host reimage
- 20:07 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1071.eqiad.wmnet with reason: host reimage
- 20:06 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1069.eqiad.wmnet with reason: host reimage
- 19:52 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1071.eqiad.wmnet with OS trixie
- 19:51 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1069.eqiad.wmnet with OS trixie
- 19:51 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 19:51 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 19:51 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1067.eqiad.wmnet with OS trixie
- 19:49 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1068.eqiad.wmnet with OS trixie
- 19:48 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 19:47 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 19:46 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1016.eqiad.wmnet
- 19:46 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1016.eqiad.wmnet
- 19:44 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1016.eqiad.wmnet
- 19:44 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1016.eqiad.wmnet
- 19:42 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic1-codfw (T352245)
- 19:42 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic1-codfw (T352245)
- 19:37 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic2-codfw (T352245)
- 19:37 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic2-codfw (T352245)
- 19:30 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-codfw (T352245)
- 19:29 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-codfw (T352245)
- 19:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1067.eqiad.wmnet with reason: host reimage
- 19:25 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1306.eqiad.wmnet
- 19:25 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1306.eqiad.wmnet
- 19:25 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1254-1256].eqiad.wmnet
- 19:25 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1254-1256].eqiad.wmnet
- 19:24 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudvirt1068.eqiad.wmnet with reason: host reimage
- 19:24 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1067.eqiad.wmnet with reason: host reimage
- 19:24 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-codfw (T352245)
- 19:23 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1306.eqiad.wmnet with reason: C/D Migration
- 19:23 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1068.eqiad.wmnet with reason: host reimage
- 19:23 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-codfw (T352245)
- 19:22 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1156.eqiad.wmnet with reason: C/D Migration
- 19:18 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1155.eqiad.wmnet with reason: C/D Migration
- 19:17 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudidp2001-dev.wikimedia.org with OS trixie
- 19:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1154.eqiad.wmnet with reason: C/D Migration
- 19:16 swfrench-wmf: disable puppet on A:lvs-codfw for pybal config change - T352245
- 19:15 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1139.eqiad.wmnet with reason: C/D Migration
- 19:14 brennen@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.3 refs T408273
- 19:14 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1138.eqiad.wmnet with reason: C/D Migration
- 19:12 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1137.eqiad.wmnet with reason: C/D Migration
- 19:11 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1136.eqiad.wmnet with reason: C/D Migration
- 19:09 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1135.eqiad.wmnet with reason: C/D Migration
- 19:09 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1254-1256].eqiad.wmnet
- 19:08 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1068.eqiad.wmnet with OS trixie
- 19:08 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1067.eqiad.wmnet with OS trixie
- 19:07 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1254-1256].eqiad.wmnet
- 19:07 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1306.eqiad.wmnet
- 19:06 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1306.eqiad.wmnet
- 19:06 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1306.eqiad.wmnet
- 19:06 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1306.eqiad.wmnet
- 19:05 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1254-1256].eqiad.wmnet
- 19:05 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1254-1256].eqiad.wmnet
- 19:05 mstyles@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 19:04 mstyles@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 19:04 mstyles@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 19:04 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1157.eqiad.wmnet
- 19:04 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1157.eqiad.wmnet
- 19:04 mstyles@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 19:03 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1157.eqiad.wmnet with reason: C/D Migration
- 19:03 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1157.eqiad.wmnet
- 19:03 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1157.eqiad.wmnet
- 19:03 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1313.eqiad.wmnet
- 19:03 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1313.eqiad.wmnet
- 19:02 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudidp2001-dev.wikimedia.org with reason: host reimage
- 19:02 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1313.eqiad.wmnet with reason: C/D Migration
- 19:02 brennen: 1.46.0-wmf.3 train status (T408273): no current blockers, rolling to group0
- 19:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1065.eqiad.wmnet with OS trixie
- 19:01 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1305.eqiad.wmnet
- 19:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1066.eqiad.wmnet with OS trixie
- 19:01 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1305.eqiad.wmnet
- 18:57 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudidp2001-dev.wikimedia.org with reason: host reimage
- 18:56 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1305.eqiad.wmnet with reason: C/D Migration
- 18:55 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1157.eqiad.wmnet
- 18:55 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1157.eqiad.wmnet
- 18:55 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1313.eqiad.wmnet
- 18:54 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1313.eqiad.wmnet
- 18:54 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1305.eqiad.wmnet
- 18:53 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1305.eqiad.wmnet
- 18:53 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1063.eqiad.wmnet
- 18:53 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1063.eqiad.wmnet
- 18:51 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1063.eqiad.wmnet with reason: C/D Migration
- 18:50 robh@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1063.eqiad.wmnet
- 18:49 robh@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1063.eqiad.wmnet
- 18:47 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1062.eqiad.wmnet with reason: C/D Migration
- 18:42 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on lists1004.wikimedia.org with reason: C/D Migration
- 18:39 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudidp2001-dev.wikimedia.org with OS trixie
- 18:38 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gitlab-runner1004.eqiad.wmnet with reason: C/D Migration
- 18:37 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1066.eqiad.wmnet with reason: host reimage
- 18:34 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on es1033.eqiad.wmnet with reason: C/D Migration
- 18:33 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1065.eqiad.wmnet with reason: host reimage
- 18:30 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1066.eqiad.wmnet with reason: host reimage
- 18:29 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1065.eqiad.wmnet with reason: host reimage
- 18:29 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1044.eqiad.wmnet with OS trixie
- 18:14 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1066.eqiad.wmnet with OS trixie
- 18:14 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1065.eqiad.wmnet with OS trixie
- 18:06 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1016.eqiad.wmnet
- 18:06 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1016.eqiad.wmnet
- 18:05 claime: sudo cumin 'A:wikikube-worker' "enable-puppet 'deploying network topology detection change - ${USER}'"
- 18:05 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1016.eqiad.wmnet
- 18:05 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1016.eqiad.wmnet
- 18:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1044.eqiad.wmnet with reason: host reimage
- 18:02 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on es1033.eqiad.wmnet with reason: C/D Migration
- 18:00 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1044.eqiad.wmnet with reason: host reimage
- 17:58 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1184.eqiad.wmnet with reason: C/D Migration
- 17:56 andrew@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=94) for new host cloudidp2001-dev.wikimedia.org
- 17:56 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on backup1007.eqiad.wmnet with reason: C/D Migration
- 17:56 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM cloudidp2001-dev.wikimedia.org - andrew@cumin2002"
- 17:56 claime: sudo cumin 'A:wikikube-worker' "disable-puppet 'deploying network topology detection change - ${USER}'"
- 17:56 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM cloudidp2001-dev.wikimedia.org - andrew@cumin2002"
- 17:56 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1064.eqiad.wmnet with OS trixie
- 17:55 andrew@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudidp2001-dev.wikimedia.org on all recursors
- 17:55 andrew@cumin2002: START - Cookbook sre.dns.wipe-cache cloudidp2001-dev.wikimedia.org on all recursors
- 17:55 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:53 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on backup1006.eqiad.wmnet with reason: C/D Migration
- 17:52 andrew@cumin2002: START - Cookbook sre.dns.netbox
- 17:52 andrew@cumin2002: START - Cookbook sre.ganeti.makevm for new host cloudidp2001-dev.wikimedia.org
- 17:52 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-backup1002.eqiad.wmnet with reason: C/D Migration
- 17:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1222.eqiad.wmnet with reason: C/D Migration
- 17:46 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1221.eqiad.wmnet with reason: C/D Migration
- 17:44 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1153.eqiad.wmnet with reason: C/D Migration
- 17:43 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1044.eqiad.wmnet with OS trixie
- 17:42 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases1003.eqiad.wmnet with reason: failover
- 17:42 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on pc1013.eqiad.wmnet with reason: C/D Migration
- 17:42 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases2003.codfw.wmnet with reason: failover
- 17:40 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1189.eqiad.wmnet with reason: C/D Migration
- 17:37 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1181.eqiad.wmnet with reason: C/D Migration
- 17:36 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1167.eqiad.wmnet with reason: C/D Migration
- 17:33 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1065.eqiad.wmnet']
- 17:29 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1064.eqiad.wmnet with reason: host reimage
- 17:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns1006.wikimedia.org
- 17:27 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1063.eqiad.wmnet with OS trixie
- 17:26 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1016.eqiad.wmnet
- 17:25 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1064.eqiad.wmnet with reason: host reimage
- 17:25 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1016.eqiad.wmnet
- 17:25 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wikikube-worker1016.eqiad.wmnet with reason: C/D Migration
- 17:10 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on dns1006.wikimedia.org with reason: C/D Migration
- 17:10 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1064.eqiad.wmnet with OS trixie
- 17:08 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: failover
- 17:08 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=dns1006.wikimedia.org [reason: T405623 eqiad row C/D host migration]
- 17:08 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: failover
- 17:05 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1062.eqiad.wmnet with OS trixie
- 17:04 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1063.eqiad.wmnet with reason: host reimage
- 16:59 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1063.eqiad.wmnet with reason: host reimage
- 16:56 hnowlan: silenced wikifeeds codfw swagger alert for 24h T410296
- 16:55 swfrench-wmf: deleted EtcdReplicationDown silence db7447af-851f-4faa-a4fd-b535ee9fbcdb - T352245
- 16:49 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudidp2001-dev.wikimedia.org with OS bookworm
- 16:46 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM cloudidp2001-dev.wikimedia.org - andrew@cumin2002"
- 16:46 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM cloudidp2001-dev.wikimedia.org - andrew@cumin2002"
- 16:45 andrew@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudidp2001-dev.wikimedia.org on all recursors
- 16:45 andrew@cumin2002: START - Cookbook sre.dns.wipe-cache cloudidp2001-dev.wikimedia.org on all recursors
- 16:45 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:45 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM cloudidp2001-dev.wikimedia.org - andrew@cumin2002"
- 16:45 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM cloudidp2001-dev.wikimedia.org - andrew@cumin2002"
- 16:44 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1063.eqiad.wmnet with OS trixie
- 16:43 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1062.eqiad.wmnet with reason: host reimage
- 16:41 andrew@cumin2002: START - Cookbook sre.dns.netbox
- 16:41 andrew@cumin2002: START - Cookbook sre.ganeti.makevm for new host cloudidp2001-dev.wikimedia.org
- 16:38 swfrench-wmf: transferred etcd-mirror replication back to conf2005 - T352245
- 16:37 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1062.eqiad.wmnet with reason: host reimage
- 16:28 swfrench-wmf: restarted navtiming on webperf2003 - T352245
- 16:25 swfrench-wmf: begin rolling restarts of codfw-associated confds - T352245
- 16:23 swfrench@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Hold deployments during etcd certificate change - T352245 (duration: 04m 32s)
- 16:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul2002.codfw.wmnet
- 16:22 swfrench-wmf: migrating etcd to PKI certs on conf2005 - T352245
- 16:22 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1062.eqiad.wmnet with OS trixie
- 16:21 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1060.eqiad.wmnet with OS trixie
- 16:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host zuul2002.codfw.wmnet
- 16:18 swfrench@deploy2002: Locking from deployment [ALL REPOSITORIES]: Hold deployments during etcd certificate change - T352245
- 16:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on releases2003.codfw.wmnet with reason: releases
- 16:10 swfrench-wmf: migrating etcd to PKI certs on conf2004 - T352245
- 16:08 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on releases1003.eqiad.wmnet with reason: releases
- 16:04 brennen@deploy2002: Finished deploy [phabricator/deployment@8b1bc09]: deploy phab1004 for T409947 (duration: 00m 56s)
- 16:03 brennen@deploy2002: Started deploy [phabricator/deployment@8b1bc09]: deploy phab1004 for T409947
- 16:03 brennen@deploy2002: Finished deploy [phabricator/deployment@8b1bc09]: deploy phab2002 for T409947 (duration: 00m 31s)
- 16:02 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists fixcopyrightwiki; drop database if exists langcomwiki; drop database if exists mowiki; drop database if exists mowiktionary; (T297297)
- 16:02 brennen@deploy2002: Started deploy [phabricator/deployment@8b1bc09]: deploy phab2002 for T409947
- 16:02 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab1004.eqiad.wmnet with reason: deployment
- 16:01 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab2002.codfw.wmnet with reason: deployment
- 15:55 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1060.eqiad.wmnet with reason: host reimage
- 15:53 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: sync
- 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul2001.codfw.wmnet
- 15:52 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: sync
- 15:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host zuul2001.codfw.wmnet
- 15:51 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1060.eqiad.wmnet with reason: host reimage
- 15:48 swfrench-wmf: transferred etcd-mirror replication to conf2006 - T352245
- 15:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1088.eqiad.wmnet
- 15:48 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy7002.wikimedia.org
- 15:48 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:48 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy7002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
- 15:47 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy7002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
- 15:44 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 15:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet
- 15:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul1002.eqiad.wmnet
- 15:40 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy7002.wikimedia.org
- 15:39 swfrench-wmf: silenced EtcdReplicationDown db7447af-851f-4faa-a4fd-b535ee9fbcdb - T352245
- 15:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host zuul1002.eqiad.wmnet
- 15:35 swfrench-wmf: disable puppet on A:conf-codfw - T352245
- 15:34 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1060.eqiad.wmnet with OS trixie
- 15:30 sgimeno@deploy2002: Finished scap sync-world: Backport for undeploy Extension:Capiunto (T410172) (duration: 46m 51s)
- 15:29 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1059.eqiad.wmnet with OS trixie
- 15:16 sgimeno@deploy2002: novemlinguae, sgimeno: Continuing with sync
- 15:11 sgimeno@deploy2002: novemlinguae, sgimeno: Backport for undeploy Extension:Capiunto (T410172) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 15:03 btullis@dns1004: END - running authdns-update
- 15:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1059.eqiad.wmnet with reason: host reimage
- 15:02 btullis@dns1004: START - running authdns-update
- 14:57 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1059.eqiad.wmnet with reason: host reimage
- 14:43 sgimeno@deploy2002: Started scap sync-world: Backport for undeploy Extension:Capiunto (T410172)
- 14:41 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1059.eqiad.wmnet with OS trixie
- 14:40 sgimeno@deploy2002: Finished scap sync-world: Backport for EventStreamConfig: add stream for Growth and Editing team edit rates (T405177) (duration: 10m 05s)
- 14:37 sgimeno@deploy2002: sgimeno: Continuing with sync
- 14:36 sgimeno@deploy2002: sgimeno: Backport for EventStreamConfig: add stream for Growth and Editing team edit rates (T405177) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:30 sgimeno@deploy2002: Started scap sync-world: Backport for EventStreamConfig: add stream for Growth and Editing team edit rates (T405177)
- 14:28 esanders@deploy2002: Finished scap sync-world: Backport for [kywiki] Add new rollbacker and eliminator usergroups (T410121), [dewiki] Enable SandboxLink extension (T410199) (duration: 10m 52s)
- 14:23 esanders@deploy2002: superpes, esanders: Continuing with sync
- 14:22 esanders@deploy2002: superpes, esanders: Backport for [kywiki] Add new rollbacker and eliminator usergroups (T410121), [dewiki] Enable SandboxLink extension (T410199) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:17 esanders@deploy2002: Started scap sync-world: Backport for [kywiki] Add new rollbacker and eliminator usergroups (T410121), [dewiki] Enable SandboxLink extension (T410199)
- 14:16 filippo@cumin1003: START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 14:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) es1033 gradually with 4 steps - Repooling after upgrade
- 14:15 esanders@deploy2002: Finished scap sync-world: Backport for Hackaround 2015 broken convert on ptwikibooks (T402549), Hackaround 2015 broken convert on ptwikibooks (T402549) (duration: 10m 09s)
- 14:10 esanders@deploy2002: esanders: Continuing with sync
- 14:10 filippo@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 14:09 esanders@deploy2002: esanders: Backport for Hackaround 2015 broken convert on ptwikibooks (T402549), Hackaround 2015 broken convert on ptwikibooks (T402549) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:04 esanders@deploy2002: Started scap sync-world: Backport for Hackaround 2015 broken convert on ptwikibooks (T402549), Hackaround 2015 broken convert on ptwikibooks (T402549)
- 13:59 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host hcaptcha-proxy7002.wikimedia.org with OS trixie
- 13:50 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul1001.eqiad.wmnet
- 13:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host zuul1001.eqiad.wmnet
- 13:35 kart_: Update Recommendation API to 2025-11-17-092813-production (T406854)
- 13:34 kartik@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool es1033 gradually with 4 steps - Repooling after upgrade
- 13:31 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) es1033 gradually with 4 steps - Repooling after upgrade
- 13:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool es1033 gradually with 4 steps - Repooling after upgrade
- 13:29 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:28 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for es1033.eqiad.wmnet
- 13:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) es1033 - Upgrading es1033.eqiad.wmnet
- 13:16 marostegui@cumin1003: START - Cookbook sre.mysql.depool es1033 - Upgrading es1033.eqiad.wmnet
- 13:16 marostegui@cumin1003: START - Cookbook sre.mysql.upgrade for es1033.eqiad.wmnet
- 13:14 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy7002.wikimedia.org with OS trixie
- 13:02 marostegui: Reboot es1033 (Debian trixie) for kernel upgrade
- 13:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1033 kernel reboot', diff saved to https://phabricator.wikimedia.org/P85363 and previous config saved to /var/cache/conftool/dbconfig/20251118-130200-marostegui.json
- 12:53 cgoubert@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on 74 hosts with reason: up for decom
- 12:49 cgoubert@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on 74 hosts with reason: up for decom
- 12:43 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:19 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
- 12:19 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: sync
- 12:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people1005.eqiad.wmnet
- 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people1005.eqiad.wmnet
- 12:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
- 12:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
- 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet
- 11:59 moritzm: installing rabbitmq-server security updates
- 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet
- 11:53 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: sync
- 11:53 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: sync
- 11:34 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1169.eqiad.wmnet with OS trixie
- 11:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet
- 11:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS trixie
- 11:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1169.eqiad.wmnet']
- 11:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 11:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 11:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet
- 11:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
- 11:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
- 11:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
- 11:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
- 11:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 11:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 11:18 marostegui@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1169.eqiad.wmnet']
- 11:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1169.eqiad.wmnet']
- 11:14 moritzm: installing qemu security updates
- 11:09 marostegui@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1169.eqiad.wmnet']
- 11:09 marostegui@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db1169.eqiad.wmnet']
- 11:09 marostegui@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1169.eqiad.wmnet']
- 11:09 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
- 11:09 marostegui@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db1169.eqiad.wmnet']
- 11:09 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
- 11:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
- 11:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
- 11:01 marostegui@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1169.eqiad.wmnet']
- 10:53 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1169.eqiad.wmnet with OS trixie
- 10:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS trixie
- 10:40 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1169.eqiad.wmnet with OS trixie
- 10:30 filippo@cumin1003: START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS trixie
- 10:27 moritzm: installing libxml2 security updates
- 10:25 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on backup[1006-1007].eqiad.wmnet,ms-backup[1001-1002].eqiad.wmnet with reason: Network maintenance
- 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS trixie
- 10:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1169.eqiad.wmnet with OS trixie
- 10:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be1067.eqiad.wmnet
- 10:15 mvernon@cumin2002: START - Cookbook sre.hosts.remove-downtime for ms-be1067.eqiad.wmnet
- 10:13 hashar@deploy2002: Finished deploy [integration/docroot@a7f5910]: build: Updating npm dependencies (linting only) (duration: 00m 11s)
- 10:13 hashar@deploy2002: Started deploy [integration/docroot@a7f5910]: build: Updating npm dependencies (linting only)
- 10:13 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS trixie
- 10:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Reimage to trixie
- 10:05 jmm@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "fixup sretest2010 - jmm@cumin1003"
- 10:05 jmm@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "fixup sretest2010 - jmm@cumin1003"
- 09:48 Emperor: VACUUM large container dbs on ms-be1067 T377827
- 09:48 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1067.eqiad.wmnet with reason: vacuum overlarge container dbs
- 09:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be1070.eqiad.wmnet
- 09:48 mvernon@cumin2002: START - Cookbook sre.hosts.remove-downtime for ms-be1070.eqiad.wmnet
- 09:42 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1169 T410369', diff saved to https://phabricator.wikimedia.org/P85361 and previous config saved to /var/cache/conftool/dbconfig/20251118-094246-marostegui.json
- 09:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1001.eqiad.wmnet
- 09:38 moritzm: installing curl security updates
- 09:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1001.eqiad.wmnet
- 09:33 jmm@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "fixup sretest1006 - jmm@cumin1003"
- 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1002.eqiad.wmnet
- 09:30 jmm@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "fixup sretest1006 - jmm@cumin1003"
- 09:28 kharlan@deploy2002: Finished scap sync-world: Backport for [arwikimedia] Change the logo/icon and update the wordmark (T353218) (duration: 11m 02s)
- 09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1002.eqiad.wmnet
- 09:24 kharlan@deploy2002: kharlan, superpes: Continuing with sync
- 09:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2094.codfw.wmnet with OS bullseye
- 09:22 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1070.eqiad.wmnet with reason: vacuum overlarge container dbs
- 09:21 kharlan@deploy2002: kharlan, superpes: Backport for [arwikimedia] Change the logo/icon and update the wordmark (T353218) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:17 kharlan@deploy2002: Started scap sync-world: Backport for [arwikimedia] Change the logo/icon and update the wordmark (T353218)
- 09:14 kharlan@deploy2002: Finished scap sync-world: Backport for [arwikimedia] Disable local file uploading (T353218) (duration: 10m 22s)
- 09:10 Emperor: VACUUM large container dbs on ms-be1070 T377827
- 09:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 397715
- 09:10 kharlan@deploy2002: superpes, kharlan: Continuing with sync
- 09:09 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 397715
- 09:09 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 138915
- 09:08 kharlan@deploy2002: superpes, kharlan: Backport for [arwikimedia] Disable local file uploading (T353218) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:05 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 138915
- 09:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2094.codfw.wmnet with reason: host reimage
- 09:03 kharlan@deploy2002: Started scap sync-world: Backport for [arwikimedia] Disable local file uploading (T353218)
- 09:02 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Update passive mode config for addurl trigger (T409957) (duration: 12m 47s)
- 09:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 08:59 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2094.codfw.wmnet with reason: host reimage
- 08:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 08:57 kharlan@deploy2002: kharlan: Continuing with sync
- 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "fix up sretest1005 - jmm@cumin2002"
- 08:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "fix up sretest1005 - jmm@cumin2002"
- 08:53 kharlan@deploy2002: kharlan: Backport for hCaptcha: Update passive mode config for addurl trigger (T409957) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:49 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Update passive mode config for addurl trigger (T409957)
- 08:46 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Enable hCaptcha editing for fawiki, trwiki (T405586) (duration: 14m 26s)
- 08:40 kharlan@deploy2002: kharlan: Continuing with sync
- 08:37 kharlan@deploy2002: kharlan: Backport for hCaptcha: Enable hCaptcha editing for fawiki, trwiki (T405586) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:37 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2094.codfw.wmnet with OS bullseye
- 08:32 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Enable hCaptcha editing for fawiki, trwiki (T405586)
- 08:29 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 08:23 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:15 moritzm: installing openssl bugfix updates on trixie hosts
- 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2001.codfw.wmnet
- 07:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2001.codfw.wmnet
- 07:55 jmm@cumin1003: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: dbprov1003.eqiad.wmnet
- 07:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2002.codfw.wmnet
- 07:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2002.codfw.wmnet
- 07:50 moritzm: installing libssh security updates
- 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3001.esams.wmnet
- 07:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3001.esams.wmnet
- 07:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1184 gradually with 4 steps - Repooling after switchover
- 06:38 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1058.eqiad.wmnet with OS trixie
- 06:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 20 hosts with reason: Network maint
- 06:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 18 hosts with reason: Network maint
- 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3002.esams.wmnet
- 06:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depool ms3 T405942', diff saved to https://phabricator.wikimedia.org/P85356 and previous config saved to /var/cache/conftool/dbconfig/20251118-063048-marostegui.json
- 06:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depool pc1 T405942', diff saved to https://phabricator.wikimedia.org/P85355 and previous config saved to /var/cache/conftool/dbconfig/20251118-063010-marostegui.json
- 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1184 gradually with 4 steps - Repooling after switchover
- 06:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3002.esams.wmnet
- 06:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1184 T410282', diff saved to https://phabricator.wikimedia.org/P85353 and previous config saved to /var/cache/conftool/dbconfig/20251118-062720-marostegui.json
- 06:26 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1163 to s1 primary T410282', diff saved to https://phabricator.wikimedia.org/P85351 and previous config saved to /var/cache/conftool/dbconfig/20251118-062645-marostegui.json
- 06:26 marostegui: Starting s1 eqiad failover from db1184 to db1163 - T410282
- 06:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 32 hosts with reason: Primary switchover s1 T410282
- 06:22 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1163 with weight 0 T410282', diff saved to https://phabricator.wikimedia.org/P85350 and previous config saved to /var/cache/conftool/dbconfig/20251118-062209-marostegui.json
- 06:20 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
- 06:13 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 05:47 kart_: Update cxserver to 2025-11-18-043632-production (T409688, T408515)
- 05:46 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 05:46 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 05:46 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 05:45 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 05:39 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 05:38 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 05:38 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1058.eqiad.wmnet with reason: host reimage
- 05:31 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1058.eqiad.wmnet with reason: host reimage
- 05:15 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1058.eqiad.wmnet with OS trixie
- 05:14 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1057.eqiad.wmnet with OS trixie
- 05:02 mwpresync@deploy2002: Pruned MediaWiki: 1.45.0-wmf.25 (duration: 02m 38s)
- 04:50 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.3 refs T408273 (duration: 46m 58s)
- 04:48 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1057.eqiad.wmnet with reason: host reimage
- 04:42 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1057.eqiad.wmnet with reason: host reimage
- 04:26 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1057.eqiad.wmnet with OS trixie
- 04:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1055.eqiad.wmnet with OS trixie
- 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.3 refs T408273
- 03:58 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1055.eqiad.wmnet with reason: host reimage
- 03:55 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1055.eqiad.wmnet with reason: host reimage
- 03:54 eileen: civicrm upgraded from f4e4553c to bc100d63
- 03:39 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1055.eqiad.wmnet with OS trixie
- 03:33 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1056.eqiad.wmnet with OS trixie
- 02:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1056.eqiad.wmnet with reason: host reimage
- 02:24 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1056.eqiad.wmnet with reason: host reimage
- 02:08 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1056.eqiad.wmnet with OS trixie
- 02:06 andrew@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1055.eqiad.wmnet']
- 02:06 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1055.eqiad.wmnet']
- 01:02 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 01m 35s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:56 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1053.eqiad.wmnet with OS trixie
2025-11-17
- 23:58 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
- 23:55 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
- 23:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2004.codfw.wmnet with OS trixie
- 23:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 23:50 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 23:38 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1053.eqiad.wmnet with OS trixie
- 23:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2004.codfw.wmnet with reason: host reimage
- 23:27 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2004.codfw.wmnet with reason: host reimage
- 23:15 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS trixie
- 23:07 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2004.codfw.wmnet with OS trixie
- 22:43 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
- 22:38 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
- 22:21 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS trixie
- 22:21 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1051.eqiad.wmnet with OS trixie
- 22:11 papaul: reboot sretest2004 to troubleshoot LLDP issue
- 21:55 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage
- 21:51 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage
- 21:35 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1051.eqiad.wmnet with OS trixie
- 21:27 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1051.eqiad.wmnet with OS trixie
- 21:22 Amir1: DROP table if exists securepoll_u4c2025_edits; on all wikis (T355594)
- 21:20 eileen: civicrm upgraded from 0afa5c8f to f4e4553c
- 21:17 bvibber@deploy2002: Finished scap sync-world: Backport for Reduced MediaViewer bucket sizes list to group1 (T372165), Release CampaignEvents extension to all remaining wikis (T409760) (duration: 12m 43s)
- 21:12 bvibber@deploy2002: bvibber, cmelo: Continuing with sync
- 21:11 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage
- 21:09 bvibber@deploy2002: bvibber, cmelo: Backport for Reduced MediaViewer bucket sizes list to group1 (T372165), Release CampaignEvents extension to all remaining wikis (T409760) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:08 mutante: LDAP - added ankita97531 to group nda - T409894
- 21:08 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage
- 21:04 bvibber@deploy2002: Started scap sync-world: Backport for Reduced MediaViewer bucket sizes list to group1 (T372165), Release CampaignEvents extension to all remaining wikis (T409760)
- 20:51 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1051.eqiad.wmnet with OS trixie
- 20:49 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1049.eqiad.wmnet with OS trixie
- 20:43 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2094.codfw.wmnet with reason: host reimage
- 20:39 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2094.codfw.wmnet with reason: host reimage
- 20:28 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host hcaptcha-proxy7002.wikimedia.org with OS trixie
- 20:20 mutante: codesearch9.codesearch - systemctl restart hound-search (T410310)
- 20:19 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2094.codfw.wmnet with OS bullseye
- 20:19 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1049.eqiad.wmnet with reason: host reimage
- 20:15 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1049.eqiad.wmnet with reason: host reimage
- 19:59 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1049.eqiad.wmnet with OS trixie
- 19:58 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1048.eqiad.wmnet with OS trixie
- 19:46 ejegg: payments-wiki upgraded from 235a82db to d72930e6
- 19:42 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy7002.wikimedia.org with OS trixie
- 19:33 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy7002.wikimedia.org
- 19:33 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host hcaptcha-proxy7002.wikimedia.org with OS trixie
- 19:31 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic1-codfw (T352245)
- 19:30 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic1-codfw (T352245)
- 19:27 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic2-codfw (T352245)
- 19:27 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic2-codfw (T352245)
- 19:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7002.magru.wmnet
- 19:21 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-codfw (T352245)
- 19:20 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy3002.wikimedia.org with OS trixie
- 19:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7002.magru.wmnet
- 19:15 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-codfw (T352245)
- 19:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7001.magru.wmnet
- 19:11 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-codfw (T352245)
- 19:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7001.magru.wmnet
- 19:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6002.drmrs.wmnet
- 19:09 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1048.eqiad.wmnet with reason: host reimage
- 19:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6002.drmrs.wmnet
- 19:05 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-codfw (T352245)
- 19:05 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1048.eqiad.wmnet with reason: host reimage
- 19:03 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy3002.wikimedia.org with reason: host reimage
- 19:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6001.drmrs.wmnet
- 18:58 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy3002.wikimedia.org with reason: host reimage
- 18:57 swfrench-wmf: disable puppet on A:lvs-codfw for pybal config change - T352245
- 18:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6001.drmrs.wmnet
- 18:48 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1048.eqiad.wmnet with OS trixie
- 18:47 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1047.eqiad.wmnet with OS trixie
- 18:42 swfrench@deploy2002: Finished scap sync-world: Backport for Disable enrollment in PHP 8.3 (T405955) (duration: 10m 37s)
- 18:41 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy7002.wikimedia.org with OS trixie
- 18:40 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy7002.wikimedia.org - sukhe@cumin1003"
- 18:40 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy7002.wikimedia.org - sukhe@cumin1003"
- 18:40 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy7002.wikimedia.org on all recursors
- 18:40 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy7002.wikimedia.org on all recursors
- 18:40 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:40 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7002.wikimedia.org - sukhe@cumin1003"
- 18:39 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7002.wikimedia.org - sukhe@cumin1003"
- 18:38 swfrench@deploy2002: swfrench: Continuing with sync
- 18:36 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 18:36 swfrench@deploy2002: swfrench: Backport for Disable enrollment in PHP 8.3 (T405955) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:36 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy7002.wikimedia.org
- 18:34 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy3002.wikimedia.org with OS trixie
- 18:32 swfrench@deploy2002: Started scap sync-world: Backport for Disable enrollment in PHP 8.3 (T405955)
- 18:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
- 18:03 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2094.codfw.wmnet with OS bullseye
- 18:03 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1047.eqiad.wmnet with OS trixie
- 18:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1045.eqiad.wmnet with OS trixie
- 17:52 fceratto@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host db-test1003.eqiad.wmnet
- 17:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db-test1003.eqiad.wmnet with OS trixie
- 17:42 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Disable edit integration on jawiki (T405586) (duration: 10m 58s)
- 17:37 kharlan@deploy2002: kharlan: Continuing with sync
- 17:37 kharlan@deploy2002: kharlan: Backport for hCaptcha: Disable edit integration on jawiki (T405586) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:32 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db-test1003.eqiad.wmnet with reason: host reimage
- 17:32 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Disable edit integration on jawiki (T405586)
- 17:27 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db-test1003.eqiad.wmnet with reason: host reimage
- 17:24 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1045.eqiad.wmnet with reason: host reimage
- 17:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1181 gradually with 4 steps - Repooling after switchover
- 17:21 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy3002.wikimedia.org with OS trixie
- 17:21 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1045.eqiad.wmnet with reason: host reimage
- 17:21 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 17:21 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 17:20 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy3002.wikimedia.org on all recursors
- 17:20 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy3002.wikimedia.org on all recursors
- 17:20 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:20 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 17:20 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 17:16 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 17:16 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy3002.wikimedia.org
- 17:15 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db-test1003.eqiad.wmnet with OS trixie
- 17:15 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM db-test1003.eqiad.wmnet - fceratto@cumin1003"
- 17:15 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM db-test1003.eqiad.wmnet - fceratto@cumin1003"
- 17:15 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy3002.wikimedia.org
- 17:15 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:15 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db-test1003.eqiad.wmnet on all recursors
- 17:14 fceratto@cumin1003: START - Cookbook sre.dns.wipe-cache db-test1003.eqiad.wmnet on all recursors
- 17:14 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:14 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM db-test1003.eqiad.wmnet - fceratto@cumin1003"
- 17:14 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM db-test1003.eqiad.wmnet - fceratto@cumin1003"
- 17:12 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 17:06 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 17:06 fceratto@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 17:05 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1045.eqiad.wmnet with OS trixie
- 17:04 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy3002.wikimedia.org
- 17:04 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 17:04 fceratto@cumin1003: START - Cookbook sre.ganeti.makevm for new host db-test1003.eqiad.wmnet
- 17:03 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db-test1003.eqiad.wmnet
- 17:03 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:03 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db-test1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 17:01 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db-test1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 17:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5001.eqsin.wmnet
- 16:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5001.eqsin.wmnet
- 16:56 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 16:55 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2094.codfw.wmnet with OS bullseye
- 16:51 fceratto@cumin1003: START - Cookbook sre.hosts.decommission for hosts db-test1003.eqiad.wmnet
- 16:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2230.codfw.wmnet onto db-test1002.eqiad.wmnet
- 16:40 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2094.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 16:40 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1002.eqiad.wmnet
- 16:39 fceratto@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host db-test1002.eqiad.wmnet
- 16:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db-test1002.eqiad.wmnet with OS trixie
- 16:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1181 gradually with 4 steps - Repooling after switchover
- 16:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:37 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) db1181 gradually with 4 steps - Repooling after switchover
- 16:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1181 gradually with 4 steps - Repooling after switchover
- 16:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1181 T410281', diff saved to https://phabricator.wikimedia.org/P85344 and previous config saved to /var/cache/conftool/dbconfig/20251117-163620-marostegui.json
- 16:35 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1236 to s7 primary T410281', diff saved to https://phabricator.wikimedia.org/P85343 and previous config saved to /var/cache/conftool/dbconfig/20251117-163535-marostegui.json
- 16:35 marostegui: Starting s7 eqiad failover from db1181 to db1236 - T410281
- 16:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 T410281
- 16:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:31 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1236 with weight 0 T410281', diff saved to https://phabricator.wikimedia.org/P85342 and previous config saved to /var/cache/conftool/dbconfig/20251117-163121-marostegui.json
- 16:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:19 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2094.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 16:19 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db-test1002.eqiad.wmnet with reason: host reimage
- 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4002.ulsfo.wmnet
- 16:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4002.ulsfo.wmnet
- 16:12 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db-test1002.eqiad.wmnet with reason: host reimage
- 16:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1002.eqiad.wmnet
- 16:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:02 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db-test1002.eqiad.wmnet with OS trixie
- 15:59 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM db-test1002.eqiad.wmnet - fceratto@cumin1003"
- 15:59 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM db-test1002.eqiad.wmnet - fceratto@cumin1003"
- 15:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 15:59 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db-test1002.eqiad.wmnet on all recursors
- 15:59 fceratto@cumin1003: START - Cookbook sre.dns.wipe-cache db-test1002.eqiad.wmnet on all recursors
- 15:59 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:59 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM db-test1002.eqiad.wmnet - fceratto@cumin1003"
- 15:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 15:59 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM db-test1002.eqiad.wmnet - fceratto@cumin1003"
- 15:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1002.eqiad.wmnet
- 15:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 15:56 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 15:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4001.ulsfo.wmnet
- 15:54 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 15:54 fceratto@cumin1003: START - Cookbook sre.ganeti.makevm for new host db-test1002.eqiad.wmnet
- 15:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db-test1002.eqiad.wmnet
- 15:54 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:54 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db-test1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 15:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4001.ulsfo.wmnet
- 15:49 sukhe@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy4002.wikimedia.org
- 15:49 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4002.wikimedia.org with OS trixie
- 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2093.codfw.wmnet with OS bullseye
- 15:43 gehel@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 15:43 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db-test1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 15:41 gehel@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 15:35 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy3002.wikimedia.org
- 15:35 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host hcaptcha-proxy3002.wikimedia.org with OS trixie
- 15:34 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
- 15:34 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
- 15:34 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
- 15:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1001.eqiad.wmnet
- 15:33 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
- 15:33 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4002.wikimedia.org with reason: host reimage
- 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1208.eqiad.wmnet
- 15:32 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 15:31 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 15:30 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 15:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2093.codfw.wmnet with reason: host reimage
- 15:28 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4002.wikimedia.org with reason: host reimage
- 15:25 fceratto@cumin1003: START - Cookbook sre.hosts.decommission for hosts db-test1002.eqiad.wmnet
- 15:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2093.codfw.wmnet with reason: host reimage
- 15:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1208.eqiad.wmnet
- 15:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1208.eqiad.wmnet
- 15:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1001.eqiad.wmnet
- 15:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 15:05 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4002.wikimedia.org with OS trixie
- 15:05 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4002.wikimedia.org - sukhe@cumin1003"
- 15:05 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4002.wikimedia.org - sukhe@cumin1003"
- 15:04 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy4002.wikimedia.org on all recursors
- 15:04 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy4002.wikimedia.org on all recursors
- 15:04 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:04 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4002.wikimedia.org - sukhe@cumin1003"
- 15:04 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4002.wikimedia.org - sukhe@cumin1003"
- 15:03 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2093.codfw.wmnet with OS bullseye
- 15:00 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 15:00 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4002.wikimedia.org
- 15:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:53 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Enable hCaptcha editing for idwiki, jawiki, and ptwiki (T405586) (duration: 16m 02s)
- 14:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2092.codfw.wmnet with OS bullseye
- 14:48 gehel@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:48 kharlan@deploy2002: kharlan: Continuing with sync
- 14:44 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy3002.wikimedia.org with OS trixie
- 14:43 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 14:43 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 14:43 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy3002.wikimedia.org on all recursors
- 14:43 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy3002.wikimedia.org on all recursors
- 14:43 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:43 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 14:41 kharlan@deploy2002: kharlan: Backport for hCaptcha: Enable hCaptcha editing for idwiki, jawiki, and ptwiki (T405586) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:41 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 14:39 gehel@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:38 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 14:38 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy3002.wikimedia.org
- 14:37 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy4002.wikimedia.org
- 14:37 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4002.wikimedia.org
- 14:37 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Enable hCaptcha editing for idwiki, jawiki, and ptwiki (T405586)
- 14:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1208.eqiad.wmnet
- 14:35 jforrester@deploy2002: Finished scap sync-world: Backport for Enable embedded Wikifunctions on Wikimania wiki (T401683) (duration: 09m 55s)
- 14:31 jforrester@deploy2002: jforrester: Continuing with sync
- 14:29 jforrester@deploy2002: jforrester: Backport for Enable embedded Wikifunctions on Wikimania wiki (T401683) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2092.codfw.wmnet with reason: host reimage
- 14:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2092.codfw.wmnet with reason: host reimage
- 14:25 Amir1: mwscript-k8s --dblist=large -- purgeUserOptions.php --login-age 15 (T406724)
- 14:25 jforrester@deploy2002: Started scap sync-world: Backport for Enable embedded Wikifunctions on Wikimania wiki (T401683)
- 14:15 esanders@deploy2002: Finished scap sync-world: Backport for Make LQT opt-in on ptwikibooks (T402549) (duration: 10m 53s)
- 14:11 James_F: jforrester@deploy2002:~$ mwscript sql --wiki=wikimaniawiki /srv/mediawiki/php-1.46.0-wmf.2/extensions/WikiLambda/sql/mysql/table-usage.sql # T401683
- 14:10 moritzm: installing glibc security updates
- 14:09 esanders@deploy2002: esanders: Continuing with sync
- 14:08 esanders@deploy2002: esanders: Backport for Make LQT opt-in on ptwikibooks (T402549) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:04 esanders@deploy2002: Started scap sync-world: Backport for Make LQT opt-in on ptwikibooks (T402549)
- 14:03 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2092.codfw.wmnet with OS bullseye
- 13:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2091.codfw.wmnet with OS bullseye
- 13:53 tappof: titan2002: switch /srv mount from /dev/md2 to /dev/vg0/srv T410152
- 13:52 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists dkwiki; drop database if exists dkwikibooks; drop database if exists dkwiktionary; (T297297)
- 13:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2091.codfw.wmnet with reason: host reimage
- 13:33 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2091.codfw.wmnet with reason: host reimage
- 13:26 moritzm: update trixie installer image to 13.2 T410147
- 13:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS bookworm
- 13:11 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2091.codfw.wmnet with OS bullseye
- 13:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 12:57 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 12:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 12:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 12:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2090.codfw.wmnet with OS bullseye
- 12:39 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS bookworm
- 12:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2230.codfw.wmnet onto db-test2002.codfw.wmnet
- 12:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2090.codfw.wmnet with reason: host reimage
- 12:29 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test2002.codfw.wmnet
- 12:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2090.codfw.wmnet with reason: host reimage
- 12:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS bookworm
- 12:14 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 12:13 taavi: update CR firewall policy to add the x4 port to wiki replicas related rules T409560
- 12:02 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2090.codfw.wmnet with OS bullseye
- 11:57 fceratto@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host db-test2002.codfw.wmnet
- 11:57 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db-test2002.codfw.wmnet with OS trixie
- 11:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 11:50 aikochou@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 11:50 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 11:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 11:40 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Bugfix - oblivian@cumin1003"
- 11:40 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Bugfix - oblivian@cumin1003
- 11:39 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Bugfix - oblivian@cumin1003
- 11:39 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Bugfix - oblivian@cumin1003"
- 11:38 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db-test2002.codfw.wmnet with reason: host reimage
- 11:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 11:33 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db-test2002.codfw.wmnet with reason: host reimage
- 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS bookworm
- 11:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 11:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 11:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS bookworm
- 11:20 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db-test2002.codfw.wmnet with OS trixie
- 11:20 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 11:19 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 11:16 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 11:16 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 11:11 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM db-test2002.codfw.wmnet - fceratto@cumin1003"
- 11:11 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM db-test2002.codfw.wmnet - fceratto@cumin1003"
- 11:10 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db-test2002.codfw.wmnet on all recursors
- 11:10 fceratto@cumin1003: START - Cookbook sre.dns.wipe-cache db-test2002.codfw.wmnet on all recursors
- 11:10 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:10 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM db-test2002.codfw.wmnet - fceratto@cumin1003"
- 11:04 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM db-test2002.codfw.wmnet - fceratto@cumin1003"
- 11:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 10:58 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 10:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 10:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 10:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
- 10:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS bookworm
- 10:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
- 10:37 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 10:37 fceratto@cumin1003: START - Cookbook sre.ganeti.makevm for new host db-test2002.codfw.wmnet
- 10:36 fceratto@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts db-test2002.codfw.wmnet
- 10:36 fceratto@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 10:36 fceratto@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db-test2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 10:35 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db-test2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 10:34 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset: apply
- 10:33 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply
- 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
- 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
- 10:23 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 10:20 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS bookworm
- 10:18 fceratto@cumin1003: START - Cookbook sre.hosts.decommission for hosts db-test2002.codfw.wmnet
- 10:09 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha::passCaptcha: Log the action, trigger and SiteKey (duration: 17m 13s)
- 10:04 kharlan@deploy2002: kharlan: Continuing with sync
- 10:04 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS bookworm
- 10:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS trixie
- 09:56 kharlan@deploy2002: kharlan: Backport for hCaptcha::passCaptcha: Log the action, trigger and SiteKey synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:52 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha::passCaptcha: Log the action, trigger and SiteKey
- 09:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 09:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 09:43 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2230.codfw.wmnet onto db-test2001.codfw.wmnet
- 09:40 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test2001.codfw.wmnet
- 09:38 kharlan@deploy2002: Finished scap sync-world: Backport for Remove wgMetricsPlatformEnableExperimentOverrides config variable (T405727) (duration: 11m 29s)
- 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 09:34 kharlan@deploy2002: sfaci, kharlan: Continuing with sync
- 09:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 09:31 kharlan@deploy2002: sfaci, kharlan: Backport for Remove wgMetricsPlatformEnableExperimentOverrides config variable (T405727) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:27 kharlan@deploy2002: Started scap sync-world: Backport for Remove wgMetricsPlatformEnableExperimentOverrides config variable (T405727)
- 09:22 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Conditionally disable the addurl rule (T410123) (duration: 10m 47s)
- 09:18 kharlan@deploy2002: kharlan: Continuing with sync
- 09:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 09:16 kharlan@deploy2002: kharlan: Backport for hCaptcha: Conditionally disable the addurl rule (T410123) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS bookworm
- 09:11 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Conditionally disable the addurl rule (T410123)
- 09:09 kharlan@deploy2002: Finished scap sync-world: Backport for cirrus: enable wrong keyboard DWIM-style on hewiki and ruwiki (T408734) (duration: 13m 56s)
- 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1262 slowly with 10 steps - Repooling after replacing the DIMM
- 09:02 kharlan@deploy2002: dcausse, kharlan: Continuing with sync
- 09:00 kharlan@deploy2002: dcausse, kharlan: Backport for cirrus: enable wrong keyboard DWIM-style on hewiki and ruwiki (T408734) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:55 kharlan@deploy2002: Started scap sync-world: Backport for cirrus: enable wrong keyboard DWIM-style on hewiki and ruwiki (T408734)
- 08:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 08:53 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Record hcaptcha.execute() calls in VisualEditorFeatureUse (T410146), hCaptcha: Track the interfaceName in open-callback events (T410008 T402767) (duration: 42m 00s)
- 08:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 08:44 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.debug (exit_code=0) for Netbox circuit ID 93
- 08:43 ayounsi@cumin1003: START - Cookbook sre.network.debug for Netbox circuit ID 93
- 08:40 kharlan@deploy2002: kharlan: Continuing with sync
- 08:36 kharlan@deploy2002: kharlan: Backport for hCaptcha: Record hcaptcha.execute() calls in VisualEditorFeatureUse (T410146), hCaptcha: Track the interfaceName in open-callback events (T410008 T402767) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:32 tappof: titan1002: switch /srv mount from /dev/md2 to /dev/vg0/srv T410152
- 08:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS bookworm
- 08:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
- 08:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
- 08:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
- 08:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
- 08:11 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Record hcaptcha.execute() calls in VisualEditorFeatureUse (T410146), hCaptcha: Track the interfaceName in open-callback events (T410008 T402767)
- 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS bookworm
- 08:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
- 08:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
- 08:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 08:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 08:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
- 08:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
- 08:01 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Rate-limit for authenticated users - oblivian@cumin1003"
- 08:01 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Rate-limit for authenticated users - oblivian@cumin1003
- 08:00 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Rate-limit for authenticated users - oblivian@cumin1003
- 08:00 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Rate-limit for authenticated users - oblivian@cumin1003"
- 07:57 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 07:57 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 07:56 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 07:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 07:44 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 07:40 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 07:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS bookworm
- 06:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1262 slowly with 10 steps - Repooling after replacing the DIMM
- 06:52 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) db1262 slowly with 10 steps - Repooling after replacing the DIMM
- 06:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1262 slowly with 10 steps - Repooling after replacing the DIMM
- 06:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1024.eqiad.wmnet with reason: Setting up
- 01:15 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 14m 41s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-16
- 19:29 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1043.eqiad.wmnet with OS trixie
- 18:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1043.eqiad.wmnet with reason: host reimage
- 18:25 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1043.eqiad.wmnet with reason: host reimage
- 18:08 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1043.eqiad.wmnet with OS trixie
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 51s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-15
- 02:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1042.eqiad.wmnet with OS trixie
- 02:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1042.eqiad.wmnet with reason: host reimage
- 01:58 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1042.eqiad.wmnet with reason: host reimage
- 01:42 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1042.eqiad.wmnet with OS trixie
- 01:42 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1042.eqiad.wmnet with OS trixie
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 48s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:07 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1042.eqiad.wmnet with OS trixie
- 00:04 brett@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
2025-11-14
- 23:57 brett@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
- 22:53 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1041.eqiad.wmnet with OS trixie
- 21:53 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1041.eqiad.wmnet with reason: host reimage
- 21:50 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1041.eqiad.wmnet with reason: host reimage
- 21:33 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS trixie
- 21:32 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1040.eqiad.wmnet with OS trixie
- 21:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1040.eqiad.wmnet with reason: host reimage
- 20:57 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1040.eqiad.wmnet with reason: host reimage
- 20:41 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1040.eqiad.wmnet with OS trixie
- 20:29 ejegg: payments-wiki upgraded from 8437b605 to 235a82db
- 19:36 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet with OS bookworm
- 19:18 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1008-dev.eqiad.wmnet with reason: host reimage
- 19:12 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1008-dev.eqiad.wmnet with reason: host reimage
- 18:35 herron: titan1001: switch /srv mount from /dev/md2 to /dev/vg0/srv T410152
- 18:20 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS bookworm
- 18:19 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 17:56 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2094.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:16 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2094.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 16:55 fceratto@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host db-test2001.codfw.wmnet
- 16:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db-test2001.codfw.wmnet with OS trixie
- 16:37 ejegg: donorwiki upgraded from b9f94a76 to 8437b605
- 16:36 ejegg: payments-wiki upgraded from b9f94a76 to 8437b605
- 16:34 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db-test2001.codfw.wmnet with reason: host reimage
- 16:28 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db-test2001.codfw.wmnet with reason: host reimage
- 16:11 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db-test2001.codfw.wmnet with OS trixie
- 16:09 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM db-test2001.codfw.wmnet - fceratto@cumin1003"
- 16:09 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM db-test2001.codfw.wmnet - fceratto@cumin1003"
- 16:09 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db-test2001.codfw.wmnet on all recursors
- 16:09 fceratto@cumin1003: START - Cookbook sre.dns.wipe-cache db-test2001.codfw.wmnet on all recursors
- 16:09 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:09 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM db-test2001.codfw.wmnet - fceratto@cumin1003"
- 16:08 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM db-test2001.codfw.wmnet - fceratto@cumin1003"
- 16:05 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 16:05 fceratto@cumin1003: START - Cookbook sre.ganeti.makevm for new host db-test2001.codfw.wmnet
- 15:42 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 15:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 15:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 15:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 15:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 15:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 15:16 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 15:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 15:14 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db-test2001.codfw.wmnet
- 15:14 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:14 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db-test2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 15:13 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db-test2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 15:13 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 15:09 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 15:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 15:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 15:04 fceratto@cumin1003: START - Cookbook sre.hosts.decommission for hosts db-test2001.codfw.wmnet
- 15:01 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 14:56 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 14:56 ejegg: donorwiki upgraded from c2a4b377 to b9f94a76
- 14:55 ejegg: payments-wiki upgraded from b67adca1 to b9f94a76
- 14:50 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
- 14:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 14:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 14:31 aikochou@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
- 14:26 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1029.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:26 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:25 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1030.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:21 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1032.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:19 ejegg: civicrm upgraded from f0a5b175 to 0afa5c8f
- 14:18 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:17 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1030.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:17 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1029.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:14 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1031.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1032.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:04 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1031.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:52 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1032
- 13:51 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1032
- 13:51 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1031
- 13:50 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt wdqs test servers - jclark@cumin1003"
- 13:50 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt wdqs test servers - jclark@cumin1003"
- 13:49 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1031
- 13:49 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1030
- 13:49 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1030
- 13:49 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1029
- 13:48 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1029
- 13:48 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1028
- 13:48 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1028
- 13:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:46 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 13:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1028.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:39 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:39 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt wdqs test servers - jclark@cumin1003"
- 13:39 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt wdqs test servers - jclark@cumin1003"
- 13:35 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 13:02 cmooney@cumin1003: END (ERROR) - Cookbook sre.network.peering (exit_code=97) with action 'configure' for AS: 45102
- 13:02 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 45102
- 12:47 moritzm: rebalance eqiad/D following switch migration T405945
- 11:43 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:40 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.clone (exit_code=97) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:39 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:37 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.clone (exit_code=97) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:36 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:29 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:17 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:16 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:09 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:04 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:04 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:03 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:02 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 10:56 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 10:48 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 10:47 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 10:38 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 10:37 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
- 10:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 10:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 09:20 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet,service=s4
- 08:04 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Update
- 06:58 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Update
- 06:54 moritzm: rebalance eqiad/C following switch migration T405945
- 06:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1024].eqiad.wmnet with reason: Cloning clouddb1024:s4
- 06:25 marostegui@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet,service=s4
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 46s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-13
- 23:31 denisse@deploy2002: Finished deploy [librenms/librenms@5fca3ff]: Upgrade LibreNMS to 25.10.0 - T410039 (duration: 00m 15s)
- 23:30 denisse@deploy2002: Started deploy [librenms/librenms@5fca3ff]: Upgrade LibreNMS to 25.10.0 - T410039
- 23:08 catrope@deploy2002: Finished scap sync-world: Backport for Drop references to removed Advanced mobile contribution configuration (T402470) (duration: 09m 30s)
- 23:04 catrope@deploy2002: catrope, jdlrobson: Continuing with sync
- 23:01 catrope@deploy2002: catrope, jdlrobson: Backport for Drop references to removed Advanced mobile contribution configuration (T402470) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:59 catrope@deploy2002: Started scap sync-world: Backport for Drop references to removed Advanced mobile contribution configuration (T402470)
- 22:28 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on sretest2001.codfw.wmnet with reason: sleep test
- 21:53 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Simplify ConfirmEditTriggersCaptcha logic for API edits (T405586) (duration: 08m 56s)
- 21:49 kharlan@deploy2002: kharlan: Continuing with sync
- 21:46 kharlan@deploy2002: kharlan: Backport for hCaptcha: Simplify ConfirmEditTriggersCaptcha logic for API edits (T405586) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:44 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Simplify ConfirmEditTriggersCaptcha logic for API edits (T405586)
- 21:40 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2001.codfw.wmnet with OS bookworm
- 21:39 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on sretest1003.eqiad.wmnet with reason: sleep test
- 21:34 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Deploy: known-client rate limits and multi-select fixes - swfrench@cumin2002"
- 21:34 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: known-client rate limits and multi-select fixes - swfrench@cumin2002
- 21:33 swfrench@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: known-client rate limits and multi-select fixes - swfrench@cumin2002
- 21:33 swfrench@cumin2002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Deploy: known-client rate limits and multi-select fixes - swfrench@cumin2002"
- 21:23 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2001.codfw.wmnet with reason: host reimage
- 21:22 bvibber@deploy2002: Finished scap sync-world: Backport for StickyHeaders: scroll-margin-top fixes (T409349), Reduce number of bucketsizes for MediaViewer (labs, group0) (T372165), Editcheck: flag suggestions when logging actions (T407170) (duration: 08m 58s)
- 21:18 jhathaway@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2001.codfw.wmnet with reason: host reimage
- 21:16 bvibber@deploy2002: bvibber, kemayo: Continuing with sync
- 21:15 bvibber@deploy2002: bvibber, kemayo: Backport for StickyHeaders: scroll-margin-top fixes (T409349), Reduce number of bucketsizes for MediaViewer (labs, group0) (T372165), Editcheck: flag suggestions when logging actions (T407170) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:13 bvibber@deploy2002: Started scap sync-world: Backport for StickyHeaders: scroll-margin-top fixes (T409349), Reduce number of bucketsizes for MediaViewer (labs, group0) (T372165), Editcheck: flag suggestions when logging actions (T407170)
- 21:07 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
- 20:52 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: apply wmf-opensearch-search-plugins update - bking@cumin2002 - T407520
- 20:40 ejegg: civicrm upgraded from 59d75a7e to 0675f10c
- 20:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1241.eqiad.wmnet onto db1262.eqiad.wmnet
- 20:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1241 gradually with 4 steps - Pool db1241.eqiad.wmnet in after cloning
- 20:22 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on sretest2001.codfw.wmnet with reason: sleep test
- 20:01 eileen: civicrm upgraded from a1faf78d to 59d75a7e
- 19:59 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy3002.wikimedia.org
- 19:59 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:59 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy3002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
- 19:59 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy3002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
- 19:56 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 19:55 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 19:54 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 19:52 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy3002.wikimedia.org
- 19:48 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy3002.wikimedia.org
- 19:48 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host hcaptcha-proxy3002.wikimedia.org with OS trixie
- 19:47 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1241 gradually with 4 steps - Pool db1241.eqiad.wmnet in after cloning
- 19:25 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: apply wmf-opensearch-search-plugins update - bking@cumin2002 - T407520
- 19:20 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply wmf-opensearch-search-plugins update - bking@cumin2002 - T407520
- 18:59 swfrench-wmf: scaled mw-(api-ext|web)/next to "idle" size - T405955
- 18:58 swfrench@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Hold deployments during mw-(api-ext|web) capacity changes - T405955 (duration: 46m 54s)
- 18:58 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:58 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:58 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:57 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:57 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy3002.wikimedia.org with OS trixie
- 18:57 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 18:57 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 18:56 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy3002.wikimedia.org on all recursors
- 18:56 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy3002.wikimedia.org on all recursors
- 18:56 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:56 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 18:56 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:56 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 18:56 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:56 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:55 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:54 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:54 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:53 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:53 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:53 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 18:53 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy3002.wikimedia.org
- 18:52 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:52 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:52 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:51 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:50 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:50 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:50 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:49 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy3002.wikimedia.org
- 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:48 swfrench-wmf: zero external traffic on mw-(api-ext|web) next releases - T405955
- 18:46 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 18:42 sukhe: manually running decomm cookbook on hcaptcha-proxy3002: host makevm failed, trying again T409860
- 18:42 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy3002.wikimedia.org
- 18:32 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy3002.wikimedia.org
- 18:32 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy3002.wikimedia.org on all recursors
- 18:32 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy3002.wikimedia.org on all recursors
- 18:31 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:31 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 18:31 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 18:22 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy3002.wikimedia.org on all recursors
- 18:22 swfrench-wmf: rolling run-puppet-agent on A:cp hosts for ATS Lua config change - T405955
- 18:22 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy3002.wikimedia.org on all recursors
- 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 18:22 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 18:18 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 18:18 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy3002.wikimedia.org
- 18:15 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy7001.wikimedia.org
- 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy7001.wikimedia.org on all recursors
- 18:15 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy7001.wikimedia.org on all recursors
- 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy7001.wikimedia.org - sukhe@cumin1003"
- 18:15 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy7001.wikimedia.org - sukhe@cumin1003"
- 18:13 swfrench-wmf: disable-puppet on A:cp hosts for ATS Lua config change - T405955
- 18:12 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 18:11 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy7001.wikimedia.org on all recursors
- 18:11 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 55 hosts with reason: T407520
- 18:11 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy7001.wikimedia.org on all recursors
- 18:11 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:11 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7001.wikimedia.org - sukhe@cumin1003"
- 18:11 swfrench@deploy2002: Locking from deployment [ALL REPOSITORIES]: Hold deployments during mw-(api-ext|web) capacity changes - T405955
- 18:11 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7001.wikimedia.org - sukhe@cumin1003"
- 18:09 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy3002.wikimedia.org
- 18:09 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host hcaptcha-proxy3002.wikimedia.org with OS trixie
- 18:08 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 18:08 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy7001.wikimedia.org
- 18:07 swfrench-wmf: scaled mw-web/main to normal multi-DC size - T405955
- 18:07 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply wmf-opensearch-search-plugins update - bking@cumin2002 - T407520
- 18:07 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:07 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: removing hcaptcha-proxy7001;failed makevm - sukhe@cumin1003"
- 18:07 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:06 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:06 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: removing hcaptcha-proxy7001;failed makevm - sukhe@cumin1003"
- 18:06 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:05 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:02 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 18:02 swfrench-wmf: stopped diverting PHP_ENGINE-enrolled traffic at rest-gateway - T405955
- 18:02 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 18:02 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy7001.wikimedia.org
- 18:02 sukhe@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 18:02 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 17:58 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 17:58 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy7001.wikimedia.org
- 17:52 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 17:52 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 17:41 swfrench-wmf: scaled mw-api-ext/main to normal multi-DC size - T405955
- 17:40 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 17:39 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 17:39 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 17:25 dreamyjazz@deploy2002: Finished scap sync-world: Backport for MakeGlobalVariablesScriptHookHandler: Fix hCaptcha site key handling (T405595), VisualEditor hCaptcha: Add config to disable onload handling (T409962) (duration: 08m 48s)
- 17:21 dreamyjazz@deploy2002: dreamyjazz, kharlan: Continuing with sync
- 17:18 dreamyjazz@deploy2002: dreamyjazz, kharlan: Backport for MakeGlobalVariablesScriptHookHandler: Fix hCaptcha site key handling (T405595), VisualEditor hCaptcha: Add config to disable onload handling (T409962) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:18 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy3002.wikimedia.org with OS trixie
- 17:18 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 17:18 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 17:18 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host hcaptcha-proxy7001.wikimedia.org
- 17:18 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host hcaptcha-proxy7001.wikimedia.org with OS trixie
- 17:17 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy3002.wikimedia.org on all recursors
- 17:17 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy3002.wikimedia.org on all recursors
- 17:17 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:17 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 17:17 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy3002.wikimedia.org - sukhe@cumin1003"
- 17:16 dreamyjazz@deploy2002: Started scap sync-world: Backport for MakeGlobalVariablesScriptHookHandler: Fix hCaptcha site key handling (T405595), VisualEditor hCaptcha: Add config to disable onload handling (T409962)
- 17:14 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 17:14 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy3002.wikimedia.org
- 17:12 sukhe@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy3001.wikimedia.org
- 17:12 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy3001.wikimedia.org with OS trixie
- 17:11 robh: eqiad c/d migrations complete for today
- 17:07 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on aqs1019.eqiad.wmnet with reason: C/D Migration
- 17:06 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on restbase1041.eqiad.wmnet with reason: C/D Migration
- 17:05 dancy@deploy2002: Installation of scap version "4.227.0" completed for 2 hosts
- 17:04 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on restbase1040.eqiad.wmnet with reason: C/D Migration
- 17:03 dancy@deploy2002: Installing scap version "4.227.0" for 2 host(s)
- 17:02 moritzm: restarting Tomcat on idp1005
- 16:58 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on ganeti1052.eqiad.wmnet with reason: C/D Migration
- 16:58 robh@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
- 16:56 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy3001.wikimedia.org with reason: host reimage
- 16:56 robh@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
- 16:56 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on ganeti1051.eqiad.wmnet with reason: C/D Migration
- 16:55 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on ganeti1046.eqiad.wmnet with reason: C/D Migration
- 16:55 robh@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
- 16:53 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy3001.wikimedia.org with reason: host reimage
- 16:51 robh@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
- 16:51 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on ganeti1038.eqiad.wmnet with reason: C/D Migration
- 16:51 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply wmf-opensearch-search-plugins update, other updates (see also T407110) - bking@cumin2002 - T407520
- 16:51 robh@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
- 16:49 robh@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
- 16:48 dancy@deploy2002: Installation of scap version "4.226.0" completed for 2 hosts
- 16:47 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on ganeti1050.eqiad.wmnet with reason: C/D Migration
- 16:47 robh@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
- 16:47 dancy@deploy2002: Installing scap version "4.226.0" for 2 host(s)
- 16:44 robh@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
- 16:44 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on ganeti1049.eqiad.wmnet with reason: C/D Migration
- 16:43 robh@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
- 16:41 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: T407520
- 16:41 robh@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
- 16:40 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on ganeti1046.eqiad.wmnet with reason: C/D Migration
- 16:40 robh@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
- 16:36 robh@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
- 16:35 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on ganeti1045.eqiad.wmnet with reason: C/D Migration
- 16:34 robh@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
- 16:29 robh@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
- 16:28 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy7001.wikimedia.org with OS trixie
- 16:27 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy7001.wikimedia.org - sukhe@cumin1003"
- 16:27 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy7001.wikimedia.org - sukhe@cumin1003"
- 16:26 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy7001.wikimedia.org on all recursors
- 16:26 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy7001.wikimedia.org on all recursors
- 16:26 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:26 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7001.wikimedia.org - sukhe@cumin1003"
- 16:26 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy7001.wikimedia.org - sukhe@cumin1003"
- 16:26 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on ganeti1037.eqiad.wmnet with reason: C/D Migration
- 16:26 robh@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
- 16:24 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy3001.wikimedia.org with OS trixie
- 16:23 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy3001.wikimedia.org - sukhe@cumin1003"
- 16:23 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy3001.wikimedia.org - sukhe@cumin1003"
- 16:23 robh@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
- 16:23 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 16:23 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy3001.wikimedia.org on all recursors
- 16:23 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy3001.wikimedia.org on all recursors
- 16:23 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:23 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy3001.wikimedia.org - sukhe@cumin1003"
- 16:23 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy3001.wikimedia.org - sukhe@cumin1003"
- 16:22 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ganeti1048.eqiad.wmnet with reason: C/D Migration
- 16:21 robh@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
- 16:20 robh@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
- 16:20 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ganeti1047.eqiad.wmnet with reason: C/D Migration
- 16:19 robh@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
- 16:19 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy7001.wikimedia.org
- 16:19 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 16:18 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy3001.wikimedia.org
- 16:18 moritzm: installing cups security updates
- 16:16 robh: eqiad c/d migration project: ganeti hosts moving today with proper full drains
- 16:15 robh@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
- 16:11 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ganeti1028.eqiad.wmnet with reason: C/D Migration
- 16:10 robh@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
- 16:08 robh@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
- 16:06 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ganeti1034.eqiad.wmnet with reason: C/D Migration
- 16:04 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: apply wmf-opensearch-search-plugins update - bking@cumin2002 - T407520
- 16:04 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: sync
- 16:03 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ganeti1027.eqiad.wmnet with reason: C/D Migration
- 16:03 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: sync
- 16:02 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: sync
- 16:01 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: sync
- 16:01 claime: roll restarting mobileapps in codfw
- 15:59 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ml-serve1003.eqiad.wmnet with reason: C/D Migration
- 15:59 robh: eqiad c/d migrations window start
- 15:58 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply wmf-opensearch-search-plugins update, other updates (see also T407110) - bking@cumin2002 - T407520
- 15:57 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: apply wmf-opensearch-search-plugins update - bking@cumin2002 - T407520
- 15:41 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 15:36 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 15:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 15:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 15:34 inflatador: bking@apt1002 sudo -E reprepro -C component/opensearch13 include bullseye-wikimedia /home/bking/wmf-opensearch-search-plugins-1.3.20+12-bullseye/wmf-opensearch-search-plugins_1.3.20+12_amd64.changes T407520
- 15:24 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on gitlab2002.wikimedia.org with reason: Update
- 15:23 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on gitlab1003.wikimedia.org with reason: Update
- 15:11 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:11 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove IPv6 address from db-test1001.eqiad.wmnet - fceratto@cumin1003"
- 15:08 kamila@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:05 kamila@cumin1003: START - Cookbook sre.dns.netbox
- 15:05 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
- 15:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 15:03 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
- 15:02 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove IPv6 address from db-test1001.eqiad.wmnet - fceratto@cumin1003"
- 15:02 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
- 15:00 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
- 14:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 14:57 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 14:50 kart_: Update Recommendation API to 2025-11-10-154629-production (T403730)
- 14:46 kartik@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:42 Dreamy_Jazz: Ran `foreachwikiindblist checkuser-suggested-investigations.dblist extensions/CheckUser/maintenance/populateSicUrlIdentifier.php` for T409564
- 14:41 Dreamy_Jazz: Ran `foreachwikiindblist checkuser-suggested-investigations.dblist extensions/CheckUser/maintenance/populateSicUrlIdentifier.php`
- 14:35 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:34 Lucas_WMDE: UTC afternoon backport+config window done
- 14:33 mfossati@deploy2002: Finished scap sync-world: Backport for ImageBrowsing: add tier 2 experiment (T409739), xLab: add tier 2 experiment to ImageBrowsing (T409739) (duration: 16m 11s)
- 14:29 mfossati@deploy2002: mfossati: Continuing with sync
- 14:25 kamila@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2203.codfw.wmnet
- 14:25 kamila@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2203.codfw.wmnet
- 14:24 Raine: homer lsw1-c6-codfw* commit 're-adding failed host -- T408004'
- 14:19 mfossati@deploy2002: mfossati: Backport for ImageBrowsing: add tier 2 experiment (T409739), xLab: add tier 2 experiment to ImageBrowsing (T409739) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:17 mfossati@deploy2002: Started scap sync-world: Backport for ImageBrowsing: add tier 2 experiment (T409739), xLab: add tier 2 experiment to ImageBrowsing (T409739)
- 14:16 kamila@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker2203.codfw.wmnet
- 14:15 kamila@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2203.codfw.wmnet
- 14:14 esanders@deploy2002: Finished scap sync-world: Backport for Freeze LiquidThreads on enwikinews (T406717) (duration: 08m 37s)
- 14:12 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy4002.wikimedia.org
- 14:12 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4002.wikimedia.org
- 14:10 esanders@deploy2002: tchanders, esanders: Continuing with sync
- 14:08 esanders@deploy2002: tchanders, esanders: Backport for Freeze LiquidThreads on enwikinews (T406717) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:05 esanders@deploy2002: Started scap sync-world: Backport for Freeze LiquidThreads on enwikinews (T406717)
- 13:55 klausman@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:44 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1241 - Depool db1241.eqiad.wmnet to then clone it to db1262.eqiad.wmnet - marostegui@cumin1003
- 13:44 marostegui@cumin1003: START - Cookbook sre.mysql.depool db1241 - Depool db1241.eqiad.wmnet to then clone it to db1262.eqiad.wmnet - marostegui@cumin1003
- 13:44 marostegui@cumin1003: START - Cookbook sre.mysql.clone of db1241.eqiad.wmnet onto db1262.eqiad.wmnet
- 13:42 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:32 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:31 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
- 13:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:31 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
- 13:30 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 13:30 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
- 13:29 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
- 13:29 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 13:29 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Upgrade
- 13:28 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 13:24 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 13:23 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.clone (exit_code=97) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 13:19 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Upgrade
- 13:19 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 13:06 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 13:04 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Upgrade
- 13:01 moritzm: installing amd64-microcode security updates
- 12:55 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Upgrade
- 12:50 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 12:50 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 12:50 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists bawiktionary; drop database if exists chwikimedia; drop database if exists closed_zh_twwiki; drop database if exists comcomwiki; (T297297)
- 12:48 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 12:48 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 12:33 moritzm: installing bind security updates (client-side tools/libs only)
- 12:25 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 12:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1264 slowly with 10 steps - Pooling for the first time
- 12:24 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 12:24 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 12:23 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 12:23 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 12:22 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 12:21 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 12:18 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 12:18 cgoubert@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 12:17 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 12:17 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 12:17 cgoubert@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 12:17 cgoubert@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 12:15 cgoubert@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 12:07 moritzm: uploaded wmf-laptop 1.0.4 to apt.wikimedia.org
- 11:55 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:47 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:45 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:44 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:32 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:32 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:30 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:30 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:29 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:28 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test1001.eqiad.wmnet
- 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1003.eqiad.wmnet with OS bookworm
- 10:59 moritzm: upgrade Envoy on idm* T405808
- 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1003.eqiad.wmnet with reason: host reimage
- 10:49 moritzm: installing libfcgi security updates
- 10:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db-test1001.eqiad.wmnet with reason: Cloning
- 10:46 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1003.eqiad.wmnet with reason: host reimage
- 10:34 fceratto@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host db-test1001.eqiad.wmnet
- 10:34 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db-test1001.eqiad.wmnet with OS trixie
- 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1003.eqiad.wmnet with OS bookworm
- 10:16 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet,service=x3
- 10:12 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db-test1001.eqiad.wmnet with reason: host reimage
- 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1264 slowly with 10 steps - Pooling for the first time
- 10:08 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) db1264 slowly with 10 steps - Pooling for the first time
- 10:07 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db-test1001.eqiad.wmnet with reason: host reimage
- 10:07 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1264 slowly with 10 steps - Pooling for the first time
- 10:03 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.2 refs T408272
- 09:56 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db-test1001.eqiad.wmnet with OS trixie
- 09:55 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM db-test1001.eqiad.wmnet - fceratto@cumin1003"
- 09:55 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM db-test1001.eqiad.wmnet - fceratto@cumin1003"
- 09:55 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db-test1001.eqiad.wmnet on all recursors
- 09:55 fceratto@cumin1003: START - Cookbook sre.dns.wipe-cache db-test1001.eqiad.wmnet on all recursors
- 09:55 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:54 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM db-test1001.eqiad.wmnet - fceratto@cumin1003"
- 09:54 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM db-test1001.eqiad.wmnet - fceratto@cumin1003"
- 09:50 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 09:50 fceratto@cumin1003: START - Cookbook sre.ganeti.makevm for new host db-test1001.eqiad.wmnet
- 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:46 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:05 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 09:01 fceratto@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host db-test1001.eqiad.wmnet
- 09:01 fceratto@cumin1003: START - Cookbook sre.ganeti.makevm for new host db-test1001.eqiad.wmnet
- 08:59 fceratto@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host db-test1001.eqiad.wmnet
- 08:59 fceratto@cumin1003: START - Cookbook sre.ganeti.makevm for new host db-test1001.eqiad.wmnet
- 08:58 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
- 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
- 08:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
- 08:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
- 08:05 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 08:05 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2165 gradually with 4 steps - Migration of db2165.codfw.wmnet completed
- 07:20 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2165 gradually with 4 steps - Migration of db2165.codfw.wmnet completed
- 07:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1016,1022].eqiad.wmnet with reason: Cloning clouddb1022:s3
- 07:19 marostegui@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=x3
- 07:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1022].eqiad.wmnet with reason: Cloning clouddb1022:s3
- 07:18 marostegui@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=x3
- 07:18 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=x3
- 07:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2165 - Upgrading db2165.codfw.wmnet
- 07:02 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2165 - Upgrading db2165.codfw.wmnet
- 07:02 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db2165 T409818', diff saved to https://phabricator.wikimedia.org/P85294 and previous config saved to /var/cache/conftool/dbconfig/20251113-065957-fceratto.json
- 06:58 fceratto@dns1004: END - running authdns-update
- 06:57 fceratto@dns1004: START - running authdns-update
- 06:53 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2161 to s8 primary and set section read-write T409818', diff saved to https://phabricator.wikimedia.org/P85293 and previous config saved to /var/cache/conftool/dbconfig/20251113-065342-fceratto.json
- 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 codfw as read-only for maintenance - T409818', diff saved to https://phabricator.wikimedia.org/P85292 and previous config saved to /var/cache/conftool/dbconfig/20251113-064929-fceratto.json
- 06:47 federico3: Starting s8 codfw failover from db2165 to db2161 - T409818
- 06:36 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2161 with weight 0 T409818', diff saved to https://phabricator.wikimedia.org/P85291 and previous config saved to /var/cache/conftool/dbconfig/20251113-063651-fceratto.json
- 06:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s8 T409818
- 06:18 kart_: machinetranslation: Increase replicas (T386371)
- 06:17 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
- 06:14 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
- 06:08 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
- 06:06 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
- 03:44 eileen: civicrm upgraded from 654f12c2 to a1faf78d
- 02:19 eileen: civicrm upgraded from 90beebc7 to 654f12c2
- 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 04s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-12
- 23:15 eileen: civicrm upgraded from c98db6d4 to 90beebc7
- 22:25 eevans@deploy2002: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
- 22:25 eevans@deploy2002: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
- 22:24 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
- 22:24 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
- 22:20 eileen: civicrm upgraded from 2265451b to c98db6d4
- 21:51 kemayo@deploy2002: Finished scap sync-world: Backport for EventLogging: Fix wikitext editor interface detection (T409779), EventLogging: Fix wikitext editor interface detection (T409779) (duration: 07m 48s)
- 21:50 eileen: civicrm upgraded from 04649670 to 2265451b
- 21:47 kemayo@deploy2002: kemayo: Continuing with sync
- 21:46 kemayo@deploy2002: kemayo: Backport for EventLogging: Fix wikitext editor interface detection (T409779), EventLogging: Fix wikitext editor interface detection (T409779) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:43 kemayo@deploy2002: Started scap sync-world: Backport for EventLogging: Fix wikitext editor interface detection (T409779), EventLogging: Fix wikitext editor interface detection (T409779)
- 21:10 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 20:36 sukhe@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy5002.wikimedia.org
- 20:36 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy5002.wikimedia.org with OS trixie
- 20:30 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy4002.wikimedia.org
- 20:30 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4002.wikimedia.org
- 20:30 sukhe@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy6002.wikimedia.org
- 20:30 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy6002.wikimedia.org with OS trixie
- 20:28 kharlan@deploy2002: Finished scap sync-world: Backport for EventLogging: Expand on no-js logging (T409779 T263505), EventLogging: Expand on no-js logging (T409779 T263505) (duration: 10m 15s)
- 20:23 kharlan@deploy2002: kharlan: Continuing with sync
- 20:20 kharlan@deploy2002: kharlan: Backport for EventLogging: Expand on no-js logging (T409779 T263505), EventLogging: Expand on no-js logging (T409779 T263505) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:17 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy4002.wikimedia.org
- 20:17 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4002.wikimedia.org
- 20:17 kharlan@deploy2002: Started scap sync-world: Backport for EventLogging: Expand on no-js logging (T409779 T263505), EventLogging: Expand on no-js logging (T409779 T263505)
- 20:16 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy5002.wikimedia.org with reason: host reimage
- 20:12 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy6002.wikimedia.org with reason: host reimage
- 20:10 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy5002.wikimedia.org with reason: host reimage
- 20:08 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy6002.wikimedia.org with reason: host reimage
- 20:01 cscott@deploy2002: Finished scap sync-world: Backport for Bump wikimedia/parsoid to 0.23.0-a4 (T409910 T409607), Bump wikimedia/parsoid to 0.23.0-a4 (T409607) (duration: 11m 35s)
- 19:55 cscott@deploy2002: cscott: Continuing with sync
- 19:53 cscott@deploy2002: cscott: Backport for Bump wikimedia/parsoid to 0.23.0-a4 (T409910 T409607), Bump wikimedia/parsoid to 0.23.0-a4 (T409607) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:49 cscott@deploy2002: Started scap sync-world: Backport for Bump wikimedia/parsoid to 0.23.0-a4 (T409910 T409607), Bump wikimedia/parsoid to 0.23.0-a4 (T409607)
- 19:38 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy6002.wikimedia.org with OS trixie
- 19:38 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy6002.wikimedia.org - sukhe@cumin1003"
- 19:38 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy6002.wikimedia.org - sukhe@cumin1003"
- 19:38 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy6002.wikimedia.org on all recursors
- 19:38 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy6002.wikimedia.org on all recursors
- 19:38 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:38 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy6002.wikimedia.org - sukhe@cumin1003"
- 19:38 eileen: config revision changed from 4abcd79b to 0c731fd4
- 19:37 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy6002.wikimedia.org - sukhe@cumin1003"
- 19:34 eileen: civicrm upgraded from a519e045 to 04649670
- 19:33 eileen: config revision changed from 6877a9ba to 4abcd79b
- 19:32 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 19:32 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy6002.wikimedia.org
- 19:30 sukhe@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy6001.wikimedia.org
- 19:30 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy6001.wikimedia.org with OS trixie
- 19:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy5002.wikimedia.org with OS trixie
- 19:16 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy5002.wikimedia.org - sukhe@cumin1003"
- 19:16 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy5002.wikimedia.org - sukhe@cumin1003"
- 19:16 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5002.wikimedia.org on all recursors
- 19:16 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5002.wikimedia.org on all recursors
- 19:16 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:16 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy5002.wikimedia.org - sukhe@cumin1003"
- 19:15 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy5002.wikimedia.org - sukhe@cumin1003"
- 19:12 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy6001.wikimedia.org with reason: host reimage
- 19:11 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 19:11 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy5002.wikimedia.org
- 19:10 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.2 refs T408272
- 19:09 sukhe@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy5001.wikimedia.org
- 19:09 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy5001.wikimedia.org with OS trixie
- 19:08 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy6001.wikimedia.org with reason: host reimage
- 19:02 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 19:01 robh: eqiad c/d migrations complete for today
- 18:56 swfrench@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Hold deployments during mw-(api-ext|web) capacity changes - T405955 (duration: 23m 20s)
- 18:56 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:56 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:56 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:55 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:53 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:53 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:53 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:53 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:52 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:52 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:52 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:51 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:50 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:50 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:50 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:50 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:49 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy4002.wikimedia.org
- 18:49 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4002.wikimedia.org
- 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:49 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:48 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy4002.wikimedia.org
- 18:48 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4002.wikimedia.org
- 18:48 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:48 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy5001.wikimedia.org with reason: host reimage
- 18:48 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:48 sukhe@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy4001.wikimedia.org
- 18:48 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4001.wikimedia.org with OS trixie
- 18:48 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on restbase1042.eqiad.wmnet with reason: C/D Migration
- 18:46 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on aqs1015.eqiad.wmnet with reason: C/D Migration
- 18:45 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:45 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:45 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:45 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:45 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy5001.wikimedia.org with reason: host reimage
- 18:43 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:42 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:42 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:41 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on logging-hd1003.eqiad.wmnet with reason: C/D Migration
- 18:41 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:41 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:39 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy6001.wikimedia.org with OS trixie
- 18:39 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:39 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy6001.wikimedia.org - sukhe@cumin1003"
- 18:39 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy6001.wikimedia.org - sukhe@cumin1003"
- 18:39 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:39 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:39 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy6001.wikimedia.org on all recursors
- 18:39 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy6001.wikimedia.org on all recursors
- 18:39 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:39 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy6001.wikimedia.org - sukhe@cumin1003"
- 18:39 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:38 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:36 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:36 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy6001.wikimedia.org - sukhe@cumin1003"
- 18:36 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:36 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on logstash1034.eqiad.wmnet with reason: C/D Migration
- 18:34 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc-wf1002.eqiad.wmnet with reason: C/D Migration
- 18:33 swfrench@deploy2002: Locking from deployment [ALL REPOSITORIES]: Hold deployments during mw-(api-ext|web) capacity changes - T405955
- 18:33 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 18:32 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy6001.wikimedia.org
- 18:32 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4001.wikimedia.org with reason: host reimage
- 18:32 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ml-serve1004.eqiad.wmnet with reason: C/D Migration
- 18:30 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1249.eqiad.wmnet with reason: C/D Migration
- 18:28 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4001.wikimedia.org with reason: host reimage
- 18:27 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on logstash1035.eqiad.wmnet with reason: C/D Migration
- 18:25 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1248.eqiad.wmnet with reason: C/D Migration
- 18:25 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host franio1004
- 18:25 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host franio1004
- 18:24 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:24 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cp1115.eqiad.wmnet with reason: C/D Migration
- 18:22 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc-gp1006.eqiad.wmnet with reason: C/D Migration
- 18:22 swfrench@deploy2002: Finished scap sync-world: Fully migrate mw-(api-ext|web) to 8.3 - T405955 (duration: 03m 51s)
- 18:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
- 18:19 swfrench@deploy2002: Started scap sync-world: Fully migrate mw-(api-ext|web) to 8.3 - T405955
- 18:16 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on backup1014.eqiad.wmnet with reason: C/D Migration
- 18:15 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1136.eqiad.wmnet with reason: C/D Migration
- 18:12 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cp1112.eqiad.wmnet with reason: C/D Migration
- 18:04 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-presto1019.eqiad.wmnet with reason: C/D Migration
- 17:54 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on kubestage1004.eqiad.wmnet with reason: C/D Migration
- 17:52 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on pki-root1001.eqiad.wmnet with reason: C/D Migration
- 17:51 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1223.eqiad.wmnet with reason: C/D Migration
- 17:51 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy5001.wikimedia.org with OS trixie
- 17:50 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on rdb1014.eqiad.wmnet with reason: C/D Migration
- 17:50 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4001.wikimedia.org with OS trixie
- 17:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-druid1005.eqiad.wmnet with reason: C/D Migration
- 17:48 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1173.eqiad.wmnet with reason: C/D Migration
- 17:47 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4001.wikimedia.org - sukhe@cumin1003"
- 17:47 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4001.wikimedia.org - sukhe@cumin1003"
- 17:46 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1232.eqiad.wmnet with reason: C/D Migration
- 17:44 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1175.eqiad.wmnet with reason: C/D Migration
- 17:43 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on es1052.eqiad.wmnet with reason: C/D Migration
- 17:41 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy5001.wikimedia.org - sukhe@cumin1003"
- 17:41 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy5001.wikimedia.org - sukhe@cumin1003"
- 17:41 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5001.wikimedia.org on all recursors
- 17:41 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5001.wikimedia.org on all recursors
- 17:41 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:41 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy5001.wikimedia.org - sukhe@cumin1003"
- 17:41 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy5001.wikimedia.org - sukhe@cumin1003"
- 17:40 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1247.eqiad.wmnet with reason: C/D Migration
- 17:39 robh: eqiad c/d migration d6 rack complete for today, onto d3
- 17:34 sukhe@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy2002.wikimedia.org
- 17:34 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy2002.wikimedia.org with OS trixie
- 17:30 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 17:30 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy5001.wikimedia.org
- 17:29 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on sessionstore1006.eqiad.wmnet with reason: C/D Migration
- 17:29 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy4001.wikimedia.org on all recursors
- 17:29 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy4001.wikimedia.org on all recursors
- 17:29 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:29 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4001.wikimedia.org - sukhe@cumin1003"
- 17:29 sukhe@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy2001.wikimedia.org
- 17:29 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy2001.wikimedia.org with OS trixie
- 17:27 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1225.eqiad.wmnet with reason: C/D Migration
- 17:27 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4001.wikimedia.org - sukhe@cumin1003"
- 17:26 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1224.eqiad.wmnet with reason: C/D Migration
- 17:26 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
- 17:25 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
- 17:25 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on dbproxy1025.eqiad.wmnet with reason: C/D Migration
- 17:24 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1174.eqiad.wmnet with reason: C/D Migration
- 17:22 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on aqs1022.eqiad.wmnet with reason: C/D Migration
- 17:22 sukhe@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy1002.wikimedia.org
- 17:22 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy1002.wikimedia.org with OS trixie
- 17:21 sukhe@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy4002.wikimedia.org
- 17:21 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4002.wikimedia.org
- 17:21 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 17:21 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4001.wikimedia.org
- 17:20 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on aqs1022.eqiad.wmnet with reason: C/D Migration
- 17:19 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1258.eqiad.wmnet with reason: C/D Migration
- 17:18 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on restbase1045.eqiad.wmnet with reason: C/D Migration
- 17:18 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy2002.wikimedia.org with reason: host reimage
- 17:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1259.eqiad.wmnet with reason: C/D Migration
- 17:15 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on es1053.eqiad.wmnet with reason: C/D Migration
- 17:13 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy2002.wikimedia.org with reason: host reimage
- 17:12 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1233.eqiad.wmnet with reason: C/D Migration
- 17:11 sukhe@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy1001.wikimedia.org
- 17:11 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy1001.wikimedia.org with OS trixie
- 17:11 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy2001.wikimedia.org with reason: host reimage
- 17:10 robh: eqiad c/d migration work in D6
- 17:10 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on es1046.eqiad.wmnet with reason: C/D Migration
- 17:08 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy1002.wikimedia.org with reason: host reimage
- 17:04 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ganeti1033.eqiad.wmnet with reason: C/D Migration
- 17:03 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy2001.wikimedia.org with reason: host reimage
- 17:02 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy1002.wikimedia.org with reason: host reimage
- 17:02 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ganeti1024.eqiad.wmnet with reason: C/D Migration
- 16:58 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy1001.wikimedia.org with reason: host reimage
- 16:57 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Enable the MEX / wbui2025 beta feature on testwikidata (v2) (T407737) (duration: 12m 40s)
- 16:55 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy2002.wikimedia.org with OS trixie
- 16:55 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy2002.wikimedia.org - sukhe@cumin1003"
- 16:55 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy2002.wikimedia.org - sukhe@cumin1003"
- 16:54 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy2002.wikimedia.org on all recursors
- 16:54 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy2002.wikimedia.org on all recursors
- 16:54 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:54 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy2002.wikimedia.org - sukhe@cumin1003"
- 16:54 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy2002.wikimedia.org - sukhe@cumin1003"
- 16:54 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on dse-k8s-worker1003.eqiad.wmnet with reason: C/D Migration
- 16:52 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy1001.wikimedia.org with reason: host reimage
- 16:51 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy1002.wikimedia.org with OS trixie
- 16:51 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy1002.wikimedia.org - sukhe@cumin1003"
- 16:51 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy1002.wikimedia.org - sukhe@cumin1003"
- 16:51 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 16:51 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy1002.wikimedia.org on all recursors
- 16:50 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy1002.wikimedia.org on all recursors
- 16:50 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:50 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy1002.wikimedia.org - sukhe@cumin1003"
- 16:50 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Continuing with sync
- 16:50 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy1002.wikimedia.org - sukhe@cumin1003"
- 16:49 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:47 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy2002.wikimedia.org
- 16:47 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 16:47 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy1002.wikimedia.org
- 16:47 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Enable the MEX / wbui2025 beta feature on testwikidata (v2) (T407737) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:45 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy2001.wikimedia.org with OS trixie
- 16:44 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy2001.wikimedia.org - sukhe@cumin1003"
- 16:44 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy2001.wikimedia.org - sukhe@cumin1003"
- 16:44 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Enable the MEX / wbui2025 beta feature on testwikidata (v2) (T407737)
- 16:44 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy2001.wikimedia.org on all recursors
- 16:44 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy2001.wikimedia.org on all recursors
- 16:44 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:44 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy2001.wikimedia.org - sukhe@cumin1003"
- 16:44 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy2001.wikimedia.org - sukhe@cumin1003"
- 16:42 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy1001.wikimedia.org with OS trixie
- 16:42 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy1001.wikimedia.org - sukhe@cumin1003"
- 16:42 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy1001.wikimedia.org - sukhe@cumin1003"
- 16:41 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy1001.wikimedia.org on all recursors
- 16:41 sukhe@cumin1003: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy1001.wikimedia.org on all recursors
- 16:41 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:41 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy1001.wikimedia.org - sukhe@cumin1003"
- 16:41 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy1001.wikimedia.org - sukhe@cumin1003"
- 16:40 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 16:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:37 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy2001.wikimedia.org
- 16:35 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:35 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 16:35 sukhe@cumin1003: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy1001.wikimedia.org
- 16:33 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Fixing grants
- 16:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:26 fceratto@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host db-test1001.eqiad.wmnet
- 16:26 fceratto@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 16:26 sukhe: sudo cumin "O:installserver" "run-puppet-agent"
- 16:22 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 16:22 fceratto@cumin1003: START - Cookbook sre.ganeti.makevm for new host db-test1001.eqiad.wmnet
- 16:05 fceratto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM db-test2002.codfw.wmnet
- 15:56 fceratto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM db-test2002.codfw.wmnet
- 15:56 fceratto@cumin1003: END (ERROR) - Cookbook sre.ganeti.reboot-vm (exit_code=97) for VM db-test1002.eqiad.wmnet
- 15:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 15:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 15:36 fceratto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM db-test1002.eqiad.wmnet
- 15:35 fceratto@cumin1003: END (ERROR) - Cookbook sre.ganeti.reboot-vm (exit_code=97) for VM db-test1001.eqiad.wmnet
- 15:33 marostegui: Drop cumin2024@cumin1002 from production - T409929
- 15:30 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists affcomwiki; (T297297)
- 15:25 ecarg@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:24 ecarg@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:24 Lucas_WMDE: UTC afternoon backport+config window done
- 15:24 ecarg@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:24 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for StringForLanguageBuilder: Use LanguageFallbackMode enum (T409876), BlockErrorFormatter: Convert booleans to string in message params (T409810), Enable DiscussionTools visual enhancements everywhere except enwiki, svwiki and ruwiki (T379264) (duration: 17m 17s)
- 15:23 ecarg@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:22 fceratto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM db-test1001.eqiad.wmnet
- 15:22 ecarg@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:22 ecarg@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:19 ecarg@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:18 ecarg@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:18 ecarg@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:17 moritzm: migrated pwstore repository from cumin1002 to cumin1003 T389380
- 15:17 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, jforrester, esanders: Continuing with sync
- 15:16 ecarg@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:10 ecarg@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:10 ecarg@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:09 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 15:09 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 15:09 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, jforrester, esanders: Backport for StringForLanguageBuilder: Use LanguageFallbackMode enum (T409876), BlockErrorFormatter: Convert booleans to string in message params (T409810), Enable DiscussionTools visual enhancements everywhere except enwiki, svwiki and ruwiki (T379264) synced to the testservers (see
- 15:08 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 15:08 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 15:07 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 15:07 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 15:06 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for StringForLanguageBuilder: Use LanguageFallbackMode enum (T409876), BlockErrorFormatter: Convert booleans to string in message params (T409810), Enable DiscussionTools visual enhancements everywhere except enwiki, svwiki and ruwiki (T379264)
- 15:04 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Hide edit one/all checkbox when needed (T409808) (duration: 12m 30s)
- 14:56 lucaswerkmeister-wmde@deploy2002: awight, lucaswerkmeister-wmde: Continuing with sync
- 14:55 lucaswerkmeister-wmde@deploy2002: awight, lucaswerkmeister-wmde: Backport for Hide edit one/all checkbox when needed (T409808) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:51 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Hide edit one/all checkbox when needed (T409808)
- 14:48 fceratto@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host db-test1003.eqiad.wmnet
- 14:48 fceratto@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 14:48 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha instrumentation: Log editor_interface for editAttempStep (T409701), Support an "always challenge" SiteKey when shouldForceShowCaptcha is enabled (T405595), hCaptcha: Define configuration for "always challenge" mode (T405595) (duration: 47m 08s)
- 14:47 fceratto@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host db-test1001.eqiad.wmnet
- 14:47 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s3
- 14:47 fceratto@cumin1003: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host db-test1002.eqiad.wmnet
- 14:46 fceratto@cumin1003: START - Cookbook sre.ganeti.makevm for new host db-test1002.eqiad.wmnet
- 14:46 fceratto@cumin1003: START - Cookbook sre.ganeti.makevm for new host db-test1001.eqiad.wmnet
- 14:45 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 14:45 fceratto@cumin1003: START - Cookbook sre.ganeti.makevm for new host db-test1003.eqiad.wmnet
- 14:33 kharlan@deploy2002: kharlan: Continuing with sync
- 14:27 kharlan@deploy2002: kharlan: Backport for hCaptcha instrumentation: Log editor_interface for editAttempStep (T409701), Support an "always challenge" SiteKey when shouldForceShowCaptcha is enabled (T405595), hCaptcha: Define configuration for "always challenge" mode (T405595) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can
- 14:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db-test2001.codfw.wmnet with reason: Clone T400056
- 14:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2230.codfw.wmnet with reason: Clone T400056
- 14:01 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha instrumentation: Log editor_interface for editAttempStep (T409701), Support an "always challenge" SiteKey when shouldForceShowCaptcha is enabled (T405595), hCaptcha: Define configuration for "always challenge" mode (T405595)
- 13:55 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1012
- 13:50 elukey@cumin2002: START - Cookbook sre.hosts.powercycle for host ml-serve1012
- 13:48 taavi: updating cr firewall policy with new caprica definitions, to pick up new clouddb hosts
- 13:31 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 100%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85288 and previous config saved to /var/cache/conftool/dbconfig/20251112-133127-root.json
- 13:24 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1012.eqiad.wmnet with OS trixie
- 13:23 moritzm: installing glib2.0 security updates
- 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 75%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85287 and previous config saved to /var/cache/conftool/dbconfig/20251112-131621-root.json
- 13:04 taavi@deploy2002: Finished scap sync-world: Backport for reverse-proxy: Add new eqiad/codfw per-rack subnets, Add script to update reverse-proxy.php (duration: 09m 51s)
- 13:03 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1012.eqiad.wmnet with reason: host reimage
- 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 50%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85286 and previous config saved to /var/cache/conftool/dbconfig/20251112-130115-root.json
- 13:00 klausman@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1012.eqiad.wmnet with reason: host reimage
- 12:59 taavi@deploy2002: taavi: Continuing with sync
- 12:57 taavi@deploy2002: taavi: Backport for reverse-proxy: Add new eqiad/codfw per-rack subnets, Add script to update reverse-proxy.php synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:54 taavi@deploy2002: Started scap sync-world: Backport for reverse-proxy: Add new eqiad/codfw per-rack subnets, Add script to update reverse-proxy.php
- 12:47 klausman@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1012.eqiad.wmnet with OS trixie
- 12:46 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 45%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85285 and previous config saved to /var/cache/conftool/dbconfig/20251112-124609-root.json
- 12:43 fceratto@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test2001.codfw.wmnet
- 12:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1022].eqiad.wmnet with reason: Cloning clouddb1022:s3
- 12:41 marostegui@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
- 12:31 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 40%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85282 and previous config saved to /var/cache/conftool/dbconfig/20251112-123103-root.json
- 12:30 kart_: Updated cxserver to 2025-11-12-114324-production (T408515)
- 12:21 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 12:20 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 12:20 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 12:19 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 35%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85281 and previous config saved to /var/cache/conftool/dbconfig/20251112-121557-root.json
- 12:14 topranks: shut down link from ssw1-d8-eqiad ethernet-1/28 <-> asw2-c7-eqiad et-7/0/49 to observe results T409800
- 12:14 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 12:14 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 12:09 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 12:09 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 12:08 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
- 12:07 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
- 12:06 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 12:05 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
- 12:00 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 30%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85280 and previous config saved to /var/cache/conftool/dbconfig/20251112-120051-root.json
- 11:45 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 25%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85279 and previous config saved to /var/cache/conftool/dbconfig/20251112-114545-root.json
- 11:44 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 11:44 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 11:43 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 11:43 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 11:41 jmm@cumin1003: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cumin2002.codfw.wmnet
- 11:39 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
- 11:36 moritzm: migrated cumin2002 to nftables T389380
- 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 20%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85278 and previous config saved to /var/cache/conftool/dbconfig/20251112-113040-root.json
- 11:26 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
- 11:25 jmm@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cumin2002.codfw.wmnet
- 11:24 jmm@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cumin2002.codfw.wmnet
- 11:22 topranks: will not shut just yet will log again when about to do so T409800
- 11:18 topranks: shut down link from ssw1-d8-eqiad ethernet-1/28 <-> asw2-c7-eqiad et-7/0/49 to observe results T409800
- 11:18 jmm@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cumin2002.codfw.wmnet
- 11:17 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on asw2-c-eqiad,ssw1-d8-eqiad with reason: shutting down one leg of LAG from ssw1-d8-eqiad to asw2-c7-eqiad
- 11:17 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.2 refs T408272
- 11:15 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 10%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85277 and previous config saved to /var/cache/conftool/dbconfig/20251112-111534-root.json
- 11:08 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 11:07 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 11:07 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 11:06 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 11:06 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 11:04 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
- 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 9%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85275 and previous config saved to /var/cache/conftool/dbconfig/20251112-110028-root.json
- 10:50 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:50 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:49 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.2 refs T408272
- 10:45 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 8%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85273 and previous config saved to /var/cache/conftool/dbconfig/20251112-104522-root.json
- 10:39 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
- 10:39 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:38 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:37 ladsgroup@deploy2002: Finished scap sync-world: Backport for Revert "pagers: Make history pager work with Postgres" (T409831) (duration: 08m 34s)
- 10:34 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:34 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:32 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 10:30 ladsgroup@deploy2002: ladsgroup: Backport for Revert "pagers: Make history pager work with Postgres" (T409831) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:30 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 7%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85272 and previous config saved to /var/cache/conftool/dbconfig/20251112-103016-root.json
- 10:28 ladsgroup@deploy2002: Started scap sync-world: Backport for Revert "pagers: Make history pager work with Postgres" (T409831)
- 10:26 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 8 hosts
- 10:26 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for 8 hosts
- 10:18 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Set fallback for ConfirmEditTriggersCaptcha (T409736) (duration: 07m 51s)
- 10:15 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 6%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85271 and previous config saved to /var/cache/conftool/dbconfig/20251112-101510-root.json
- 10:14 kharlan@deploy2002: kharlan: Continuing with sync
- 10:12 kharlan@deploy2002: kharlan: Backport for hCaptcha: Set fallback for ConfirmEditTriggersCaptcha (T409736) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:10 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Set fallback for ConfirmEditTriggersCaptcha (T409736)
- 10:05 kharlan@deploy2002: Finished scap sync-world: Backport for Throttler: Use SecurityLogContext (duration: 10m 35s)
- 10:03 marostegui@cumin1003: dbctl commit (dc=all): 'Add db1264 to x1 depooled T407941', diff saved to https://phabricator.wikimedia.org/P85270 and previous config saved to /var/cache/conftool/dbconfig/20251112-100346-marostegui.json
- 10:01 kharlan@deploy2002: kharlan: Continuing with sync
- 10:00 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 5%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85268 and previous config saved to /var/cache/conftool/dbconfig/20251112-100004-root.json
- 09:57 kharlan@deploy2002: kharlan: Backport for Throttler: Use SecurityLogContext synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:55 kharlan@deploy2002: Started scap sync-world: Backport for Throttler: Use SecurityLogContext
- 09:54 jynus: rolling restart of dbprov hosts for mariadb+kernel upgrade
- 09:50 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ml-serve1012.eqiad.wmnet on all recursors
- 09:50 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ml-serve1012.eqiad.wmnet on all recursors
- 09:49 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:49 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update IPs for ml-server1012 - cmooney@cumin1003"
- 09:49 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update IPs for ml-server1012 - cmooney@cumin1003"
- 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 4%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85267 and previous config saved to /var/cache/conftool/dbconfig/20251112-094458-root.json
- 09:44 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 09:44 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ml-serve1012.eqiad.wmnet with reason: manually adjusting host DNS to new IPs ahead of reimage
- 09:39 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 9 hosts with reason: MariaDB and kernel upgrade and restart
- 09:38 kharlan@deploy2002: Finished scap sync-world: Backport for EventLogging: Register mediawiki.hcaptcha.risk_score stream (T405597) (duration: 10m 41s)
- 09:32 kharlan@deploy2002: kharlan: Continuing with sync
- 09:31 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1171.eqiad.wmnet
- 09:31 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1171.eqiad.wmnet
- 09:30 kharlan@deploy2002: kharlan: Backport for EventLogging: Register mediawiki.hcaptcha.risk_score stream (T405597) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 3%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85266 and previous config saved to /var/cache/conftool/dbconfig/20251112-092953-root.json
- 09:29 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 09:27 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 09:27 kharlan@deploy2002: Started scap sync-world: Backport for EventLogging: Register mediawiki.hcaptcha.risk_score stream (T405597)
- 09:22 kharlan@deploy2002: Finished scap sync-world: Backport for ext.confirmEdit.hCaptcha: Consider action=submit an edit interface (T409701 T409703 T409415), Refactor CaptchaScoreHooks to use EventSubmitter (T405597) (duration: 09m 21s)
- 09:21 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1171.eqiad.wmnet with reason: Restart
- 09:18 kharlan@deploy2002: kharlan: Continuing with sync
- 09:15 kharlan@deploy2002: kharlan: Backport for ext.confirmEdit.hCaptcha: Consider action=submit an edit interface (T409701 T409703 T409415), Refactor CaptchaScoreHooks to use EventSubmitter (T405597) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 2%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85265 and previous config saved to /var/cache/conftool/dbconfig/20251112-091447-root.json
- 09:13 kharlan@deploy2002: Started scap sync-world: Backport for ext.confirmEdit.hCaptcha: Consider action=submit an edit interface (T409701 T409703 T409415), Refactor CaptchaScoreHooks to use EventSubmitter (T405597)
- 09:12 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 09:12 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 09:12 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 09:11 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 09:06 dcausse@deploy2002: mwscript-k8s job started: resetAuthenticationThrottle.php zhwiki --signup --ip=103.108.250.0/24 # resetting throttle cache for T409852
- 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 1%: Moved it to es7', diff saved to https://phabricator.wikimedia.org/P85264 and previous config saved to /var/cache/conftool/dbconfig/20251112-085941-root.json
- 08:58 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha instrumentation: Handle hcaptcha.render events during edits (T409701 T409703 T409415) (duration: 09m 17s)
- 08:58 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 08:57 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 08:53 dcausse@deploy2002: mwscript-k8s job started: resetAuthenticationThrottle.php zhwiki --signup --ip=1.2.3.4 # resetting throttle cache for T409852
- 08:52 kharlan@deploy2002: kharlan: Continuing with sync
- 08:51 kharlan@deploy2002: kharlan: Backport for hCaptcha instrumentation: Handle hcaptcha.render events during edits (T409701 T409703 T409415) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:49 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha instrumentation: Handle hcaptcha.render events during edits (T409701 T409703 T409415)
- 08:46 dcausse@deploy2002: mwscript-k8s job started: resetAuthenticationThrottle.php zhwiki # T409852
- 08:44 dcausse@deploy2002: Finished scap sync-world: Backport for Throttle exemption for Edit-a-thon in Hong Kong - 15 November 2025 (T409852), cirrus: start A/B test on completion with default_sort (T404858) (duration: 12m 57s)
- 08:40 dcausse@deploy2002: dcausse, superpes: Continuing with sync
- 08:34 dcausse@deploy2002: dcausse, superpes: Backport for Throttle exemption for Edit-a-thon in Hong Kong - 15 November 2025 (T409852), cirrus: start A/B test on completion with default_sort (T404858) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:31 dcausse@deploy2002: Started scap sync-world: Backport for Throttle exemption for Edit-a-thon in Hong Kong - 15 November 2025 (T409852), cirrus: start A/B test on completion with default_sort (T404858)
- 08:28 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 08:27 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 08:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:24 stran@deploy2002: Finished scap sync-world: Backport for Deploy temporary accounts to more large/LQT-unblocked projects (T409691) (duration: 15m 19s)
- 08:20 stran@deploy2002: stran: Continuing with sync
- 08:11 stran@deploy2002: stran: Backport for Deploy temporary accounts to more large/LQT-unblocked projects (T409691) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:09 stran@deploy2002: Started scap sync-world: Backport for Deploy temporary accounts to more large/LQT-unblocked projects (T409691)
- 07:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2203 gradually with 4 steps - Repooling after upgrade
- 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of es1039.eqiad.wmnet onto es1033.eqiad.wmnet
- 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) es1039 gradually with 4 steps - Pool es1039.eqiad.wmnet in after cloning
- 07:34 kart_: Update Apertium to 2025-11-10-034557-production (T408515)
- 07:28 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
- 07:28 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/apertium: apply
- 07:23 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
- 07:22 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/apertium: apply
- 07:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2203 gradually with 4 steps - Repooling after upgrade
- 07:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2179.codfw.wmnet with reason: Maintenance
- 06:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2203 - Depool db2203 for migration to mariadb 10.11
- 06:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool db2203 - Depool db2203 for migration to mariadb 10.11
- 06:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2203.codfw.wmnet with reason: Maintenance
- 06:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool es1039 gradually with 4 steps - Pool es1039.eqiad.wmnet in after cloning
- 06:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance
- 06:54 marostegui@dns1006: END - running authdns-update
- 06:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2203 T409255', diff saved to https://phabricator.wikimedia.org/P85254 and previous config saved to /var/cache/conftool/dbconfig/20251112-065426-marostegui.json
- 06:53 marostegui@dns1006: START - running authdns-update
- 06:53 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2212 to s1 primary and set section read-write T409255', diff saved to https://phabricator.wikimedia.org/P85253 and previous config saved to /var/cache/conftool/dbconfig/20251112-065321-marostegui.json
- 06:53 marostegui@cumin1003: dbctl commit (dc=all): 'Set s1 codfw as read-only for maintenance - T409255', diff saved to https://phabricator.wikimedia.org/P85252 and previous config saved to /var/cache/conftool/dbconfig/20251112-065259-marostegui.json
- 06:52 marostegui: Starting s1 codfw failover from db2203 to db2212 - T409255
- 06:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 32 hosts with reason: Primary switchover s1 T409255
- 06:46 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2212 with weight 0 T409255', diff saved to https://phabricator.wikimedia.org/P85251 and previous config saved to /var/cache/conftool/dbconfig/20251112-064643-marostegui.json
- 01:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 01:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T407997)', diff saved to https://phabricator.wikimedia.org/P85250 and previous config saved to /var/cache/conftool/dbconfig/20251112-015351-marostegui.json
- 01:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P85249 and previous config saved to /var/cache/conftool/dbconfig/20251112-013843-marostegui.json
- 01:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P85248 and previous config saved to /var/cache/conftool/dbconfig/20251112-012335-marostegui.json
- 01:15 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 14m 14s)
- 01:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T407997)', diff saved to https://phabricator.wikimedia.org/P85247 and previous config saved to /var/cache/conftool/dbconfig/20251112-010828-marostegui.json
- 01:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1263 (T407997)', diff saved to https://phabricator.wikimedia.org/P85246 and previous config saved to /var/cache/conftool/dbconfig/20251112-010151-marostegui.json
- 01:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1263.eqiad.wmnet with reason: Maintenance
- 01:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T407997)', diff saved to https://phabricator.wikimedia.org/P85245 and previous config saved to /var/cache/conftool/dbconfig/20251112-010128-marostegui.json
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P85244 and previous config saved to /var/cache/conftool/dbconfig/20251112-004620-marostegui.json
- 00:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P85243 and previous config saved to /var/cache/conftool/dbconfig/20251112-003112-marostegui.json
- 00:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T407997)', diff saved to https://phabricator.wikimedia.org/P85242 and previous config saved to /var/cache/conftool/dbconfig/20251112-001604-marostegui.json
- 00:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1261 (T407997)', diff saved to https://phabricator.wikimedia.org/P85241 and previous config saved to /var/cache/conftool/dbconfig/20251112-000922-marostegui.json
- 00:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1261.eqiad.wmnet with reason: Maintenance
- 00:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T407997)', diff saved to https://phabricator.wikimedia.org/P85240 and previous config saved to /var/cache/conftool/dbconfig/20251112-000857-marostegui.json
2025-11-11
- 23:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P85239 and previous config saved to /var/cache/conftool/dbconfig/20251111-235349-marostegui.json
- 23:49 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-eqiad
- 23:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P85238 and previous config saved to /var/cache/conftool/dbconfig/20251111-233842-marostegui.json
- 23:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T407997)', diff saved to https://phabricator.wikimedia.org/P85237 and previous config saved to /var/cache/conftool/dbconfig/20251111-232334-marostegui.json
- 23:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1260 (T407997)', diff saved to https://phabricator.wikimedia.org/P85236 and previous config saved to /var/cache/conftool/dbconfig/20251111-231655-marostegui.json
- 23:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1260.eqiad.wmnet with reason: Maintenance
- 23:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T407997)', diff saved to https://phabricator.wikimedia.org/P85235 and previous config saved to /var/cache/conftool/dbconfig/20251111-231632-marostegui.json
- 23:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P85234 and previous config saved to /var/cache/conftool/dbconfig/20251111-230124-marostegui.json
- 22:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P85233 and previous config saved to /var/cache/conftool/dbconfig/20251111-224616-marostegui.json
- 22:37 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
- 22:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T407997)', diff saved to https://phabricator.wikimedia.org/P85232 and previous config saved to /var/cache/conftool/dbconfig/20251111-223109-marostegui.json
- 22:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1252 (T407997)', diff saved to https://phabricator.wikimedia.org/P85231 and previous config saved to /var/cache/conftool/dbconfig/20251111-222430-marostegui.json
- 22:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1252.eqiad.wmnet with reason: Maintenance
- 22:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T407997)', diff saved to https://phabricator.wikimedia.org/P85230 and previous config saved to /var/cache/conftool/dbconfig/20251111-222405-marostegui.json
- 22:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P85229 and previous config saved to /var/cache/conftool/dbconfig/20251111-220858-marostegui.json
- 21:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P85228 and previous config saved to /var/cache/conftool/dbconfig/20251111-215350-marostegui.json
- 21:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T407997)', diff saved to https://phabricator.wikimedia.org/P85227 and previous config saved to /var/cache/conftool/dbconfig/20251111-213842-marostegui.json
- 21:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1249 (T407997)', diff saved to https://phabricator.wikimedia.org/P85226 and previous config saved to /var/cache/conftool/dbconfig/20251111-213205-marostegui.json
- 21:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1249.eqiad.wmnet with reason: Maintenance
- 21:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T407997)', diff saved to https://phabricator.wikimedia.org/P85225 and previous config saved to /var/cache/conftool/dbconfig/20251111-213141-marostegui.json
- 21:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P85224 and previous config saved to /var/cache/conftool/dbconfig/20251111-211634-marostegui.json
- 21:08 reedy@deploy2002: Finished scap sync-world: Backport for CommonSettings.php: Remove old $wgCentralDBname (T389348), InitialiseSettings: Update comment about wgPopupsConflictingRefTooltipsGadgetName (T362771) (duration: 07m 00s)
- 21:03 reedy@deploy2002: reedy: Continuing with sync
- 21:03 reedy@deploy2002: reedy: Backport for CommonSettings.php: Remove old $wgCentralDBname (T389348), InitialiseSettings: Update comment about wgPopupsConflictingRefTooltipsGadgetName (T362771) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P85223 and previous config saved to /var/cache/conftool/dbconfig/20251111-210126-marostegui.json
- 21:01 reedy@deploy2002: Started scap sync-world: Backport for CommonSettings.php: Remove old $wgCentralDBname (T389348), InitialiseSettings: Update comment about wgPopupsConflictingRefTooltipsGadgetName (T362771)
- 20:59 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{dse-k8s-worker1011.eqiad.wmnet} and (A:dse-k8s-master-eqiad or A:dse-k8s-worker-eqiad)
- 20:59 reedy@deploy2002: Finished scap sync-world: Backport for fix: don't run listTaskCounts if Newcomer Task are not available (T408052 T408531) (duration: 06m 57s)
- 20:55 reedy@deploy2002: reedy: Continuing with sync
- 20:54 reedy@deploy2002: reedy: Backport for fix: don't run listTaskCounts if Newcomer Task are not available (T408052 T408531) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:52 reedy@deploy2002: Started scap sync-world: Backport for fix: don't run listTaskCounts if Newcomer Task are not available (T408052 T408531)
- 20:52 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{dse-k8s-worker1011.eqiad.wmnet} and (A:dse-k8s-master-eqiad or A:dse-k8s-worker-eqiad)
- 20:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet
- 20:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T407997)', diff saved to https://phabricator.wikimedia.org/P85222 and previous config saved to /var/cache/conftool/dbconfig/20251111-204618-marostegui.json
- 20:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1019.eqiad.wmnet
- 20:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1003.eqiad.wmnet
- 20:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1248 (T407997)', diff saved to https://phabricator.wikimedia.org/P85221 and previous config saved to /var/cache/conftool/dbconfig/20251111-203942-marostegui.json
- 20:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1248.eqiad.wmnet with reason: Maintenance
- 20:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T407997)', diff saved to https://phabricator.wikimedia.org/P85220 and previous config saved to /var/cache/conftool/dbconfig/20251111-203917-marostegui.json
- 20:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1003.eqiad.wmnet
- 20:24 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
- 20:24 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
- 20:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P85219 and previous config saved to /var/cache/conftool/dbconfig/20251111-202409-marostegui.json
- 20:24 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
- 20:23 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
- 20:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P85218 and previous config saved to /var/cache/conftool/dbconfig/20251111-200901-marostegui.json
- 19:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T407997)', diff saved to https://phabricator.wikimedia.org/P85217 and previous config saved to /var/cache/conftool/dbconfig/20251111-195354-marostegui.json
- 19:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1247 (T407997)', diff saved to https://phabricator.wikimedia.org/P85216 and previous config saved to /var/cache/conftool/dbconfig/20251111-194714-marostegui.json
- 19:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1247.eqiad.wmnet with reason: Maintenance
- 19:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 19:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T407997)', diff saved to https://phabricator.wikimedia.org/P85215 and previous config saved to /var/cache/conftool/dbconfig/20251111-194201-marostegui.json
- 19:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P85214 and previous config saved to /var/cache/conftool/dbconfig/20251111-192654-marostegui.json
- 19:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P85213 and previous config saved to /var/cache/conftool/dbconfig/20251111-191146-marostegui.json
- 18:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T407997)', diff saved to https://phabricator.wikimedia.org/P85212 and previous config saved to /var/cache/conftool/dbconfig/20251111-185637-marostegui.json
- 18:50 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1244 (T407997)', diff saved to https://phabricator.wikimedia.org/P85211 and previous config saved to /var/cache/conftool/dbconfig/20251111-185000-marostegui.json
- 18:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1244.eqiad.wmnet with reason: Maintenance
- 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T407997)', diff saved to https://phabricator.wikimedia.org/P85210 and previous config saved to /var/cache/conftool/dbconfig/20251111-184936-marostegui.json
- 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P85209 and previous config saved to /var/cache/conftool/dbconfig/20251111-183429-marostegui.json
- 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P85208 and previous config saved to /var/cache/conftool/dbconfig/20251111-181921-marostegui.json
- 18:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T407997)', diff saved to https://phabricator.wikimedia.org/P85207 and previous config saved to /var/cache/conftool/dbconfig/20251111-180414-marostegui.json
- 17:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1243 (T407997)', diff saved to https://phabricator.wikimedia.org/P85206 and previous config saved to /var/cache/conftool/dbconfig/20251111-175732-marostegui.json
- 17:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 17:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T407997)', diff saved to https://phabricator.wikimedia.org/P85205 and previous config saved to /var/cache/conftool/dbconfig/20251111-175708-marostegui.json
- 17:56 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:56 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding hcaptcha-proxy.anycast.wmnet - sukhe@cumin1003"
- 17:56 sukhe: add hcaptcha-proxy.anycast.wmnet 10.3.0.10/32: T409780
- 17:56 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding hcaptcha-proxy.anycast.wmnet - sukhe@cumin1003"
- 17:52 sukhe@cumin1003: START - Cookbook sre.dns.netbox
- 17:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P85204 and previous config saved to /var/cache/conftool/dbconfig/20251111-174200-marostegui.json
- 17:33 logmsgbot: lucaswerkmeister-wmde Deployed security patch for T409737
- 17:30 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-d6-eqiad
- 17:30 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-d6-eqiad
- 17:29 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-d6-eqiad
- 17:29 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-d6-eqiad
- 17:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P85203 and previous config saved to /var/cache/conftool/dbconfig/20251111-172653-marostegui.json
- 17:22 logmsgbot: lucaswerkmeister-wmde Deployed security patch for T409737
- 17:19 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-d6-eqiad
- 17:19 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-d6-eqiad
- 17:18 cmooney@cumin1003: END (ERROR) - Cookbook sre.network.tls (exit_code=97) for network device lsw1-d6-eqiad
- 17:18 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-d6-eqiad
- 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T407997)', diff saved to https://phabricator.wikimedia.org/P85202 and previous config saved to /var/cache/conftool/dbconfig/20251111-171145-marostegui.json
- 17:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1242 (T407997)', diff saved to https://phabricator.wikimedia.org/P85201 and previous config saved to /var/cache/conftool/dbconfig/20251111-170504-marostegui.json
- 17:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1242.eqiad.wmnet with reason: Maintenance
- 17:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T407997)', diff saved to https://phabricator.wikimedia.org/P85200 and previous config saved to /var/cache/conftool/dbconfig/20251111-170159-marostegui.json
- 16:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P85199 and previous config saved to /var/cache/conftool/dbconfig/20251111-164651-marostegui.json
- 16:38 marostegui@cumin1003: dbctl commit (dc=all): 'db2246 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P85198 and previous config saved to /var/cache/conftool/dbconfig/20251111-163811-root.json
- 16:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P85197 and previous config saved to /var/cache/conftool/dbconfig/20251111-163144-marostegui.json
- 16:24 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{dse-k8s-worker1004.eqiad.wmnet} and (A:dse-k8s-master-eqiad or A:dse-k8s-worker-eqiad)
- 16:23 marostegui@cumin1003: dbctl commit (dc=all): 'db2246 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P85196 and previous config saved to /var/cache/conftool/dbconfig/20251111-162305-root.json
- 16:22 marostegui: Drop afl_ip related triggers from s4 T408780
- 16:21 marostegui: Install MariaDB 10.11.15 on Debian Trixie es1033 T409533
- 16:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T407997)', diff saved to https://phabricator.wikimedia.org/P85195 and previous config saved to /var/cache/conftool/dbconfig/20251111-161636-marostegui.json
- 16:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1221 (T407997)', diff saved to https://phabricator.wikimedia.org/P85194 and previous config saved to /var/cache/conftool/dbconfig/20251111-160950-marostegui.json
- 16:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 16:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 16:08 marostegui@cumin1003: dbctl commit (dc=all): 'db2246 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P85193 and previous config saved to /var/cache/conftool/dbconfig/20251111-160759-root.json
- 16:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P85192 and previous config saved to /var/cache/conftool/dbconfig/20251111-160132-marostegui.json
- 15:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
- 15:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
- 15:46 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{dse-k8s-worker1004.eqiad.wmnet} and (A:dse-k8s-master-eqiad or A:dse-k8s-worker-eqiad)
- 15:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T407997)', diff saved to https://phabricator.wikimedia.org/P85191 and previous config saved to /var/cache/conftool/dbconfig/20251111-154624-marostegui.json
- 15:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2246 (T407997)', diff saved to https://phabricator.wikimedia.org/P85189 and previous config saved to /var/cache/conftool/dbconfig/20251111-153942-marostegui.json
- 15:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2246.codfw.wmnet with reason: Maintenance
- 15:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T407997)', diff saved to https://phabricator.wikimedia.org/P85188 and previous config saved to /var/cache/conftool/dbconfig/20251111-153918-marostegui.json
- 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
- 15:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
- 15:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P85187 and previous config saved to /var/cache/conftool/dbconfig/20251111-152410-marostegui.json
- 15:09 kostajh: UTC afternoon deploys done
- 15:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P85186 and previous config saved to /var/cache/conftool/dbconfig/20251111-150902-marostegui.json
- 15:04 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Use FancyCaptcha for API edits and page creations (T405595), hCaptcha: Disable addurl trigger for hCaptcha edits (T409822) (duration: 12m 38s)
- 14:59 kharlan@deploy2002: kharlan: Continuing with sync
- 14:54 kharlan@deploy2002: kharlan: Backport for hCaptcha: Use FancyCaptcha for API edits and page creations (T405595), hCaptcha: Disable addurl trigger for hCaptcha edits (T409822) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T407997)', diff saved to https://phabricator.wikimedia.org/P85185 and previous config saved to /var/cache/conftool/dbconfig/20251111-145355-marostegui.json
- 14:52 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Use FancyCaptcha for API edits and page creations (T405595), hCaptcha: Disable addurl trigger for hCaptcha edits (T409822)
- 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS trixie
- 14:47 kharlan@deploy2002: Finished scap sync-world: Backport for Instrument hCaptcha risk signal in edits (T405597) (duration: 09m 15s)
- 14:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2245 (T407997)', diff saved to https://phabricator.wikimedia.org/P85184 and previous config saved to /var/cache/conftool/dbconfig/20251111-144711-marostegui.json
- 14:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2245.codfw.wmnet with reason: Maintenance
- 14:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 (T407997)', diff saved to https://phabricator.wikimedia.org/P85183 and previous config saved to /var/cache/conftool/dbconfig/20251111-144648-marostegui.json
- 14:43 kharlan@deploy2002: kharlan: Continuing with sync
- 14:40 kharlan@deploy2002: kharlan: Backport for Instrument hCaptcha risk signal in edits (T405597) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:40 cmooney@cumin1003: START - Cookbook sre.hosts.dhcp for host sretest1006.eqiad.wmnet
- 14:38 kharlan@deploy2002: Started scap sync-world: Backport for Instrument hCaptcha risk signal in edits (T405597)
- 14:35 kharlan@deploy2002: Finished scap sync-world: Backport for [arwikibooks] Add an alias for project namespace (T409789) (duration: 07m 10s)
- 14:33 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2230.codfw.wmnet onto db-test2001.codfw.wmnet
- 14:31 kharlan@deploy2002: superpes, kharlan: Continuing with sync
- 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P85182 and previous config saved to /var/cache/conftool/dbconfig/20251111-143140-marostegui.json
- 14:31 kharlan@deploy2002: superpes, kharlan: Backport for [arwikibooks] Add an alias for project namespace (T409789) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:28 kharlan@deploy2002: Started scap sync-world: Backport for [arwikibooks] Add an alias for project namespace (T409789)
- 14:25 kharlan@deploy2002: Finished scap sync-world: Backport for Remove SpecialContributeSkinsEnabled for special wikis (T400067) (duration: 10m 43s)
- 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 14:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) es1039 - Depool es1039.eqiad.wmnet to then clone it to es1033.eqiad.wmnet - marostegui@cumin1003
- 14:21 kharlan@deploy2002: kharlan, abi: Continuing with sync
- 14:21 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 14:20 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test2001.codfw.wmnet
- 14:19 marostegui@cumin1003: START - Cookbook sre.mysql.depool es1039 - Depool es1039.eqiad.wmnet to then clone it to es1033.eqiad.wmnet - marostegui@cumin1003
- 14:19 marostegui@cumin1003: START - Cookbook sre.mysql.clone of es1039.eqiad.wmnet onto es1033.eqiad.wmnet
- 14:17 kharlan@deploy2002: kharlan, abi: Backport for Remove SpecialContributeSkinsEnabled for special wikis (T400067) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P85180 and previous config saved to /var/cache/conftool/dbconfig/20251111-141632-marostegui.json
- 14:14 kharlan@deploy2002: Started scap sync-world: Backport for Remove SpecialContributeSkinsEnabled for special wikis (T400067)
- 14:12 kharlan@deploy2002: Finished scap sync-world: Backport for hcaptcha: Don't prevent form submissions unless making an edit (T408693) (duration: 13m 50s)
- 14:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es1033.eqiad.wmnet with reason: Moving es1033 to es7
- 14:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1033 T409257', diff saved to https://phabricator.wikimedia.org/P85179 and previous config saved to /var/cache/conftool/dbconfig/20251111-140849-marostegui.json
- 14:05 kharlan@deploy2002: kharlan: Continuing with sync
- 14:05 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 14:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1220.eqiad.wmnet onto db1264.eqiad.wmnet
- 14:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1220 gradually with 4 steps - Pool db1220.eqiad.wmnet in after cloning
- 14:02 kharlan@deploy2002: kharlan: Backport for hcaptcha: Don't prevent form submissions unless making an edit (T408693) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 (T407997)', diff saved to https://phabricator.wikimedia.org/P85177 and previous config saved to /var/cache/conftool/dbconfig/20251111-140125-marostegui.json
- 13:58 kharlan@deploy2002: Started scap sync-world: Backport for hcaptcha: Don't prevent form submissions unless making an edit (T408693)
- 13:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1169.eqiad.wmnet with reason: Upgrading mariadb
- 13:56 marostegui: Install new MariaDB 10.11.15 on db1169 T409533
- 13:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2240 (T407997)', diff saved to https://phabricator.wikimedia.org/P85176 and previous config saved to /var/cache/conftool/dbconfig/20251111-135444-marostegui.json
- 13:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2240.codfw.wmnet with reason: Maintenance
- 13:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2239.codfw.wmnet with reason: Maintenance
- 13:49 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
- 13:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T407997)', diff saved to https://phabricator.wikimedia.org/P85174 and previous config saved to /var/cache/conftool/dbconfig/20251111-134926-marostegui.json
- 13:48 kharlan@deploy2002: Finished scap sync-world: Backport for OutputPage: Export the error message key as a client-side config var (T409431), ext.wikimediaEvents.createAccount: Instrument error page erros (T409431) (duration: 12m 22s)
- 13:43 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
- 13:43 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
- 13:42 kharlan@deploy2002: kharlan: Continuing with sync
- 13:38 kharlan@deploy2002: kharlan: Backport for OutputPage: Export the error message key as a client-side config var (T409431), ext.wikimediaEvents.createAccount: Instrument error page erros (T409431) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:36 kharlan@deploy2002: Started scap sync-world: Backport for OutputPage: Export the error message key as a client-side config var (T409431), ext.wikimediaEvents.createAccount: Instrument error page erros (T409431)
- 13:35 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 13:34 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
- 13:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P85172 and previous config saved to /var/cache/conftool/dbconfig/20251111-133419-marostegui.json
- 13:29 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
- 13:24 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 13:19 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1220 gradually with 4 steps - Pool db1220.eqiad.wmnet in after cloning
- 13:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 13:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P85170 and previous config saved to /var/cache/conftool/dbconfig/20251111-131912-marostegui.json
- 13:09 dreamyjazz@deploy2002: Finished scap sync-world: Backport for CodexTablePager: Only show visible table caption if configured (T409807) (duration: 19m 14s)
- 13:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 13:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T407997)', diff saved to https://phabricator.wikimedia.org/P85169 and previous config saved to /var/cache/conftool/dbconfig/20251111-130405-marostegui.json
- 13:03 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 12:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2237 (T407997)', diff saved to https://phabricator.wikimedia.org/P85168 and previous config saved to /var/cache/conftool/dbconfig/20251111-125720-marostegui.json
- 12:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2237.codfw.wmnet with reason: Maintenance
- 12:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T407997)', diff saved to https://phabricator.wikimedia.org/P85167 and previous config saved to /var/cache/conftool/dbconfig/20251111-125656-marostegui.json
- 12:52 dreamyjazz@deploy2002: dreamyjazz: Backport for CodexTablePager: Only show visible table caption if configured (T409807) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 12:50 dreamyjazz@deploy2002: Started scap sync-world: Backport for CodexTablePager: Only show visible table caption if configured (T409807)
- 12:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS bookworm
- 12:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P85166 and previous config saved to /var/cache/conftool/dbconfig/20251111-124148-marostegui.json
- 12:29 Amir1: ladsgroup@deploy2002:~$ mwscript-k8s --dblist=all --follow -- userOptions.php --delete mfMode
- 12:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P85165 and previous config saved to /var/cache/conftool/dbconfig/20251111-122640-marostegui.json
- 12:22 Amir1: drop database if exists alswikibooks; drop database if exists alswikiquote; drop database if exists alswiktionary; drop database if exists boardvote2005; drop database if exists boardvote2006; drop database if exists boardvote; (T297297)
- 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 12:16 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 12:16 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 12:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T407997)', diff saved to https://phabricator.wikimedia.org/P85164 and previous config saved to /var/cache/conftool/dbconfig/20251111-121133-marostegui.json
- 12:05 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 12:04 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2236 (T407997)', diff saved to https://phabricator.wikimedia.org/P85163 and previous config saved to /var/cache/conftool/dbconfig/20251111-120453-marostegui.json
- 12:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2236.codfw.wmnet with reason: Maintenance
- 12:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T407997)', diff saved to https://phabricator.wikimedia.org/P85162 and previous config saved to /var/cache/conftool/dbconfig/20251111-120429-marostegui.json
- 11:59 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 11:59 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 11:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS bookworm
- 11:58 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2028.codfw.wmnet with OS trixie
- 11:56 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 11:56 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 11:55 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 11:55 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 11:49 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 11:49 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 11:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P85161 and previous config saved to /var/cache/conftool/dbconfig/20251111-114921-marostegui.json
- 11:48 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 11:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P85160 and previous config saved to /var/cache/conftool/dbconfig/20251111-113414-marostegui.json
- 11:34 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 11:33 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 11:31 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 11:31 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 11:26 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 11:26 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 11:21 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2028.codfw.wmnet with OS trixie
- 11:20 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 11:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T407997)', diff saved to https://phabricator.wikimedia.org/P85159 and previous config saved to /var/cache/conftool/dbconfig/20251111-111906-marostegui.json
- 11:16 jynus: reload haproxy on dbproxy1024, dbproxy1029
- 11:13 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 11:12 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 11:12 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 11:12 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 11:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2219 (T407997)', diff saved to https://phabricator.wikimedia.org/P85158 and previous config saved to /var/cache/conftool/dbconfig/20251111-111225-marostegui.json
- 11:12 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 11:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 11:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T407997)', diff saved to https://phabricator.wikimedia.org/P85157 and previous config saved to /var/cache/conftool/dbconfig/20251111-111200-marostegui.json
- 11:11 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 11:10 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 11:10 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 11:06 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 11:05 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- ish: payments-wiki upgraded from 1d4b0d2a to b67adca1
- 11:04 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 11:04 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 11:03 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 10:59 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 10:58 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 10:57 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 10:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P85156 and previous config saved to /var/cache/conftool/dbconfig/20251111-105652-marostegui.json
- 10:54 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 10:52 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:52 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:48 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 10:45 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:44 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:42 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:42 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P85155 and previous config saved to /var/cache/conftool/dbconfig/20251111-104145-marostegui.json
- 10:36 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T407997)', diff saved to https://phabricator.wikimedia.org/P85154 and previous config saved to /var/cache/conftool/dbconfig/20251111-102637-marostegui.json
- 10:25 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:23 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:23 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2210 (T407997)', diff saved to https://phabricator.wikimedia.org/P85153 and previous config saved to /var/cache/conftool/dbconfig/20251111-101956-marostegui.json
- 10:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance
- 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T407997)', diff saved to https://phabricator.wikimedia.org/P85152 and previous config saved to /var/cache/conftool/dbconfig/20251111-101933-marostegui.json
- 10:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P85150 and previous config saved to /var/cache/conftool/dbconfig/20251111-100425-marostegui.json
- 10:01 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2090.codfw.wmnet with OS bullseye
- 09:49 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 09:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P85149 and previous config saved to /var/cache/conftool/dbconfig/20251111-094918-marostegui.json
- 09:49 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 09:48 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 09:47 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 09:47 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 09:47 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 09:46 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 09:46 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 09:45 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 09:45 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 09:45 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 09:45 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 09:36 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.2 refs T408272
- 09:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T407997)', diff saved to https://phabricator.wikimedia.org/P85148 and previous config saved to /var/cache/conftool/dbconfig/20251111-093410-marostegui.json
- 09:31 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 09:31 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P85145 and previous config saved to /var/cache/conftool/dbconfig/20251111-090712-marostegui.json
- 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P85144 and previous config saved to /var/cache/conftool/dbconfig/20251111-085204-marostegui.json
- 08:46 ryankemper@cumin1002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS trixie
- 08:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T407997)', diff saved to https://phabricator.wikimedia.org/P85143 and previous config saved to /var/cache/conftool/dbconfig/20251111-083657-marostegui.json
- 08:29 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2172 (T407997)', diff saved to https://phabricator.wikimedia.org/P85142 and previous config saved to /var/cache/conftool/dbconfig/20251111-082950-marostegui.json
- 08:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 08:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T407997)', diff saved to https://phabricator.wikimedia.org/P85141 and previous config saved to /var/cache/conftool/dbconfig/20251111-082927-marostegui.json
- 08:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P85140 and previous config saved to /var/cache/conftool/dbconfig/20251111-081419-marostegui.json
- 08:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 08:13 moritzm: installing intel-microcode security updates
- 08:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 07:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P85139 and previous config saved to /var/cache/conftool/dbconfig/20251111-075911-marostegui.json
- 07:52 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 07:52 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2028.codfw.wmnet with OS trixie
- 07:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T407997)', diff saved to https://phabricator.wikimedia.org/P85138 and previous config saved to /var/cache/conftool/dbconfig/20251111-074404-marostegui.json
- 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2155 (T407997)', diff saved to https://phabricator.wikimedia.org/P85137 and previous config saved to /var/cache/conftool/dbconfig/20251111-073659-marostegui.json
- 07:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 07:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T407997)', diff saved to https://phabricator.wikimedia.org/P85136 and previous config saved to /var/cache/conftool/dbconfig/20251111-073635-marostegui.json
- 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P85135 and previous config saved to /var/cache/conftool/dbconfig/20251111-072127-marostegui.json
- 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 07:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P85134 and previous config saved to /var/cache/conftool/dbconfig/20251111-070620-marostegui.json
- 06:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1220 - Depool db1220.eqiad.wmnet to then clone it to db1264.eqiad.wmnet - marostegui@cumin1003
- 06:56 marostegui@cumin1003: START - Cookbook sre.mysql.depool db1220 - Depool db1220.eqiad.wmnet to then clone it to db1264.eqiad.wmnet - marostegui@cumin1003
- 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.clone of db1220.eqiad.wmnet onto db1264.eqiad.wmnet
- 06:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Cloning another host
- 06:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Cloning another host
- 06:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T407997)', diff saved to https://phabricator.wikimedia.org/P85132 and previous config saved to /var/cache/conftool/dbconfig/20251111-065112-marostegui.json
- 06:42 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2147 (T407997)', diff saved to https://phabricator.wikimedia.org/P85131 and previous config saved to /var/cache/conftool/dbconfig/20251111-064257-marostegui.json
- 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 06:33 marostegui: Deploy schema change on x1 codfw master with replication T409733
- 06:31 kart_: apertium: staging: Update to 2025-11-10-034557-production (T408515)
- 06:29 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/apertium: apply
- 06:29 kartik@deploy2002: helmfile [staging] START helmfile.d/services/apertium: apply
- 06:24 marostegui: Deploy schema change on x1 codfw master with replication T409101
- 06:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2203.codfw.wmnet with reason: Maintenance
- 05:02 mwpresync@deploy2002: Pruned MediaWiki: 1.45.0-wmf.24 (duration: 02m 30s)
- 04:49 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.2 refs T408272 (duration: 46m 27s)
- 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.2 refs T408272
- 01:19 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 01:18 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 01:01 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 01m 18s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:57 logmsgbot: catrope Deployed security patch for T409743
- 00:22 catrope@deploy2002: Finished scap sync-world: Backport for i18n: Update wikimedia-emailauth-login-help to link to Special:AccountRecovery (T399749), OATHManage: Don't always set the page title to "Create new recovery codes", OATHAuth: Increase 2FA opt-in to 70% of users (T399664) (duration: 10m 56s)
- 00:15 catrope@deploy2002: catrope, mstyles: Continuing with sync
- 00:15 catrope@deploy2002: catrope, mstyles: Backport for i18n: Update wikimedia-emailauth-login-help to link to Special:AccountRecovery (T399749), OATHManage: Don't always set the page title to "Create new recovery codes", OATHAuth: Increase 2FA opt-in to 70% of users (T399664) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can no
- 00:11 catrope@deploy2002: Started scap sync-world: Backport for i18n: Update wikimedia-emailauth-login-help to link to Special:AccountRecovery (T399749), OATHManage: Don't always set the page title to "Create new recovery codes", OATHAuth: Increase 2FA opt-in to 70% of users (T399664)
2025-11-10
- 23:39 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 23:37 catrope@deploy2002: catrope, mstyles: Continuing with sync
- 23:34 catrope@deploy2002: catrope, mstyles: Backport for i18n: Update wikimedia-emailauth-login-help to link to Special:AccountRecovery (T399749), OATHManage: Don't always set the page title to "Create new recovery codes", OATHAuth: Increase 2FA opt-in to 70% of users (T399664) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can no
- 23:08 catrope@deploy2002: Started scap sync-world: Backport for i18n: Update wikimedia-emailauth-login-help to link to Special:AccountRecovery (T399749), OATHManage: Don't always set the page title to "Create new recovery codes", OATHAuth: Increase 2FA opt-in to 70% of users (T399664)
- 22:46 maryum: Deployed fix for T406664 and T401053
- 21:57 catrope@deploy2002: Finished scap sync-world: Backport for Update QuickSurvey platforms (duration: 09m 02s)
- 21:53 catrope@deploy2002: catrope, jdlrobson: Continuing with sync
- 21:51 catrope@deploy2002: catrope, jdlrobson: Backport for Update QuickSurvey platforms synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:48 catrope@deploy2002: Started scap sync-world: Backport for Update QuickSurvey platforms
- 21:45 arlolra@deploy2002: Finished scap sync-world: Backport for Deploy Parsoid Read Views to 13 wikis (T409593) (duration: 08m 14s)
- 21:41 arlolra@deploy2002: arlolra: Continuing with sync
- 21:40 arlolra@deploy2002: arlolra: Backport for Deploy Parsoid Read Views to 13 wikis (T409593) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:37 arlolra@deploy2002: Started scap sync-world: Backport for Deploy Parsoid Read Views to 13 wikis (T409593)
- 21:35 catrope@deploy2002: Finished scap sync-world: Backport for Remove extended autoconfirmed time for Tor on enwiki (T409022) (duration: 08m 19s)
- 21:31 catrope@deploy2002: catrope, pppery: Continuing with sync
- 21:30 catrope@deploy2002: catrope, pppery: Backport for Remove extended autoconfirmed time for Tor on enwiki (T409022) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:28 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Deploy: Minor usability improvements for known-client objects - swfrench@cumin2002"
- 21:28 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Minor usability improvements for known-client objects - swfrench@cumin2002
- 21:27 catrope@deploy2002: Started scap sync-world: Backport for Remove extended autoconfirmed time for Tor on enwiki (T409022)
- 21:27 swfrench@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Minor usability improvements for known-client objects - swfrench@cumin2002
- 21:27 swfrench@cumin2002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Deploy: Minor usability improvements for known-client objects - swfrench@cumin2002"
- 21:21 aude@deploy2002: Finished scap sync-world: Backport for Use addModuleStyles for ReadingList icons (T409116) (duration: 08m 16s)
- 21:16 aude@deploy2002: toyofuku, aude: Continuing with sync
- 21:15 aude@deploy2002: toyofuku, aude: Backport for Use addModuleStyles for ReadingList icons (T409116) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:13 aude@deploy2002: Started scap sync-world: Backport for Use addModuleStyles for ReadingList icons (T409116)
- 20:25 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 20:21 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 19:59 tzatziki: removing 1 file for legal compliance
- 19:54 tzatziki: removing 2 files for legal compliance
- 19:35 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2094.codfw.wmnet with OS bullseye
- 18:58 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2090.codfw.wmnet with reason: host reimage
- 18:54 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2090.codfw.wmnet with reason: host reimage
- 18:45 mutante: destroyed former people.wikimedia.org backends people1004/people2003 - replaced by trixie VMs people1005/people2004
- 18:44 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts people2003.codfw.wmnet
- 18:44 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:44 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: people2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin2002"
- 18:44 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: people2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin2002"
- 18:43 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
- 18:42 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
- 18:40 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 18:35 dzahn@cumin2002: START - Cookbook sre.hosts.decommission for hosts people2003.codfw.wmnet
- 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
- 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
- 18:33 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
- 18:32 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
- 18:30 swfrench@deploy2002: Stopping before sync operations
- 18:29 swfrench@deploy2002: Started scap sync-world: No-deployment scap run to switch mw-(cron|videoscaler) to PHP 8.3 - T405955
- 18:28 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2093.codfw.wmnet with reason: host reimage
- 18:25 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts people1004.eqiad.wmnet
- 18:25 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:25 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: people1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin2002"
- 18:24 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2091.codfw.wmnet with reason: host reimage
- 18:23 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: people1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin2002"
- 18:23 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:23 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:22 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:22 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:20 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:20 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:18 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 18:18 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 18:17 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2093.codfw.wmnet with reason: host reimage
- 18:17 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2091.codfw.wmnet with reason: host reimage
- 18:16 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:16 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:15 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:15 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 18:15 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:15 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2094.codfw.wmnet with OS bullseye
- 18:15 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2093.codfw.wmnet with OS bullseye
- 18:14 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2091.codfw.wmnet with OS bullseye
- 18:14 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2090.codfw.wmnet with OS bullseye
- 18:12 ryankemper: [WDQS] Restarted wdqs-main in codfw
- 18:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:12 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:11 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:11 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:11 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:10 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:10 dzahn@cumin2002: START - Cookbook sre.hosts.decommission for hosts people1004.eqiad.wmnet
- 18:10 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on people1004.eqiad.wmnet with reason: decom
- 18:06 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:05 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:05 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:05 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:04 mmartorana@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 18:04 mmartorana@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 18:04 mmartorana@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 18:04 mmartorana@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 18:03 mmartorana@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 18:03 mmartorana@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 18:03 mmartorana@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 18:02 mmartorana@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 18:02 mmartorana@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 18:01 mmartorana@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 18:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 18:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2181 gradually with 4 steps - Migration of db2181.codfw.wmnet completed
- 17:59 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2094.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:22 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2094.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:16 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2181 gradually with 4 steps - Migration of db2181.codfw.wmnet completed
- 17:02 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2094.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 16:58 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2094.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 16:48 bking@cumin1002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (1 nodes at a time) for ElasticSearch cluster relforge: reboot to apply updates - bking@cumin1002 - T407110
- 16:34 bking@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (1 nodes at a time) for ElasticSearch cluster relforge: reboot to apply updates - bking@cumin1002 - T407110
- 16:34 bking@cumin1002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (1 nodes at a time) for ElasticSearch cluster relforge: reboot to apply updates - bking@cumin1002 - T407110
- 16:32 bking@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (1 nodes at a time) for ElasticSearch cluster relforge: reboot to apply updates - bking@cumin1002 - T407110
- 16:31 bking@cumin1002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: apply updates - bking@cumin1002 - T407110
- 16:31 bking@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: apply updates - bking@cumin1002 - T407110
- 16:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2199.codfw.wmnet with reason: MariaDB upgrade
- 16:11 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2181 - Upgrading db2181.codfw.wmnet
- 16:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2181 - Upgrading db2181.codfw.wmnet
- 16:10 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 16:10 swfrench-wmf: begin rolling restart of codfw-associated confds after conf2006 etcd restart - T352245
- 16:05 swfrench@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Hold deployments during etcd certificate change - T352245 (duration: 11m 38s)
- 15:53 swfrench@deploy2002: Locking from deployment [ALL REPOSITORIES]: Hold deployments during etcd certificate change - T352245
- 15:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 15:34 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 15:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
- 15:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
- 15:25 Amir1: drop database if exists webshop (T297297)
- 15:11 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 15:11 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2167 gradually with 4 steps - Migration of db2167.codfw.wmnet completed
- 15:00 esanders@deploy2002: Finished scap sync-world: Backport for [ptwiki] Add new abusefilter usergroup (T409578) (duration: 07m 37s)
- 14:55 esanders@deploy2002: superpes, esanders: Continuing with sync
- 14:54 esanders@deploy2002: superpes, esanders: Backport for [ptwiki] Add new abusefilter usergroup (T409578) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:52 esanders@deploy2002: Started scap sync-world: Backport for [ptwiki] Add new abusefilter usergroup (T409578)
- 14:50 esanders@deploy2002: Finished scap sync-world: Backport for Create maintenance script to apply manual fixes (T397426) (duration: 07m 21s)
- 14:45 esanders@deploy2002: esanders: Continuing with sync
- 14:45 esanders@deploy2002: esanders: Backport for Create maintenance script to apply manual fixes (T397426) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:43 esanders@deploy2002: Started scap sync-world: Backport for Create maintenance script to apply manual fixes (T397426)
- 14:31 kart_: Update Recommendation API to 2025-11-07-162011-production (T405000, T406854, T408936, T408937, T408934)
- 14:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2167 gradually with 4 steps - Migration of db2167.codfw.wmnet completed
- 14:20 kartik@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:17 esanders@deploy2002: Finished scap sync-world: Backport for Freeze LiquidThreads on enwiktionary (T405080) (duration: 13m 48s)
- 14:15 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:10 esanders@deploy2002: esanders: Continuing with sync
- 14:08 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:08 esanders@deploy2002: esanders: Backport for Freeze LiquidThreads on enwiktionary (T405080) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:06 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 14:04 esanders@deploy2002: Started scap sync-world: Backport for Freeze LiquidThreads on enwiktionary (T405080)
- 14:00 ladsgroup@deploy2002: Finished scap sync-world: Backport for Remove nlwiki exception from thumb limits (T408715) (duration: 40m 54s)
- 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2167 - Upgrading db2167.codfw.wmnet
- 13:51 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2167 - Upgrading db2167.codfw.wmnet
- 13:51 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 13:50 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 13:44 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 13:43 ladsgroup@deploy2002: ladsgroup: Backport for Remove nlwiki exception from thumb limits (T408715) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 13:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2166 gradually with 4 steps - Migration of db2166.codfw.wmnet completed
- 13:41 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 13:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
- 13:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
- 13:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
- 13:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
- 13:19 ladsgroup@deploy2002: Started scap sync-world: Backport for Remove nlwiki exception from thumb limits (T408715)
- 13:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
- 13:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
- 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 13:14 Amir1: mwscript-k8s --dblist=medium --follow -- purgeUserOptions.php --login-age 15 (T406724)
- 13:12 elukey: restart postfix on mx-in1001 to apply an IP ban - T408632
- 13:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 13:11 elukey: restart postfix on mx-in2001 to apply an IP ban - T408632
- 13:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 13:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
- 13:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
- 13:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 13:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 13:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
- 13:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
- 12:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2166 gradually with 4 steps - Migration of db2166.codfw.wmnet completed
- 12:49 Amir1: drop database if exists tokiponawiki; drop database if exists tokiponawikibooks; drop database if exists tokiponawikiquote; drop database if exists tokiponawiktionary; (T297297)
- 12:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 12:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 12:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 12:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 12:43 Amir1: drop database if exists de_labswikimedia; drop database if exists en_labswikimedia; drop database if exists flaggedrevs_labswikimedia; drop database if exists liquidthreads_labswikimedia; drop database if exists readerfeedback_labswikimedia; (T297297)
- 12:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2166 - Upgrading db2166.codfw.wmnet
- 12:28 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2166 - Upgrading db2166.codfw.wmnet
- 12:28 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
- 11:09 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
- 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
- 11:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
- 10:04 slyngshede@dns1004: END - running authdns-update
- 10:03 slyngshede@dns1004: START - running authdns-update
- 10:03 slyngs: Upgrade CAS (idp.wikimedia.org) to version 7.2.7
- 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 09:59 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 09:59 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 09:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 09:48 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 09:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 09:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:48 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 08:48 moritzm: installing Java 8 security updates on Bookworm
- 08:48 moritzm: uploaded openjdk-8 8u472-ga-1~deb12u1 to apt.wikimedia.org (forward port of latest Java 8 security updates)
- 08:21 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 08:19 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 08:19 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 08:07 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Stevemunene out of all services on: 2395 hosts
- 01:42 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 01:42 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 01:42 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 01:41 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 01:38 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 01:36 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 14m 17s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-09
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 20s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-08
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 14m 08s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:49 ryankemper@cumin1002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
2025-11-07
- 23:27 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudlb2003-dev.codfw.wmnet with OS trixie
- 23:04 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudlb2003-dev.codfw.wmnet with reason: host reimage
- 23:00 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudlb2003-dev.codfw.wmnet with reason: host reimage
- 22:43 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudlb2003-dev.codfw.wmnet with OS trixie
- 22:40 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 19:58 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudlb2002-dev.codfw.wmnet with OS trixie
- 19:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1264.eqiad.wmnet with OS bookworm
- 19:57 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:55 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:38 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1264.eqiad.wmnet with reason: host reimage
- 19:32 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1264.eqiad.wmnet with reason: host reimage
- 19:17 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host db1264.eqiad.wmnet with OS bookworm
- 19:16 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1264.eqiad.wmnet with OS bookworm
- 18:50 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host db1264.eqiad.wmnet with OS bookworm
- 18:40 robh: eqiad c/d migration work complete for today
- 18:37 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync
- 18:36 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync
- 18:32 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cirrussearch1122.eqiad.wmnet with reason: C/D Migration
- 18:26 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cirrussearch1121.eqiad.wmnet with reason: C/D Migration
- 18:25 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: sync
- 18:25 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: sync
- 18:24 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: sync
- 18:24 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: sync
- 18:24 cdanis@deploy2002: Finished scap sync-world: Backport for intake-logging EventGate: store x-ja4h req hdr (duration: 07m 02s)
- 18:24 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cirrussearch1120.eqiad.wmnet with reason: C/D Migration
- 18:22 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on dbprov1004.eqiad.wmnet with reason: C/D Migration
- 18:20 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on prometheus1007.eqiad.wmnet with reason: C/D Migration
- 18:20 cdanis@deploy2002: cdanis: Continuing with sync
- 18:20 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1152.eqiad.wmnet with reason: C/D Migration
- 18:19 cdanis@deploy2002: cdanis: Backport for intake-logging EventGate: store x-ja4h req hdr synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1231.eqiad.wmnet with reason: C/D Migration
- 18:17 cdanis@deploy2002: Started scap sync-world: Backport for intake-logging EventGate: store x-ja4h req hdr
- 18:16 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1231.eqiad.wmnet with reason: C/D Migration
- 18:16 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1231.eqiad.wmnet with reason: C/D Migration
- 18:11 robh: eqiad d7 network port migrations in progress
- 18:10 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-be1093.eqiad.wmnet with reason: C/D Migration
- 18:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T407997)', diff saved to https://phabricator.wikimedia.org/P85109 and previous config saved to /var/cache/conftool/dbconfig/20251107-180454-marostegui.json
- 18:04 robh: eqiad d4 migrations done for today
- 17:59 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on moss-fe1002.eqiad.wmnet with reason: C/D Migration
- 17:57 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on kafka-logging1003.eqiad.wmnet with reason: C/D Migration
- 17:57 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cp1114.eqiad.wmnet with reason: C/D Migration
- 17:55 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-be1067.eqiad.wmnet with reason: C/D Migration
- 17:53 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1054.eqiad.wmnet with reason: C/D Migration
- 17:53 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1053.eqiad.wmnet with reason: C/D Migration
- 17:52 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy7002.magru.wmnet with OS trixie
- 17:51 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cp1113.eqiad.wmnet with reason: C/D Migration
- 17:49 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy7001.magru.wmnet with OS trixie
- 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P85108 and previous config saved to /var/cache/conftool/dbconfig/20251107-174946-marostegui.json
- 17:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on thanos-fe1007.eqiad.wmnet with reason: C/D Migration
- 17:48 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy3002.esams.wmnet with OS trixie
- 17:47 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1052.eqiad.wmnet with reason: C/D Migration
- 17:45 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1051.eqiad.wmnet with reason: C/D Migration
- 17:44 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1230.eqiad.wmnet with reason: C/D Migration
- 17:43 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1229.eqiad.wmnet with reason: C/D Migration
- 17:41 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cirrussearch1103.eqiad.wmnet with reason: C/D Migration
- 17:40 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1134.eqiad.wmnet with reason: C/D Migration
- 17:39 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1138.eqiad.wmnet with reason: C/D Migration
- 17:37 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy7002.magru.wmnet with reason: host reimage
- 17:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P85107 and previous config saved to /var/cache/conftool/dbconfig/20251107-173439-marostegui.json
- 17:31 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy7001.magru.wmnet with reason: host reimage
- 17:30 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1137.eqiad.wmnet with reason: C/D Migration
- 17:29 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1136.eqiad.wmnet with reason: C/D Migration
- 17:29 cdanis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy7002.magru.wmnet with reason: host reimage
- 17:28 cdanis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy7001.magru.wmnet with reason: host reimage
- 17:28 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy3002.esams.wmnet with reason: host reimage
- 17:26 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1135.eqiad.wmnet with reason: C/D Migration
- 17:24 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on kubestage1004.eqiad.wmnet with reason: C/D Migration
- 17:23 cdanis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy3002.esams.wmnet with reason: host reimage
- 17:22 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1247.eqiad.wmnet with reason: C/D Migration
- 17:21 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudlb2002-dev.codfw.wmnet with reason: host reimage
- 17:21 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudlb2002-dev.codfw.wmnet with reason: host reimage
- 17:21 robh: eqiad d2 network migrations done for today, moving onto d3
- 17:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T407997)', diff saved to https://phabricator.wikimedia.org/P85106 and previous config saved to /var/cache/conftool/dbconfig/20251107-171931-marostegui.json
- 17:19 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wdqs1017.eqiad.wmnet with reason: C/D Migration
- 17:18 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on relforge1009.eqiad.wmnet with reason: C/D Migration
- 17:16 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-fe1013.eqiad.wmnet with reason: C/D Migration
- 17:15 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1228.eqiad.wmnet with reason: C/D Migration
- 17:14 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1227.eqiad.wmnet with reason: C/D Migration
- 17:13 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-presto1020.eqiad.wmnet with reason: C/D Migration
- 17:10 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on druid1013.eqiad.wmnet with reason: C/D Migration
- 17:09 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-fe1020.eqiad.wmnet with reason: C/D Migration
- 17:08 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1209.eqiad.wmnet with reason: C/D Migration
- 17:06 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wdqs1022.eqiad.wmnet with reason: C/D Migration
- 17:06 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on titan1002.eqiad.wmnet with reason: C/D Migration
- 17:05 robh: eqiad d2 migrations in progress
- 17:04 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on maps1014.eqiad.wmnet with reason: C/D Migration
- 17:03 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudlb2002-dev.codfw.wmnet with OS trixie
- 17:03 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-be1091.eqiad.wmnet with reason: C/D Migration
- 17:02 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy3001.esams.wmnet with OS trixie
- 17:00 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1175.eqiad.wmnet with reason: C/D Migration
- 17:00 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
- 17:00 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 17:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2216 (T407997)', diff saved to https://phabricator.wikimedia.org/P85105 and previous config saved to /var/cache/conftool/dbconfig/20251107-170042-marostegui.json
- 17:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2216.codfw.wmnet with reason: Maintenance
- 17:00 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
- 17:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T407997)', diff saved to https://phabricator.wikimedia.org/P85104 and previous config saved to /var/cache/conftool/dbconfig/20251107-170018-marostegui.json
- 16:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:57 cdanis@cumin1003: START - Cookbook sre.hosts.reimage for host tcp-proxy7002.magru.wmnet with OS trixie
- 16:56 cdanis@cumin1003: START - Cookbook sre.hosts.reimage for host tcp-proxy7001.magru.wmnet with OS trixie
- 16:56 cdanis@cumin1003: START - Cookbook sre.hosts.reimage for host tcp-proxy3002.esams.wmnet with OS trixie
- 16:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 16:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 16:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on cp1112.eqiad.wmnet with reason: C/D Migration
- 16:47 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy3001.esams.wmnet with reason: host reimage
- 16:46 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on bast1003.wikimedia.org with reason: C/D Migration
- 16:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P85103 and previous config saved to /var/cache/conftool/dbconfig/20251107-164510-marostegui.json
- 16:44 cdanis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy3001.esams.wmnet with reason: host reimage
- 16:43 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1182.eqiad.wmnet with reason: C/D Migration
- 16:42 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1172.eqiad.wmnet with reason: C/D Migration
- 16:41 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1253.eqiad.wmnet with reason: C/D Migration
- 16:39 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1263.eqiad.wmnet with reason: C/D Migration
- 16:39 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on es1051.eqiad.wmnet with reason: C/D Migration
- 16:37 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on rdb1012.eqiad.wmnet with reason: C/D Migration
- 16:36 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS bookworm
- 16:34 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS bookworm
- 16:33 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on aqs1014.eqiad.wmnet with reason: C/D Migration
- 16:32 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on restbase1033.eqiad.wmnet with reason: C/D Migration
- 16:32 robh: eqiad row C migrations complete for today, moving onto row D, D1 to start
- 16:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P85102 and previous config saved to /var/cache/conftool/dbconfig/20251107-163003-marostegui.json
- 16:28 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mwlog1002.eqiad.wmnet with reason: C/D Migration
- 16:27 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1171.eqiad.wmnet with reason: C/D Migration
- 16:26 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1170.eqiad.wmnet with reason: C/D Migration
- 16:25 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1220.eqiad.wmnet with reason: C/D Migration
- 16:24 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1219.eqiad.wmnet with reason: C/D Migration
- 16:23 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on dbproxy1024.eqiad.wmnet with reason: C/D Migration
- 16:21 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on pki1002.eqiad.wmnet with reason: C/D Migration
- 16:20 cdanis@cumin1003: START - Cookbook sre.hosts.reimage for host tcp-proxy3001.esams.wmnet with OS trixie
- 16:20 robh: eqiad c/d migration now working rack c6
- 16:19 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1231.eqiad.wmnet with reason: C/D Migration
- 16:18 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1245.eqiad.wmnet with reason: C/D Migration
- 16:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1244.eqiad.wmnet with reason: C/D Migration
- 16:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T407997)', diff saved to https://phabricator.wikimedia.org/P85101 and previous config saved to /var/cache/conftool/dbconfig/20251107-161455-marostegui.json
- 16:10 robh: eqiad c3 network migrations complete for today, moving onto next rack
- 16:09 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy3001.esams.wmnet with OS trixie
- 16:08 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:08 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on krb1002.eqiad.wmnet with reason: C/D Migration
- 16:06 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on dbproxy1029.eqiad.wmnet with reason: C/D Migration
- 16:05 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 16:04 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:03 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1150.eqiad.wmnet with reason: C/D Migration
- 16:03 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1252.eqiad.wmnet with reason: C/D Migration
- 16:02 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 16:01 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:59 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-conf1006.eqiad.wmnet with reason: C/D Migration
- 15:59 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 15:58 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1230.eqiad.wmnet with reason: C/D Migration
- 15:58 dzahn@dns1004: END - running authdns-update
- 15:57 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1167.eqiad.wmnet with reason: C/D Migration
- 15:57 dzahn@dns1004: START - running authdns-update
- 15:57 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1166.eqiad.wmnet with reason: C/D Migration
- 15:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2212 (T407997)', diff saved to https://phabricator.wikimedia.org/P85099 and previous config saved to /var/cache/conftool/dbconfig/20251107-155605-marostegui.json
- 15:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2212.codfw.wmnet with reason: Maintenance
- 15:54 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1180.eqiad.wmnet with reason: C/D Migration
- 15:53 robh: eqiad C3 switch migrations in progress
- 15:52 robh: eqiad C2 switch migrations in progress
- 15:52 cdanis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy3001.esams.wmnet with reason: host reimage
- 15:52 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on es1057.eqiad.wmnet with reason: C/D Migration
- 15:51 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1243.eqiad.wmnet with reason: C/D Migration
- 15:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1242.eqiad.wmnet with reason: C/D Migration
- 15:48 cdanis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy3001.esams.wmnet with reason: host reimage
- 15:47 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc-gp1005.eqiad.wmnet with reason: C/D Migration
- 15:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS trixie
- 15:44 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1132.eqiad.wmnet with reason: C/D Migration
- 15:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2202.codfw.wmnet with reason: Maintenance
- 15:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T407997)', diff saved to https://phabricator.wikimedia.org/P85098 and previous config saved to /var/cache/conftool/dbconfig/20251107-153957-marostegui.json
- 15:38 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1264.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:33 dpogorzelski@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=ml-serve1012.eqiad.wmnet,dc=eqiad,cluster=ml_serve,service=kubesvc
- 15:28 taavi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:28 taavi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add x1/x4 wiki replicas cloudlb addresses - taavi@cumin1003"
- 15:28 taavi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add x1/x4 wiki replicas cloudlb addresses - taavi@cumin1003"
- 15:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P85097 and previous config saved to /var/cache/conftool/dbconfig/20251107-152449-marostegui.json
- 15:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 15:23 taavi@cumin1003: START - Cookbook sre.dns.netbox
- 15:19 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 15:12 cdanis@cumin1003: START - Cookbook sre.hosts.reimage for host tcp-proxy3001.esams.wmnet with OS trixie
- 15:10 cdanis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host tcp-proxy3001.esams.wmnet with OS trixie
- 15:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P85096 and previous config saved to /var/cache/conftool/dbconfig/20251107-150941-marostegui.json
- 15:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host db1264.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:07 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 15:05 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 15:04 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:04 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 15:02 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 15:02 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1264.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 14:59 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
- 14:59 cdanis@cumin1003: START - Cookbook sre.hosts.reimage for host tcp-proxy3001.esams.wmnet with OS trixie
- 14:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host db1264.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T407997)', diff saved to https://phabricator.wikimedia.org/P85095 and previous config saved to /var/cache/conftool/dbconfig/20251107-145434-marostegui.json
- 14:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 14:49 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 14:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 14:44 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 14:42 andrew@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcontrol1008-dev.eqiad.wmnet']
- 14:42 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcontrol1008-dev.eqiad.wmnet']
- 14:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 14:39 andrew@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcontrol1008-dev.eqiad.wmnet']
- 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2188 (T407997)', diff saved to https://phabricator.wikimedia.org/P85093 and previous config saved to /var/cache/conftool/dbconfig/20251107-143657-marostegui.json
- 14:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2188.codfw.wmnet with reason: Maintenance
- 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T407997)', diff saved to https://phabricator.wikimedia.org/P85092 and previous config saved to /var/cache/conftool/dbconfig/20251107-143633-marostegui.json
- 14:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P85091 and previous config saved to /var/cache/conftool/dbconfig/20251107-142125-marostegui.json
- 14:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P85090 and previous config saved to /var/cache/conftool/dbconfig/20251107-140619-marostegui.json
- 13:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T407997)', diff saved to https://phabricator.wikimedia.org/P85089 and previous config saved to /var/cache/conftool/dbconfig/20251107-135111-marostegui.json
- 13:47 dpogorzelski@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=ml-serve1011.eqiad.wmnet,dc=eqiad,cluster=ml_serve,service=kubesvc
- 13:46 marostegui: Deploy schema change on x1 codfw master with replication T409539
- 13:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2176 (T407997)', diff saved to https://phabricator.wikimedia.org/P85088 and previous config saved to /var/cache/conftool/dbconfig/20251107-133002-marostegui.json
- 13:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 13:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T407997)', diff saved to https://phabricator.wikimedia.org/P85087 and previous config saved to /var/cache/conftool/dbconfig/20251107-132938-marostegui.json
- 13:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P85086 and previous config saved to /var/cache/conftool/dbconfig/20251107-131431-marostegui.json
- 13:12 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps-test2006.codfw.wmnet
- 13:05 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 13:01 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 12:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P85085 and previous config saved to /var/cache/conftool/dbconfig/20251107-125923-marostegui.json
- 12:58 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 12:45 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps-test2006.codfw.wmnet
- 12:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T407997)', diff saved to https://phabricator.wikimedia.org/P85084 and previous config saved to /var/cache/conftool/dbconfig/20251107-124415-marostegui.json
- 12:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2174 (T407997)', diff saved to https://phabricator.wikimedia.org/P85083 and previous config saved to /var/cache/conftool/dbconfig/20251107-122347-marostegui.json
- 12:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 12:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T407997)', diff saved to https://phabricator.wikimedia.org/P85082 and previous config saved to /var/cache/conftool/dbconfig/20251107-122324-marostegui.json
- 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps-test2005.codfw.wmnet
- 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 12:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P85081 and previous config saved to /var/cache/conftool/dbconfig/20251107-120816-marostegui.json
- 11:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P85079 and previous config saved to /var/cache/conftool/dbconfig/20251107-115309-marostegui.json
- 11:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T407997)', diff saved to https://phabricator.wikimedia.org/P85078 and previous config saved to /var/cache/conftool/dbconfig/20251107-113801-marostegui.json
- 11:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 11:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2173 (T407997)', diff saved to https://phabricator.wikimedia.org/P85077 and previous config saved to /var/cache/conftool/dbconfig/20251107-111737-marostegui.json
- 11:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 11:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T407997)', diff saved to https://phabricator.wikimedia.org/P85076 and previous config saved to /var/cache/conftool/dbconfig/20251107-111712-marostegui.json
- 11:15 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 11:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps-test2005.codfw.wmnet
- 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 11:05 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
- 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps-test2004.codfw.wmnet
- 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 11:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P85075 and previous config saved to /var/cache/conftool/dbconfig/20251107-110204-marostegui.json
- 10:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:53 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P85074 and previous config saved to /var/cache/conftool/dbconfig/20251107-104657-marostegui.json
- 10:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:45 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps-test2004.codfw.wmnet
- 10:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:43 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2093.codfw.wmnet with OS bullseye
- 10:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps-test2003.codfw.wmnet
- 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:35 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2091.codfw.wmnet with OS bullseye
- 10:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T407997)', diff saved to https://phabricator.wikimedia.org/P85073 and previous config saved to /var/cache/conftool/dbconfig/20251107-103149-marostegui.json
- 10:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:29 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2090.codfw.wmnet with OS bullseye
- 10:29 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 10:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 10:22 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps-test2003.codfw.wmnet
- 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps-test2002.codfw.wmnet
- 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:18 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps-test2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2170 (T407997)', diff saved to https://phabricator.wikimedia.org/P85072 and previous config saved to /var/cache/conftool/dbconfig/20251107-101126-marostegui.json
- 10:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T407997)', diff saved to https://phabricator.wikimedia.org/P85071 and previous config saved to /var/cache/conftool/dbconfig/20251107-101102-marostegui.json
- 10:09 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 10:07 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps-test2002.codfw.wmnet
- 10:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 10:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2164 gradually with 4 steps - Migration of db2164.codfw.wmnet completed
- 09:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P85069 and previous config saved to /var/cache/conftool/dbconfig/20251107-095555-marostegui.json
- 09:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P85067 and previous config saved to /var/cache/conftool/dbconfig/20251107-094047-marostegui.json
- 09:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T407997)', diff saved to https://phabricator.wikimedia.org/P85065 and previous config saved to /var/cache/conftool/dbconfig/20251107-092539-marostegui.json
- 09:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:17 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2164 gradually with 4 steps - Migration of db2164.codfw.wmnet completed
- 09:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2153 (T407997)', diff saved to https://phabricator.wikimedia.org/P85063 and previous config saved to /var/cache/conftool/dbconfig/20251107-090521-marostegui.json
- 09:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 09:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T407997)', diff saved to https://phabricator.wikimedia.org/P85062 and previous config saved to /var/cache/conftool/dbconfig/20251107-090457-marostegui.json
- 08:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 08:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host maps-test2001.codfw.wmnet with OS bookworm
- 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P85061 and previous config saved to /var/cache/conftool/dbconfig/20251107-084949-marostegui.json
- 08:44 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts dbprov1003.eqiad.wmnet
- 08:44 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:44 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 08:40 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 08:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P85060 and previous config saved to /var/cache/conftool/dbconfig/20251107-083442-marostegui.json
- 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on maps-test2001.codfw.wmnet with reason: host reimage
- 08:32 jynus@cumin1003: START - Cookbook sre.dns.netbox
- 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 08:28 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts dbprov1003.eqiad.wmnet
- 08:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on maps-test2001.codfw.wmnet with reason: host reimage
- 08:27 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2164 - Upgrading db2164.codfw.wmnet
- 08:27 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2164 - Upgrading db2164.codfw.wmnet
- 08:27 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 08:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T407997)', diff saved to https://phabricator.wikimedia.org/P85058 and previous config saved to /var/cache/conftool/dbconfig/20251107-081934-marostegui.json
- 08:06 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host maps-test2001.codfw.wmnet with OS bookworm
- 08:00 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbprov2003.codfw.wmnet
- 08:00 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:00 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 07:59 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbprov2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 07:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2146 (T407997)', diff saved to https://phabricator.wikimedia.org/P85057 and previous config saved to /var/cache/conftool/dbconfig/20251107-075857-marostegui.json
- 07:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 07:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T407997)', diff saved to https://phabricator.wikimedia.org/P85056 and previous config saved to /var/cache/conftool/dbconfig/20251107-075833-marostegui.json
- 07:50 jynus@cumin1003: START - Cookbook sre.dns.netbox
- 07:45 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts dbprov2003.codfw.wmnet
- 07:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P85055 and previous config saved to /var/cache/conftool/dbconfig/20251107-074326-marostegui.json
- 07:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P85054 and previous config saved to /var/cache/conftool/dbconfig/20251107-072818-marostegui.json
- 07:27 moritzm: fix failed logrotation on install1005
- 07:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T407997)', diff saved to https://phabricator.wikimedia.org/P85053 and previous config saved to /var/cache/conftool/dbconfig/20251107-071310-marostegui.json
- 06:52 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Rate-limit by wmfuniq fix, conftool 6 - oblivian@cumin1003"
- 06:52 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Rate-limit by wmfuniq fix, conftool 6 - oblivian@cumin1003
- 06:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2145 (T407997)', diff saved to https://phabricator.wikimedia.org/P85052 and previous config saved to /var/cache/conftool/dbconfig/20251107-065226-marostegui.json
- 06:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 06:51 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Rate-limit by wmfuniq fix, conftool 6 - oblivian@cumin1003
- 06:51 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Rate-limit by wmfuniq fix, conftool 6 - oblivian@cumin1003"
- 06:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 03:06 ryankemper@cumin1002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 03:05 tstarling@deploy2002: Finished scap sync-world: Backport for Add English translations to namespaces that lack them (T407127), Set robot noindex policy for draft namespaces that lacked it (T407127) (duration: 09m 58s)
- 02:58 tstarling@deploy2002: tstarling: Continuing with sync
- 02:57 tstarling@deploy2002: tstarling: Backport for Add English translations to namespaces that lack them (T407127), Set robot noindex policy for draft namespaces that lacked it (T407127) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 02:55 tstarling@deploy2002: Started scap sync-world: Backport for Add English translations to namespaces that lack them (T407127), Set robot noindex policy for draft namespaces that lacked it (T407127)
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 34s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-06
- 23:56 zabe@deploy2002: Finished scap sync-world: Backport for Update for new WikimediaMaintenance script locations (duration: 07m 15s)
- 23:51 zabe@deploy2002: zabe: Continuing with sync
- 23:51 zabe@deploy2002: zabe: Backport for Update for new WikimediaMaintenance script locations synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:48 zabe@deploy2002: Started scap sync-world: Backport for Update for new WikimediaMaintenance script locations
- 23:44 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
- 23:43 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
- 23:13 cjming: end of UTC late backport window
- 23:11 cjming@deploy2002: Finished scap sync-world: Backport for Use wikimedia.org as the "server" for the wiki-agnostic RESTbase specs, Use prefixed 'sub' field in OAuth 2 access tokens on beta cluster (T399199), Re-run xLab MW Module Loaded experiment v2 (T401705) (duration: 08m 34s)
- 23:06 cjming@deploy2002: cjming, tgr, aaron: Continuing with sync
- 23:04 cjming@deploy2002: cjming, tgr, aaron: Backport for Use wikimedia.org as the "server" for the wiki-agnostic RESTbase specs, Use prefixed 'sub' field in OAuth 2 access tokens on beta cluster (T399199), Re-run xLab MW Module Loaded experiment v2 (T401705) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:02 cjming@deploy2002: Started scap sync-world: Backport for Use wikimedia.org as the "server" for the wiki-agnostic RESTbase specs, Use prefixed 'sub' field in OAuth 2 access tokens on beta cluster (T399199), Re-run xLab MW Module Loaded experiment v2 (T401705)
- 22:49 catrope@deploy2002: Finished scap sync-world: Backport for AccountRecovery: Use canonical URL in confirmation email, Enable Special:AccountRecovery everywhere (T399742) (duration: 10m 24s)
- 22:46 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 22:42 catrope@deploy2002: catrope: Continuing with sync
- 22:40 catrope@deploy2002: catrope: Backport for AccountRecovery: Use canonical URL in confirmation email, Enable Special:AccountRecovery everywhere (T399742) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:40 ryankemper@cumin1002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 22:38 catrope@deploy2002: Started scap sync-world: Backport for AccountRecovery: Use canonical URL in confirmation email, Enable Special:AccountRecovery everywhere (T399742)
- 22:36 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 22:34 kemayo@deploy2002: Finished scap sync-world: Backport for Edit check: allow any check to be an a/b test including default ones (T406134), Enable editcheck addReference a/b test on enwiki (T406134) (duration: 13m 52s)
- 22:27 kemayo@deploy2002: kemayo: Continuing with sync
- 22:24 kemayo@deploy2002: kemayo: Backport for Edit check: allow any check to be an a/b test including default ones (T406134), Enable editcheck addReference a/b test on enwiki (T406134) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:20 kemayo@deploy2002: Started scap sync-world: Backport for Edit check: allow any check to be an a/b test including default ones (T406134), Enable editcheck addReference a/b test on enwiki (T406134)
- 21:59 ladsgroup@deploy2002: Finished scap sync-world: Backport for Revert "BacklinkCache: Switch order between pr_cascade and links queries", Revert "RestrictionStore: Switch order between pr_cascade and links queries" (duration: 55m 26s)
- 21:55 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:55 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:38 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 21:29 ladsgroup@deploy2002: ladsgroup: Backport for Revert "BacklinkCache: Switch order between pr_cascade and links queries", Revert "RestrictionStore: Switch order between pr_cascade and links queries" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:13 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:13 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added VIP for tcpproxy service in eqiad - dzahn@cumin2002"
- 21:13 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added VIP for tcpproxy service in eqiad - dzahn@cumin2002"
- 21:08 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 21:07 dzahn@dns1004: END - running authdns-update
- 21:04 ladsgroup@deploy2002: Started scap sync-world: Backport for Revert "BacklinkCache: Switch order between pr_cascade and links queries", Revert "RestrictionStore: Switch order between pr_cascade and links queries"
- 21:03 dzahn@dns1004: START - running authdns-update
- 20:57 eileen: civicrm upgraded from 75455a21 to 0f69c4eb
- 20:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T407997)', diff saved to https://phabricator.wikimedia.org/P85050 and previous config saved to /var/cache/conftool/dbconfig/20251106-204120-marostegui.json
- 20:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P85049 and previous config saved to /var/cache/conftool/dbconfig/20251106-202612-marostegui.json
- 20:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P85048 and previous config saved to /var/cache/conftool/dbconfig/20251106-201105-marostegui.json
- 19:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T407997)', diff saved to https://phabricator.wikimedia.org/P85047 and previous config saved to /var/cache/conftool/dbconfig/20251106-195557-marostegui.json
- 19:55 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2092.codfw.wmnet with OS bullseye
- 19:55 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 19:53 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 19:44 andrew@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcontrol1008-dev.eqiad.wmnet']
- 19:43 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 19:39 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2093.codfw.wmnet with reason: host reimage
- 19:39 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on lsw1-d6-eqiad,lsw1-d6-eqiad IPv6,lsw1-d6-eqiad.mgmt with reason: told switch to reboot and its stuck in UEFI shell
- 19:37 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2092.codfw.wmnet with reason: host reimage
- 19:37 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1251 (T407997)', diff saved to https://phabricator.wikimedia.org/P85046 and previous config saved to /var/cache/conftool/dbconfig/20251106-193705-marostegui.json
- 19:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1251.eqiad.wmnet with reason: Maintenance
- 19:34 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2093.codfw.wmnet with reason: host reimage
- 19:34 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2092.codfw.wmnet with reason: host reimage
- 19:33 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2091.codfw.wmnet with reason: host reimage
- 19:31 swfrench-wmf: rolling run-puppet-agent on A:cp hosts for haproxy config change
- 19:29 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2091.codfw.wmnet with reason: host reimage
- 19:27 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2090.codfw.wmnet with reason: host reimage
- 19:21 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2090.codfw.wmnet with reason: host reimage
- 19:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 19:19 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS trixie
- 19:18 swfrench-wmf: disable-puppet on A:cp hosts for haproxy config change
- 19:15 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.1 refs T408271
- 19:06 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wdqs1013.eqiad.wmnet with reason: C/D Migration
- 19:05 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on wcqs1003.eqiad.wmnet with reason: C/D Migration
- 19:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 19:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T407997)', diff saved to https://phabricator.wikimedia.org/P85045 and previous config saved to /var/cache/conftool/dbconfig/20251106-190506-marostegui.json
- 19:02 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on puppetserver1001.eqiad.wmnet with reason: C/D Migration
- 18:57 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-test-worker1002.eqiad.wmnet with reason: C/D Migration
- 18:55 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on sessionstore1005.eqiad.wmnet with reason: C/D Migration
- 18:53 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on es1045.eqiad.wmnet with reason: C/D Migration
- 18:52 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2093.codfw.wmnet with OS bullseye
- 18:52 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2092.codfw.wmnet with OS bullseye
- 18:52 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2091.codfw.wmnet with OS bullseye
- 18:51 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2090.codfw.wmnet with OS bullseye
- 18:51 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1262.eqiad.wmnet with reason: C/D Migration
- 18:51 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-be2093']
- 18:51 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-be2092']
- 18:51 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-be2091']
- 18:50 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be2093']
- 18:50 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-be2090']
- 18:50 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be2092']
- 18:50 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be2091']
- 18:50 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be2090']
- 18:50 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1218.eqiad.wmnet with reason: C/D Migration
- 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P85044 and previous config saved to /var/cache/conftool/dbconfig/20251106-184958-marostegui.json
- 18:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1217.eqiad.wmnet with reason: C/D Migration
- 18:46 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1169.eqiad.wmnet with reason: C/D Migration
- 18:44 robh: C5 eqiad c/d server switch migrations in progress
- 18:44 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on db1168.eqiad.wmnet with reason: C/D Migration
- 18:43 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2093.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:43 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2090.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:43 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2092.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:42 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2091.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:41 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on aqs1018.eqiad.wmnet with reason: C/D Migration
- 18:38 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on krb1002.eqiad.wmnet with reason: C/D Migration
- 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P85043 and previous config saved to /var/cache/conftool/dbconfig/20251106-183452-marostegui.json
- 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-wikifunctions: apply
- 18:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-wikifunctions: apply
- 18:28 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1048.eqiad.wmnet with reason: C/D Migration
- 18:27 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1047.eqiad.wmnet with reason: C/D Migration
- 18:27 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2093.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:26 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2092.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:26 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1046.eqiad.wmnet with reason: C/D Migration
- 18:25 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2091.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:25 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2090.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:24 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on mc1045.eqiad.wmnet with reason: C/D Migration
- 18:23 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: apply
- 18:22 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: apply
- 18:20 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on kafka-logging1002.eqiad.wmnet with reason: C/D Migration
- 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T407997)', diff saved to https://phabricator.wikimedia.org/P85042 and previous config saved to /var/cache/conftool/dbconfig/20251106-181944-marostegui.json
- 18:18 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on maps1013.eqiad.wmnet with reason: C/D Migration
- 18:18 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:18 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:18 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:17 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on druid1012.eqiad.wmnet with reason: C/D Migration
- 18:16 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:15 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:15 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:15 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:15 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1234.eqiad.wmnet with reason: C/D Migration
- 18:14 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1221.eqiad.wmnet with reason: C/D Migration
- 18:12 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1220.eqiad.wmnet with reason: C/D Migration
- 18:11 swfrench@deploy2002: Stopping before sync operations
- 18:10 swfrench@deploy2002: Started scap sync-world: No-deployment scap run to switch mw-wikifunctions to PHP 8.3 - T405955
- 18:10 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 18:10 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 18:10 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:10 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:09 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 18:09 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:09 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on an-worker1132.eqiad.wmnet with reason: C/D Migration
- 18:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:09 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 18:09 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:06 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:06 robh: Rack C2 C/D switch migrations in progress
- 18:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:05 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-be1092.eqiad.wmnet with reason: C/D Migration
- 18:04 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-be1086.eqiad.wmnet with reason: C/D Migration
- 18:02 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 18:02 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 18:01 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on ms-be1066.eqiad.wmnet with reason: C/D Migration
- 18:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1235 (T407997)', diff saved to https://phabricator.wikimedia.org/P85041 and previous config saved to /var/cache/conftool/dbconfig/20251106-180052-marostegui.json
- 18:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
- 18:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T407997)', diff saved to https://phabricator.wikimedia.org/P85040 and previous config saved to /var/cache/conftool/dbconfig/20251106-180028-marostegui.json
- 17:53 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on apus-fe1003.eqiad.wmnet with reason: C/D Migration
- 17:51 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 17:51 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2154 gradually with 4 steps - Migration of db2154.codfw.wmnet completed
- 17:50 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts people2003.codfw.wmnet
- 17:47 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2091.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:46 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2092.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:46 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2093.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:46 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2090.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P85038 and previous config saved to /var/cache/conftool/dbconfig/20251106-174521-marostegui.json
- 17:42 dzahn@cumin2002: START - Cookbook sre.hosts.decommission for hosts people2003.codfw.wmnet
- 17:39 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on people1004.eqiad.wmnet with reason: decom
- 17:39 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on people2003.codfw.wmnet with reason: decom
- 17:38 mutante: shutting down people1004 and people2003 - had already shut them down on Oct 29 but someone or something booted them again T408713
- 17:37 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cp1110.eqiad.wmnet with reason: C/D Migration
- 17:33 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on mc1050.eqiad.wmnet with reason: C/D Migration
- 17:31 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on mc1049.eqiad.wmnet with reason: C/D Migration
- 17:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P85036 and previous config saved to /var/cache/conftool/dbconfig/20251106-173013-marostegui.json
- 17:28 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on wdqs1014.eqiad.wmnet with reason: C/D Migration
- 17:25 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on relforge1008.eqiad.wmnet with reason: C/D Migration
- 17:23 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be2094.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:21 robh: multiple moves from C/D per T405942
- 17:19 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1119.eqiad.wmnet with reason: C/D Migration
- 17:19 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1118.eqiad.wmnet with reason: C/D Migration
- 17:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1081.eqiad.wmnet with reason: C/D Migration
- 17:16 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1080.eqiad.wmnet with reason: C/D Migration
- 17:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T407997)', diff saved to https://phabricator.wikimedia.org/P85034 and previous config saved to /var/cache/conftool/dbconfig/20251106-171505-marostegui.json
- 17:14 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on ml-cache1002.eqiad.wmnet with reason: C/D Migration
- 17:13 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2094.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:12 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2093.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:12 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1223.eqiad.wmnet with reason: C/D Migration
- 17:12 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2092.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:11 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2091.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:11 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2090.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:10 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1222.eqiad.wmnet with reason: C/D Migration
- 17:09 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2094
- 17:08 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2094
- 17:08 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2093
- 17:08 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2093
- 17:08 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2092
- 17:08 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2092
- 17:08 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2091
- 17:08 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2091
- 17:08 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2090
- 17:07 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2090
- 17:07 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:07 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2090-4 to codfw - jhancock@cumin1003"
- 17:07 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-be2090-4 to codfw - jhancock@cumin1003"
- 17:06 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2154 gradually with 4 steps - Migration of db2154.codfw.wmnet completed
- 17:03 jhancock@cumin1003: START - Cookbook sre.dns.netbox
- 17:01 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-presto1019.eqiad.wmnet with reason: C/D Migration
- 16:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1234 (T407997)', diff saved to https://phabricator.wikimedia.org/P85032 and previous config saved to /var/cache/conftool/dbconfig/20251106-165631-marostegui.json
- 16:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1234.eqiad.wmnet with reason: Maintenance
- 16:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T407997)', diff saved to https://phabricator.wikimedia.org/P85031 and previous config saved to /var/cache/conftool/dbconfig/20251106-165607-marostegui.json
- 16:55 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on ms-fe1011.eqiad.wmnet with reason: C/D Migration
- 16:52 jynus: drop backup grants from m* section primaries T403166
- 16:52 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on prometheus1008.eqiad.wmnet with reason: C/D Migration
- 16:49 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1226.eqiad.wmnet with reason: C/D Migration
- 16:47 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1225.eqiad.wmnet with reason: C/D Migration
- 16:45 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1224.eqiad.wmnet with reason: C/D Migration
- 16:43 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1180.eqiad.wmnet with reason: C/D Migration
- 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P85030 and previous config saved to /var/cache/conftool/dbconfig/20251106-164100-marostegui.json
- 16:40 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1151.eqiad.wmnet with reason: C/D Migration
- 16:39 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2154 - Upgrading db2154.codfw.wmnet
- 16:38 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2154 - Upgrading db2154.codfw.wmnet
- 16:38 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 16:37 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on an-worker1133.eqiad.wmnet with reason: C/D Migration
- 16:35 urbanecm@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
- 16:34 urbanecm@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
- 16:32 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on ms-fe1019.eqiad.wmnet with reason: C/D Migration
- 16:28 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on ms-be1082.eqiad.wmnet with reason: C/D Migration
- 16:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P85028 and previous config saved to /var/cache/conftool/dbconfig/20251106-162552-marostegui.json
- 16:23 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on dbprov1003.eqiad.wmnet with reason: C/D Migration
- 16:21 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1088.eqiad.wmnet with reason: C/D Migration
- 16:18 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1087.eqiad.wmnet with reason: C/D Migration
- 16:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1083.eqiad.wmnet with reason: C/D Migration
- 16:17 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1083.eqiad.wmnet with reason: C/D Migration
- 16:16 moritzm: installing sysstat security updates
- 16:16 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists boardvotetest and boardvote2007_test; (T297297)
- 16:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T407997)', diff saved to https://phabricator.wikimedia.org/P85027 and previous config saved to /var/cache/conftool/dbconfig/20251106-161045-marostegui.json
- 16:09 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cirrussearch1082.eqiad.wmnet with reason: C/D Migration
- 16:04 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cp1109.eqiad.wmnet with reason: C/D Migration
- 16:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 16:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2163 gradually with 4 steps - Migration of db2163.codfw.wmnet completed
- 16:02 ejegg: payments-wiki upgraded from c2a4b377 to 1d4b0d2a
- 15:55 dancy@deploy2002: Installation of scap version "4.225.0" completed for 2 hosts
- 15:53 dancy@deploy2002: Installing scap version "4.225.0" for 2 host(s)
- 15:52 robh: cp1108 moving as part of migration
- 15:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1232 (T407997)', diff saved to https://phabricator.wikimedia.org/P85025 and previous config saved to /var/cache/conftool/dbconfig/20251106-155207-marostegui.json
- 15:52 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on cp1108.eqiad.wmnet with reason: C/D Migration
- 15:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1232.eqiad.wmnet with reason: Maintenance
- 15:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T407997)', diff saved to https://phabricator.wikimedia.org/P85024 and previous config saved to /var/cache/conftool/dbconfig/20251106-155143-marostegui.json
- 15:49 jynus: drop grants for dbprov1003 & dbprov2003 T403166
- 15:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P85022 and previous config saved to /var/cache/conftool/dbconfig/20251106-153636-marostegui.json
- 15:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P85019 and previous config saved to /var/cache/conftool/dbconfig/20251106-152129-marostegui.json
- 15:18 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2163 gradually with 4 steps - Migration of db2163.codfw.wmnet completed
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c1a-eqiad
- 15:10 jmm@cumin2002: START - Cookbook sre.network.tls for network device fasw2-c1a-eqiad
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-eqsin
- 15:10 jmm@cumin2002: START - Cookbook sre.network.tls for network device cr2-eqsin
- 15:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T407997)', diff saved to https://phabricator.wikimedia.org/P85017 and previous config saved to /var/cache/conftool/dbconfig/20251106-150622-marostegui.json
- 15:03 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-e1-eqiad
- 15:03 jmm@cumin2002: START - Cookbook sre.network.tls for network device ssw1-e1-eqiad
- 15:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-f1-eqiad
- 15:02 jmm@cumin2002: START - Cookbook sre.network.tls for network device ssw1-f1-eqiad
- 15:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a2-codfw
- 15:02 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-a2-codfw
- 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a3-codfw
- 15:00 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-a3-codfw
- 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a4-codfw
- 15:00 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-a4-codfw
- 14:59 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a6-codfw
- 14:59 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-a6-codfw
- 14:59 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a5-codfw
- 14:59 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-a5-codfw
- 14:57 tappof: bump space for prometheus k8s-dse in eqiad
- 14:55 Lucas_WMDE: UTC afternoon backport+config window done
- 14:55 Lucas_WMDE: Deployed security patch for T409423
- 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change dns for row c gateway interfaces eqiad CRs - cmooney@cumin1003"
- 14:54 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change dns for row c gateway interfaces eqiad CRs - cmooney@cumin1003"
- 14:53 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a8-codfw
- 14:53 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-a8-codfw
- 14:53 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-a7-codfw
- 14:53 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-a7-codfw
- 14:50 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2163 - Upgrading db2163.codfw.wmnet
- 14:50 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2163 - Upgrading db2163.codfw.wmnet
- 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 14:47 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 14:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1219 (T407997)', diff saved to https://phabricator.wikimedia.org/P85015 and previous config saved to /var/cache/conftool/dbconfig/20251106-144714-marostegui.json
- 14:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 14:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T407997)', diff saved to https://phabricator.wikimedia.org/P85014 and previous config saved to /var/cache/conftool/dbconfig/20251106-144650-marostegui.json
- 14:46 logmsgbot: lucaswerkmeister-wmde Deployed security patch for T409423
- 14:40 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b2-codfw
- 14:40 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b2-codfw
- 14:40 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b5-codfw
- 14:39 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b5-codfw
- 14:39 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b3-codfw
- 14:39 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b3-codfw
- 14:37 moritzm: installing bind security updates (client-side tools/libs only)
- 14:36 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b4-codfw
- 14:36 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b4-codfw
- 14:35 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b6-codfw
- 14:35 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b6-codfw
- 14:35 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b8-codfw
- 14:35 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b8-codfw
- 14:34 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-b7-codfw
- 14:34 elukey@cumin1003: START - Cookbook sre.network.tls for network device lsw1-b7-codfw
- 14:34 lucaswerkmeister-wmde@deploy2002: mwscript-k8s job started: namespaceDupes tcywiki --fix # T328207
- 14:34 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-a1-codfw
- 14:34 elukey@cumin1003: START - Cookbook sre.network.tls for network device ssw1-a1-codfw
- 14:33 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [tcywiki] Add Portal and Draft namespaces and its talk (T409329) (duration: 08m 12s)
- 14:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 14:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2152 gradually with 4 steps - Migration of db2152.codfw.wmnet completed
- 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P85012 and previous config saved to /var/cache/conftool/dbconfig/20251106-143142-marostegui.json
- 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-a8-codfw
- 14:31 elukey@cumin1003: START - Cookbook sre.network.tls for network device ssw1-a8-codfw
- 14:29 lucaswerkmeister-wmde@deploy2002: superpes, lucaswerkmeister-wmde: Continuing with sync
- 14:27 topranks: move private1-c-eqiad sub-interface from ae3 to et-1/0/5 on cr1-eqiad (T405579)
- 14:27 lucaswerkmeister-wmde@deploy2002: superpes, lucaswerkmeister-wmde: Backport for [tcywiki] Add Portal and Draft namespaces and its talk (T409329) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:25 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [tcywiki] Add Portal and Draft namespaces and its talk (T409329)
- 14:23 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 14:23 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Update types for WatchArticleHook/UnwatchArticleHook, LQT Import: Fix quadratic time explosion in finding next offset (T405080) (duration: 07m 26s)
- 14:23 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 14:20 topranks: move private1-c-eqiad sub-interface from ae3 to et-1/0/5 on cr2-eqiad (T405579)
- 14:19 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 14:19 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 14:19 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, tchanders: Continuing with sync
- 14:18 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, tchanders: Backport for Update types for WatchArticleHook/UnwatchArticleHook, LQT Import: Fix quadratic time explosion in finding next offset (T405080) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P85010 and previous config saved to /var/cache/conftool/dbconfig/20251106-141635-marostegui.json
- 14:16 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Update types for WatchArticleHook/UnwatchArticleHook, LQT Import: Fix quadratic time explosion in finding next offset (T405080)
- 14:13 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for EventStreamConfig: Remove mediawiki.reference_previews stream (T242127), EventStreamConfig: Remove mediawiki.wikistories_* streams (T408178) (duration: 09m 12s)
- 14:09 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:09 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, phuedx: Continuing with sync
- 14:09 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:07 topranks: move public1-c-eqiad sub-interface from ae3 to et-1/0/5 on cr1-eqiad (T405579)
- 14:07 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, phuedx: Backport for EventStreamConfig: Remove mediawiki.reference_previews stream (T242127), EventStreamConfig: Remove mediawiki.wikistories_* streams (T408178) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:04 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for EventStreamConfig: Remove mediawiki.reference_previews stream (T242127), EventStreamConfig: Remove mediawiki.wikistories_* streams (T408178)
- 14:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T407997)', diff saved to https://phabricator.wikimedia.org/P85008 and previous config saved to /var/cache/conftool/dbconfig/20251106-140127-marostegui.json
- 14:00 topranks: move public1-c-eqiad sub-interface from ae3 to et-1/0/5 on cr2-eqiad (T405579)
- 13:57 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2152 gradually with 4 steps - Migration of db2152.codfw.wmnet completed
- 13:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1218 (T407997)', diff saved to https://phabricator.wikimedia.org/P85005 and previous config saved to /var/cache/conftool/dbconfig/20251106-134013-marostegui.json
- 13:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 13:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T407997)', diff saved to https://phabricator.wikimedia.org/P85004 and previous config saved to /var/cache/conftool/dbconfig/20251106-133949-marostegui.json
- 13:36 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2152 - Upgrading db2152.codfw.wmnet
- 13:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P85002 and previous config saved to /var/cache/conftool/dbconfig/20251106-132442-marostegui.json
- 13:20 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 13:20 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 13:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 13:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 13:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P85001 and previous config saved to /var/cache/conftool/dbconfig/20251106-130934-marostegui.json
- 13:09 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2152 - Upgrading db2152.codfw.wmnet
- 13:08 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 12:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T407997)', diff saved to https://phabricator.wikimedia.org/P85000 and previous config saved to /var/cache/conftool/dbconfig/20251106-125427-marostegui.json
- 12:36 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1206 (T407997)', diff saved to https://phabricator.wikimedia.org/P84999 and previous config saved to /var/cache/conftool/dbconfig/20251106-123507-marostegui.json
- 12:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 12:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T407997)', diff saved to https://phabricator.wikimedia.org/P84998 and previous config saved to /var/cache/conftool/dbconfig/20251106-123444-marostegui.json
- 12:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P84996 and previous config saved to /var/cache/conftool/dbconfig/20251106-121937-marostegui.json
- 12:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P84995 and previous config saved to /var/cache/conftool/dbconfig/20251106-120429-marostegui.json
- 11:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T407997)', diff saved to https://phabricator.wikimedia.org/P84994 and previous config saved to /var/cache/conftool/dbconfig/20251106-114921-marostegui.json
- 11:39 cmooney@cumin1003: START - Cookbook sre.hosts.dhcp for host sretest1005.eqiad.wmnet
- 11:38 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 11:37 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 11:37 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 11:36 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 11:36 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host sretest1006.eqiad.wmnet
- 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1196 (T407997)', diff saved to https://phabricator.wikimedia.org/P84993 and previous config saved to /var/cache/conftool/dbconfig/20251106-112910-marostegui.json
- 11:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 11:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 11:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T407997)', diff saved to https://phabricator.wikimedia.org/P84992 and previous config saved to /var/cache/conftool/dbconfig/20251106-112827-marostegui.json
- 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P84990 and previous config saved to /var/cache/conftool/dbconfig/20251106-111319-marostegui.json
- 10:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P84989 and previous config saved to /var/cache/conftool/dbconfig/20251106-105812-marostegui.json
- 10:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T407997)', diff saved to https://phabricator.wikimedia.org/P84988 and previous config saved to /var/cache/conftool/dbconfig/20251106-104304-marostegui.json
- 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1195 (T407997)', diff saved to https://phabricator.wikimedia.org/P84986 and previous config saved to /var/cache/conftool/dbconfig/20251106-101954-marostegui.json
- 10:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1195.eqiad.wmnet with reason: Maintenance
- 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T407997)', diff saved to https://phabricator.wikimedia.org/P84985 and previous config saved to /var/cache/conftool/dbconfig/20251106-101929-marostegui.json
- 10:15 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.clone (exit_code=97) of db1176.eqiad.wmnet onto db2230.codfw.wmnet
- 10:05 brouberol@dns1004: END - running authdns-update
- 10:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P84984 and previous config saved to /var/cache/conftool/dbconfig/20251106-100421-marostegui.json
- 10:04 brouberol@dns1004: START - running authdns-update
- 09:59 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db1176.eqiad.wmnet onto db2230.codfw.wmnet
- 09:57 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db1176.eqiad.wmnet onto db2230.codfw.wmnet
- 09:56 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db1176.eqiad.wmnet onto db2230.codfw.wmnet
- 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps1009.eqiad.wmnet
- 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1009.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:53 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 09:53 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 09:52 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 09:51 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 09:51 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 09:50 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 09:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:50 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 09:50 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 09:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P84983 and previous config saved to /var/cache/conftool/dbconfig/20251106-094914-marostegui.json
- 09:40 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps1009.eqiad.wmnet
- 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps1010.eqiad.wmnet
- 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:37 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T407997)', diff saved to https://phabricator.wikimedia.org/P84982 and previous config saved to /var/cache/conftool/dbconfig/20251106-093406-marostegui.json
- 09:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:32 mszwarc@deploy2002: Finished scap sync-world: Backport for Revert "Use OutputPageBeforeHTML instead of BeforePageDisplay to add modules" (T409367) (duration: 08m 52s)
- 09:28 mszwarc@deploy2002: mszwarc: Continuing with sync
- 09:27 mszwarc@deploy2002: mszwarc: Backport for Revert "Use OutputPageBeforeHTML instead of BeforePageDisplay to add modules" (T409367) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:27 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps1010.eqiad.wmnet
- 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps1008.eqiad.wmnet
- 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1008.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:24 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1008.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:24 brouberol@dns1004: END - running authdns-update
- 09:24 mszwarc@deploy2002: Started scap sync-world: Backport for Revert "Use OutputPageBeforeHTML instead of BeforePageDisplay to add modules" (T409367)
- 09:23 brouberol@dns1004: START - running authdns-update
- 09:20 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:20 brouberol@dns1004: END - running authdns-update
- 09:20 elukey: upgrade python3-conftool and spicerack on cumin1003 and cumin2002 hosts
- 09:19 brouberol@dns1004: START - running authdns-update
- 09:19 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 09:18 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1186 (T407997)', diff saved to https://phabricator.wikimedia.org/P84981 and previous config saved to /var/cache/conftool/dbconfig/20251106-091401-marostegui.json
- 09:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 09:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T407997)', diff saved to https://phabricator.wikimedia.org/P84980 and previous config saved to /var/cache/conftool/dbconfig/20251106-091337-marostegui.json
- 09:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:12 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps1008.eqiad.wmnet
- 09:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps1007.eqiad.wmnet
- 09:11 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1007.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1007.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 09:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 09:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P84979 and previous config saved to /var/cache/conftool/dbconfig/20251106-085830-marostegui.json
- 08:56 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps1007.eqiad.wmnet
- 08:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps1006.eqiad.wmnet
- 08:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1006.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1006.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 08:51 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 08:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:44 dcausse: UTC morning backport window done
- 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P84978 and previous config saved to /var/cache/conftool/dbconfig/20251106-084322-marostegui.json
- 08:42 dcausse@deploy2002: Finished scap sync-world: Backport for "hide logged in users" is no longer working with "non-JavaScript interface" (T409157), cirrus: enable default_sort on en, fr and he wikipedias (T404858), cirrus: enable alt index with default_sort on a set of wikis (T404858) (duration: 12m 49s)
- 08:41 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps1006.eqiad.wmnet
- 08:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps1005.eqiad.wmnet
- 08:41 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:40 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:37 dcausse@deploy2002: dcausse, tstarling: Continuing with sync
- 08:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:32 dcausse@deploy2002: dcausse, tstarling: Backport for "hide logged in users" is no longer working with "non-JavaScript interface" (T409157), cirrus: enable default_sort on en, fr and he wikipedias (T404858), cirrus: enable alt index with default_sort on a set of wikis (T404858) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes c
- 08:30 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps1005.eqiad.wmnet
- 08:29 dcausse@deploy2002: Started scap sync-world: Backport for "hide logged in users" is no longer working with "non-JavaScript interface" (T409157), cirrus: enable default_sort on en, fr and he wikipedias (T404858), cirrus: enable alt index with default_sort on a set of wikis (T404858)
- 08:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T407997)', diff saved to https://phabricator.wikimedia.org/P84977 and previous config saved to /var/cache/conftool/dbconfig/20251106-082814-marostegui.json
- 08:23 brouberol@dns1004: END - running authdns-update
- 08:22 brouberol@dns1004: START - running authdns-update
- 08:22 brouberol@dns1004: START - running authdns-update
- 08:21 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts maps2009.codfw.wmnet
- 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1193 gradually with 4 steps - Repooling after upgrade
- 08:13 kharlan@deploy2002: Finished scap sync-world: Backport for Allow temporary accounts to create in fawiki/enwiki Draft namespace (T409366) (duration: 10m 07s)
- 08:09 kharlan@deploy2002: kharlan, novemlinguae: Continuing with sync
- 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1169 (T407997)', diff saved to https://phabricator.wikimedia.org/P84975 and previous config saved to /var/cache/conftool/dbconfig/20251106-080746-marostegui.json
- 08:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T407997)', diff saved to https://phabricator.wikimedia.org/P84974 and previous config saved to /var/cache/conftool/dbconfig/20251106-080723-marostegui.json
- 08:06 kharlan@deploy2002: kharlan, novemlinguae: Backport for Allow temporary accounts to create in fawiki/enwiki Draft namespace (T409366) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:05 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps2009.codfw.wmnet
- 08:03 kharlan@deploy2002: Started scap sync-world: Backport for Allow temporary accounts to create in fawiki/enwiki Draft namespace (T409366)
- 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P84972 and previous config saved to /var/cache/conftool/dbconfig/20251106-075215-marostegui.json
- 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P84970 and previous config saved to /var/cache/conftool/dbconfig/20251106-073707-marostegui.json
- 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1193 gradually with 4 steps - Repooling after upgrade
- 07:28 musikanimal@deploy2002: Finished scap sync-world: Backport for Hide the WikiEditor search button (duration: 107m 34s)
- 07:23 musikanimal@deploy2002: musikanimal: Continuing with sync
- 07:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1193 - Depool db1193 for migration to mariadb 10.11
- 07:22 marostegui@cumin1003: START - Cookbook sre.mysql.depool db1193 - Depool db1193 for migration to mariadb 10.11
- 07:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1193.eqiad.wmnet with reason: Maintenance
- 07:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T407997)', diff saved to https://phabricator.wikimedia.org/P84968 and previous config saved to /var/cache/conftool/dbconfig/20251106-072200-marostegui.json
- 07:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1193 T409299', diff saved to https://phabricator.wikimedia.org/P84967 and previous config saved to /var/cache/conftool/dbconfig/20251106-071949-marostegui.json
- 07:19 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1209 to s8 primary T409299', diff saved to https://phabricator.wikimedia.org/P84966 and previous config saved to /var/cache/conftool/dbconfig/20251106-071911-marostegui.json
- 07:18 marostegui: Starting s8 eqiad failover from db1193 to db1209 - T409299
- 07:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s8 T409299
- 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1209 with weight 0 T409299', diff saved to https://phabricator.wikimedia.org/P84965 and previous config saved to /var/cache/conftool/dbconfig/20251106-071506-marostegui.json
- 07:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1163 (T407997)', diff saved to https://phabricator.wikimedia.org/P84964 and previous config saved to /var/cache/conftool/dbconfig/20251106-070128-marostegui.json
- 07:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 06:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2205.codfw.wmnet with reason: Maintenance
- 05:42 musikanimal@deploy2002: musikanimal: Backport for Hide the WikiEditor search button synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 05:40 musikanimal@deploy2002: Started scap sync-world: Backport for Hide the WikiEditor search button
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 32s)
- 01:01 wfan: civicrm upgraded from f1f68f1c to 75455a21
- 01:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:49 ryankemper@cumin1002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 00:37 eileen: civicrm upgraded from 0f49dd1d to f1f68f1c
- 00:29 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1262.eqiad.wmnet with reason: HW issues, T409374
- 00:27 cdobbins@cumin2002: dbctl commit (dc=all): 'Depool db1262', diff saved to https://phabricator.wikimedia.org/P84962 and previous config saved to /var/cache/conftool/dbconfig/20251106-002737-cdobbins.json
2025-11-05
- 23:35 larssandergreen: civicrm upgraded from 40198c3f to 0f49dd1d
- 23:26 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-misc: apply
- 23:25 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-misc: apply
- 23:19 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-misc: apply
- 23:19 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-misc: apply
- 23:17 swfrench@deploy2002: Stopping before sync operations
- 23:17 swfrench@deploy2002: Started scap sync-world: No-deployment scap run to switch mw-misc to PHP 8.3 - T405955
- 23:00 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 23:00 ryankemper@cumin1002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 23:00 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
- 22:58 ryankemper@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=cirrussearch2089.codfw.wmnet
- 22:48 ryankemper@cumin1002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart (test new spicerack version) - ryankemper@cumin1002 - T390860
- 22:22 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart (test new spicerack version) - ryankemper@cumin1002 - T390860
- 22:19 ryankemper@cumin1002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_eqiad
- 22:19 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_eqiad
- 22:19 ryankemper@cumin1002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cirrussearch1068.eqiad.wmnet for test new spicerack elasticsearch library - ryankemper@cumin1002 - T390860
- 22:19 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.ban Banning hosts: cirrussearch1068.eqiad.wmnet for test new spicerack elasticsearch library - ryankemper@cumin1002 - T390860
- 22:18 ryankemper@cumin1002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cirrussearch1068.eqiad.wmnet for test new spicerack elasticsearch library - ryankemper@cumin1002 - T390860
- 22:18 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.ban Banning hosts: cirrussearch1068.eqiad.wmnet for test new spicerack elasticsearch library - ryankemper@cumin1002 - T390860
- 22:17 ryankemper@cumin1002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: foobar1001.eqiad.wmnet for test new spicerack elasticsearch library - ryankemper@cumin1002 - T390860
- 22:17 ryankemper@cumin1002: START - Cookbook sre.elasticsearch.ban Banning hosts: foobar1001.eqiad.wmnet for test new spicerack elasticsearch library - ryankemper@cumin1002 - T390860
- 22:14 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy7001.magru.wmnet with OS trixie
- 22:13 ryankemper: [WDQS] Restarting blazegraph across all codfw `wdqs-main` hosts, hoping it resolves the lag issues although it's likely that it won't
- 22:12 ryankemper: T366248 `sudo rm -rfv /srv/dumps/xmldatadumps/public/other/cirrus_search_index/cirrus-search-index/` on `clouddumps100[1,2].wikimedia.org`
- 21:58 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy7001.magru.wmnet with reason: host reimage
- 21:53 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy7001.magru.wmnet with reason: host reimage
- 21:51 catrope@deploy2002: Finished scap sync-world: Backport for Configure HTTP proxy for EmailAuth AccountRecovery (T399742) (duration: 08m 01s)
- 21:47 catrope@deploy2002: catrope: Continuing with sync
- 21:46 catrope@deploy2002: catrope: Backport for Configure HTTP proxy for EmailAuth AccountRecovery (T399742) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:43 catrope@deploy2002: Started scap sync-world: Backport for Configure HTTP proxy for EmailAuth AccountRecovery (T399742)
- 21:37 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2239.codfw.wmnet with reason: Maintenance
- 21:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T407997)', diff saved to https://phabricator.wikimedia.org/P84960 and previous config saved to /var/cache/conftool/dbconfig/20251105-213734-marostegui.json
- 21:30 catrope@deploy2002: Finished scap sync-world: Backport for Set up Special:AccountRecovery and enable on testwiki (T399742) (duration: 08m 21s)
- 21:25 catrope@deploy2002: catrope: Continuing with sync
- 21:25 larssandergreen: civicrm upgraded from 8efb2be1 to 40198c3f
- 21:24 catrope@deploy2002: catrope: Backport for Set up Special:AccountRecovery and enable on testwiki (T399742) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:23 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy7001.magru.wmnet with OS trixie
- 21:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P84958 and previous config saved to /var/cache/conftool/dbconfig/20251105-212226-marostegui.json
- 21:21 catrope@deploy2002: Started scap sync-world: Backport for Set up Special:AccountRecovery and enable on testwiki (T399742)
- 21:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P84957 and previous config saved to /var/cache/conftool/dbconfig/20251105-210718-marostegui.json
- 20:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T407997)', diff saved to https://phabricator.wikimedia.org/P84955 and previous config saved to /var/cache/conftool/dbconfig/20251105-205211-marostegui.json
- 20:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2227 (T407997)', diff saved to https://phabricator.wikimedia.org/P84953 and previous config saved to /var/cache/conftool/dbconfig/20251105-203438-marostegui.json
- 20:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2227.codfw.wmnet with reason: Maintenance
- 20:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T407997)', diff saved to https://phabricator.wikimedia.org/P84952 and previous config saved to /var/cache/conftool/dbconfig/20251105-203413-marostegui.json
- 20:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P84951 and previous config saved to /var/cache/conftool/dbconfig/20251105-201905-marostegui.json
- 20:09 dancy@deploy2002: Installation of scap version "4.224.0" completed for 2 hosts
- 20:07 dancy@deploy2002: Installing scap version "4.224.0" for 2 host(s)
- 20:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P84950 and previous config saved to /var/cache/conftool/dbconfig/20251105-200357-marostegui.json
- 19:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T407997)', diff saved to https://phabricator.wikimedia.org/P84949 and previous config saved to /var/cache/conftool/dbconfig/20251105-194850-marostegui.json
- 19:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2209 (T407997)', diff saved to https://phabricator.wikimedia.org/P84948 and previous config saved to /var/cache/conftool/dbconfig/20251105-193126-marostegui.json
- 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2209.codfw.wmnet with reason: Maintenance
- 19:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T407997)', diff saved to https://phabricator.wikimedia.org/P84947 and previous config saved to /var/cache/conftool/dbconfig/20251105-193102-marostegui.json
- 19:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P84946 and previous config saved to /var/cache/conftool/dbconfig/20251105-191553-marostegui.json
- 19:13 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.1 refs T408271
- 19:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P84945 and previous config saved to /var/cache/conftool/dbconfig/20251105-190046-marostegui.json
- 18:55 larssandergreen: tools upgraded from 8e3ed11c to 773e8d11
- 18:52 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 18:51 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 18:46 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 18:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T407997)', diff saved to https://phabricator.wikimedia.org/P84944 and previous config saved to /var/cache/conftool/dbconfig/20251105-184538-marostegui.json
- 18:45 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 18:42 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 18:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 18:40 swfrench@deploy2002: Stopping before sync operations
- 18:39 swfrench@deploy2002: Started scap sync-world: No-deployment scap run to switch mw-parsoid to PHP 8.3 - T405955
- 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2194 (T407997)', diff saved to https://phabricator.wikimedia.org/P84943 and previous config saved to /var/cache/conftool/dbconfig/20251105-182805-marostegui.json
- 18:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2194.codfw.wmnet with reason: Maintenance
- 18:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T407997)', diff saved to https://phabricator.wikimedia.org/P84942 and previous config saved to /var/cache/conftool/dbconfig/20251105-182741-marostegui.json
- 18:25 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:25 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:25 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:25 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:20 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P84941 and previous config saved to /var/cache/conftool/dbconfig/20251105-181233-marostegui.json
- 18:08 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:07 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:07 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 17:58 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 17:58 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2161 gradually with 4 steps - Migration of db2161.codfw.wmnet completed
- 17:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P84939 and previous config saved to /var/cache/conftool/dbconfig/20251105-175726-marostegui.json
- 17:53 ejegg: donorwiki upgraded from 8fe00530 to c2a4b377
- 17:53 ejegg: payments-wiki upgraded from 8fe00530 to c2a4b377
- 17:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T407997)', diff saved to https://phabricator.wikimedia.org/P84937 and previous config saved to /var/cache/conftool/dbconfig/20251105-174218-marostegui.json
- 17:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T407997)', diff saved to https://phabricator.wikimedia.org/P84935 and previous config saved to /var/cache/conftool/dbconfig/20251105-172347-marostegui.json
- 17:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2190.codfw.wmnet with reason: Maintenance
- 17:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T407997)', diff saved to https://phabricator.wikimedia.org/P84934 and previous config saved to /var/cache/conftool/dbconfig/20251105-172324-marostegui.json
- 17:12 fceratto@cumin1003: START - Cookbook sre.mysql.pool db2161 gradually with 4 steps - Migration of db2161.codfw.wmnet completed
- 17:10 swfrench-wmf: rolling run-puppet-agent on A:cp hosts for haproxy config change
- 17:08 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 17:08 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 17:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P84932 and previous config saved to /var/cache/conftool/dbconfig/20251105-170816-marostegui.json
- 17:08 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 17:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 17:00 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 17:00 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 17:00 swfrench-wmf: disable-puppet on A:cp hosts for haproxy config change
- 16:58 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 16:58 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 16:58 dancy@deploy2002: Installation of scap version "4.223.0" completed for 2 hosts
- 16:56 dancy@deploy2002: Installing scap version "4.223.0" for 2 host(s)
- 16:55 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Update BetaFeatures comments (duration: 07m 38s)
- 16:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P84931 and previous config saved to /var/cache/conftool/dbconfig/20251105-165309-marostegui.json
- 16:51 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Continuing with sync
- 16:50 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Update BetaFeatures comments synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:48 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Update BetaFeatures comments
- 16:47 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 16:47 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 16:47 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 16:47 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:47 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 16:46 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:44 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:44 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:42 papaul_: pfw1a/b-codfw Junos downgrade complete
- 16:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T407997)', diff saved to https://phabricator.wikimedia.org/P84930 and previous config saved to /var/cache/conftool/dbconfig/20251105-163801-marostegui.json
- 16:36 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2161 - Upgrading db2161.codfw.wmnet
- 16:36 fceratto@cumin1003: START - Cookbook sre.mysql.depool db2161 - Upgrading db2161.codfw.wmnet
- 16:35 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 16:31 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 16:30 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 16:29 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 16:29 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 16:27 javiermonton@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 16:27 javiermonton@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 16:27 javiermonton@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 16:27 javiermonton@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
- 16:26 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:26 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:24 topranks: add peering to NL-ix route servers from drmrs T386986
- 16:21 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:21 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:20 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T407997)', diff saved to https://phabricator.wikimedia.org/P84928 and previous config saved to /var/cache/conftool/dbconfig/20251105-162055-marostegui.json
- 16:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2177.codfw.wmnet with reason: Maintenance
- 16:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T407997)', diff saved to https://phabricator.wikimedia.org/P84927 and previous config saved to /var/cache/conftool/dbconfig/20251105-162032-marostegui.json
- 16:15 javiermonton@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 16:14 javiermonton@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 16:12 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:12 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:08 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:08 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:06 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
- 16:06 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
- 16:05 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 16:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P84926 and previous config saved to /var/cache/conftool/dbconfig/20251105-160523-marostegui.json
- 16:05 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 16:04 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 16:04 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 16:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2087.codfw.wmnet with OS bullseye
- 16:02 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:02 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:00 papaul_: ongoing pfw1b-codfw Junos downgrade
- 15:58 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:58 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P84924 and previous config saved to /var/cache/conftool/dbconfig/20251105-155015-marostegui.json
- 15:47 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pfw1-codfw with reason: pfw1a/b-codfw
- 15:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2087.codfw.wmnet with reason: host reimage
- 15:46 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 15:45 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:45 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add ipv6 reverse dns for nl-ix port marseille - cmooney@cumin1003"
- 15:45 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 15:45 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add ipv6 reverse dns for nl-ix port marseille - cmooney@cumin1003"
- 15:45 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:45 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 15:45 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:45 moritzm: running racadm racreset on maps2009
- 15:45 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 15:45 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 15:44 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 15:44 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 15:44 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2087.codfw.wmnet with reason: host reimage
- 15:43 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 15:43 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 15:43 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 15:39 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2008.codfw.wmnet
- 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2008.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 15:38 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 15:38 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 15:36 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2008.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 15:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T407997)', diff saved to https://phabricator.wikimedia.org/P84922 and previous config saved to /var/cache/conftool/dbconfig/20251105-153508-marostegui.json
- 15:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 15:31 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2087.codfw.wmnet with OS bullseye
- 15:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2086.codfw.wmnet with OS bullseye
- 15:30 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:29 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:27 marostegui@cumin1003: dbctl commit (dc=all): 'db1209 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84921 and previous config saved to /var/cache/conftool/dbconfig/20251105-152716-root.json
- 15:27 jforrester@deploy2002: Finished scap sync-world: Backport for Enable embedded Wikifunctions calls on bnwiki and seven Wiktionaries (T406342) (duration: 09m 35s)
- 15:26 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:26 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:26 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:25 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:24 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps2008.codfw.wmnet
- 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2007.codfw.wmnet
- 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2007.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 15:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2007.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 15:20 jforrester@deploy2002: jforrester: Continuing with sync
- 15:19 jforrester@deploy2002: jforrester: Backport for Enable embedded Wikifunctions calls on bnwiki and seven Wiktionaries (T406342) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T407997)', diff saved to https://phabricator.wikimedia.org/P84920 and previous config saved to /var/cache/conftool/dbconfig/20251105-151802-marostegui.json
- 15:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T407997)', diff saved to https://phabricator.wikimedia.org/P84919 and previous config saved to /var/cache/conftool/dbconfig/20251105-151738-marostegui.json
- 15:17 jforrester@deploy2002: Started scap sync-world: Backport for Enable embedded Wikifunctions calls on bnwiki and seven Wiktionaries (T406342)
- 15:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 15:15 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:14 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:14 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2086.codfw.wmnet with reason: host reimage
- 15:13 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:13 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:13 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:12 marostegui@cumin1003: dbctl commit (dc=all): 'db1209 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84918 and previous config saved to /var/cache/conftool/dbconfig/20251105-151210-root.json
- 15:11 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps2007.codfw.wmnet
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2006.codfw.wmnet
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 15:10 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:10 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:10 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 15:09 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2086.codfw.wmnet with reason: host reimage
- 15:09 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:08 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:07 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:07 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:06 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P84917 and previous config saved to /var/cache/conftool/dbconfig/20251105-150230-marostegui.json
- 15:02 Lucas_WMDE: UTC afternoon backport+config window done
- 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'db1209 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84916 and previous config saved to /var/cache/conftool/dbconfig/20251105-145704-root.json
- 14:56 arthurtaylor@deploy2002: Finished scap sync-world: Backport for Revert "Enable the MEX / wbui2025 beta feature on testwikidata" (T407737) (duration: 08m 23s)
- 14:56 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2086.codfw.wmnet with OS bullseye
- 14:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2085.codfw.wmnet with OS bullseye
- 14:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 (T403362)', diff saved to https://phabricator.wikimedia.org/P84915 and previous config saved to /var/cache/conftool/dbconfig/20251105-145457-ladsgroup.json
- 14:52 arthurtaylor@deploy2002: arthurtaylor: Continuing with sync
- 14:50 arthurtaylor@deploy2002: arthurtaylor: Backport for Revert "Enable the MEX / wbui2025 beta feature on testwikidata" (T407737) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:48 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:48 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:48 arthurtaylor@deploy2002: Started scap sync-world: Backport for Revert "Enable the MEX / wbui2025 beta feature on testwikidata" (T407737)
- 14:47 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1220* gradually with 4 steps - Work done
- 14:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P84913 and previous config saved to /var/cache/conftool/dbconfig/20251105-144723-marostegui.json
- 14:45 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:45 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:45 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:45 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:42 arthurtaylor@deploy2002: Sync cancelled.
- 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'db1209 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84912 and previous config saved to /var/cache/conftool/dbconfig/20251105-144158-root.json
- 14:41 elukey: uploaded spicerack_12.0.0 to apt.wikimedia.org bullseye-wikimedia,bookworm-wikimedia
- 14:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P84911 and previous config saved to /var/cache/conftool/dbconfig/20251105-143949-ladsgroup.json
- 14:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2085.codfw.wmnet with reason: host reimage
- 14:35 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
- 14:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1209 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84910 and previous config saved to /var/cache/conftool/dbconfig/20251105-143419-marostegui.json
- 14:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1209.eqiad.wmnet with reason: Maintenance
- 14:33 arthurtaylor@deploy2002: arthurtaylor: Backport for Enable the MEX / wbui2025 beta feature on testwikidata (T407737) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T407997)', diff saved to https://phabricator.wikimedia.org/P84908 and previous config saved to /var/cache/conftool/dbconfig/20251105-143215-marostegui.json
- 14:31 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2085.codfw.wmnet with reason: host reimage
- 14:30 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:30 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 14:30 arthurtaylor@deploy2002: Started scap sync-world: Backport for Enable the MEX / wbui2025 beta feature on testwikidata (T407737)
- 14:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P84907 and previous config saved to /var/cache/conftool/dbconfig/20251105-142441-ladsgroup.json
- 14:24 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
- 14:24 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
- 14:24 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
- 14:19 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2085.codfw.wmnet with OS bullseye
- 14:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2149 (T407997)', diff saved to https://phabricator.wikimedia.org/P84905 and previous config saved to /var/cache/conftool/dbconfig/20251105-141507-marostegui.json
- 14:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 14:12 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists jamestemp; (T297297)
- 14:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 14:09 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 (T403362)', diff saved to https://phabricator.wikimedia.org/P84904 and previous config saved to /var/cache/conftool/dbconfig/20251105-140934-ladsgroup.json
- 14:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 14:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 14:02 ladsgroup@cumin1003: START - Cookbook sre.mysql.pool db1220* gradually with 4 steps - Work done
- 14:00 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps2006.codfw.wmnet
- 13:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 13:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T407997)', diff saved to https://phabricator.wikimedia.org/P84902 and previous config saved to /var/cache/conftool/dbconfig/20251105-135831-marostegui.json
- 13:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2005.codfw.wmnet
- 13:53 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 13:53 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 13:50 Amir1: cumin2024@db2205.codfw.wmnet[(none)]> drop database if exists katesdb; (T297297)
- 13:48 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 13:44 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps2005.codfw.wmnet
- 13:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P84901 and previous config saved to /var/cache/conftool/dbconfig/20251105-134323-marostegui.json
- 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2010.codfw.wmnet
- 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 13:39 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: maps2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 13:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 13:28 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts maps2010.codfw.wmnet
- 13:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P84900 and previous config saved to /var/cache/conftool/dbconfig/20251105-132816-marostegui.json
- 13:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T407997)', diff saved to https://phabricator.wikimedia.org/P84899 and previous config saved to /var/cache/conftool/dbconfig/20251105-131308-marostegui.json
- 13:07 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1223 (T407997)', diff saved to https://phabricator.wikimedia.org/P84898 and previous config saved to /var/cache/conftool/dbconfig/20251105-130750-marostegui.json
- 13:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1223.eqiad.wmnet with reason: Maintenance
- 13:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T407997)', diff saved to https://phabricator.wikimedia.org/P84897 and previous config saved to /var/cache/conftool/dbconfig/20251105-130726-marostegui.json
- 13:05 brouberol@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 9 hosts with reason: rebalancing
- 12:55 marostegui: Deploy schema change on s3 master for vewikimedia T409282 T396130
- 12:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P84896 and previous config saved to /var/cache/conftool/dbconfig/20251105-125219-marostegui.json
- 12:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P84895 and previous config saved to /var/cache/conftool/dbconfig/20251105-123711-marostegui.json
- 12:28 marostegui@cumin1003: dbctl commit (dc=all): 'db1167 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84894 and previous config saved to /var/cache/conftool/dbconfig/20251105-122828-root.json
- 12:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T407997)', diff saved to https://phabricator.wikimedia.org/P84893 and previous config saved to /var/cache/conftool/dbconfig/20251105-122203-marostegui.json
- 12:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T407997)', diff saved to https://phabricator.wikimedia.org/P84892 and previous config saved to /var/cache/conftool/dbconfig/20251105-121647-marostegui.json
- 12:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 12:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 12:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T407997)', diff saved to https://phabricator.wikimedia.org/P84891 and previous config saved to /var/cache/conftool/dbconfig/20251105-121616-marostegui.json
- 12:13 marostegui@cumin1003: dbctl commit (dc=all): 'db1167 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84890 and previous config saved to /var/cache/conftool/dbconfig/20251105-121323-root.json
- 12:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P84889 and previous config saved to /var/cache/conftool/dbconfig/20251105-120108-marostegui.json
- 11:58 marostegui@cumin1003: dbctl commit (dc=all): 'db1167 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84888 and previous config saved to /var/cache/conftool/dbconfig/20251105-115817-root.json
- 11:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2215 (T403362)', diff saved to https://phabricator.wikimedia.org/P84887 and previous config saved to /var/cache/conftool/dbconfig/20251105-115437-ladsgroup.json
- 11:54 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
- 11:52 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
- 11:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P84886 and previous config saved to /var/cache/conftool/dbconfig/20251105-114600-marostegui.json
- 11:43 marostegui@cumin1003: dbctl commit (dc=all): 'db1167 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84885 and previous config saved to /var/cache/conftool/dbconfig/20251105-114311-root.json
- 11:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1167 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84884 and previous config saved to /var/cache/conftool/dbconfig/20251105-113522-marostegui.json
- 11:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 11:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Migration
- 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T407997)', diff saved to https://phabricator.wikimedia.org/P84883 and previous config saved to /var/cache/conftool/dbconfig/20251105-113053-marostegui.json
- 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1198 (T407997)', diff saved to https://phabricator.wikimedia.org/P84882 and previous config saved to /var/cache/conftool/dbconfig/20251105-112556-marostegui.json
- 11:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T407997)', diff saved to https://phabricator.wikimedia.org/P84881 and previous config saved to /var/cache/conftool/dbconfig/20251105-112532-marostegui.json
- 11:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P84880 and previous config saved to /var/cache/conftool/dbconfig/20251105-111025-marostegui.json
- 11:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-launcher1002.eqiad.wmnet
- 11:04 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:04 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-launcher1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
- 11:04 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-launcher1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
- 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 100%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84879 and previous config saved to /var/cache/conftool/dbconfig/20251105-110000-root.json
- 10:59 btullis@cumin1003: START - Cookbook sre.dns.netbox
- 10:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P84878 and previous config saved to /var/cache/conftool/dbconfig/20251105-105517-marostegui.json
- 10:45 btullis@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-launcher1002.eqiad.wmnet
- 10:44 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 75%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84876 and previous config saved to /var/cache/conftool/dbconfig/20251105-104454-root.json
- 10:41 btullis@deploy2002: Finished deploy [analytics/refinery@39e92e9]: Updating the deployment on an-launcher1003 (duration: 01m 06s)
- 10:40 btullis@deploy2002: Started deploy [analytics/refinery@39e92e9]: Updating the deployment on an-launcher1003
- 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T407997)', diff saved to https://phabricator.wikimedia.org/P84875 and previous config saved to /var/cache/conftool/dbconfig/20251105-104010-marostegui.json
- 10:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T407997)', diff saved to https://phabricator.wikimedia.org/P84874 and previous config saved to /var/cache/conftool/dbconfig/20251105-103513-marostegui.json
- 10:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 10:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T407997)', diff saved to https://phabricator.wikimedia.org/P84873 and previous config saved to /var/cache/conftool/dbconfig/20251105-103449-marostegui.json
- 10:29 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 60%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84872 and previous config saved to /var/cache/conftool/dbconfig/20251105-102948-root.json
- 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P84871 and previous config saved to /var/cache/conftool/dbconfig/20251105-101942-marostegui.json
- 10:14 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 50%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84870 and previous config saved to /var/cache/conftool/dbconfig/20251105-101442-root.json
- 10:06 moritzm: disabling Puppet on buster maps nodes for pending decom T381565
- 10:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P84869 and previous config saved to /var/cache/conftool/dbconfig/20251105-100434-marostegui.json
- 09:59 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 40%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84868 and previous config saved to /var/cache/conftool/dbconfig/20251105-095936-root.json
- 09:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T407997)', diff saved to https://phabricator.wikimedia.org/P84867 and previous config saved to /var/cache/conftool/dbconfig/20251105-094926-marostegui.json
- 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1166 (T407997)', diff saved to https://phabricator.wikimedia.org/P84866 and previous config saved to /var/cache/conftool/dbconfig/20251105-094431-marostegui.json
- 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 30%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84865 and previous config saved to /var/cache/conftool/dbconfig/20251105-094431-root.json
- 09:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T407997)', diff saved to https://phabricator.wikimedia.org/P84864 and previous config saved to /var/cache/conftool/dbconfig/20251105-094408-marostegui.json
- 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 25%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84863 and previous config saved to /var/cache/conftool/dbconfig/20251105-092925-root.json
- 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P84862 and previous config saved to /var/cache/conftool/dbconfig/20251105-092859-marostegui.json
- 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P84861 and previous config saved to /var/cache/conftool/dbconfig/20251105-091438-root.json
- 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 20%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84860 and previous config saved to /var/cache/conftool/dbconfig/20251105-091419-root.json
- 09:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P84859 and previous config saved to /var/cache/conftool/dbconfig/20251105-091352-marostegui.json
- 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P84858 and previous config saved to /var/cache/conftool/dbconfig/20251105-085932-root.json
- 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 15%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84857 and previous config saved to /var/cache/conftool/dbconfig/20251105-085913-root.json
- 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T407997)', diff saved to https://phabricator.wikimedia.org/P84856 and previous config saved to /var/cache/conftool/dbconfig/20251105-085844-marostegui.json
- 08:53 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T407997)', diff saved to https://phabricator.wikimedia.org/P84855 and previous config saved to /var/cache/conftool/dbconfig/20251105-085347-marostegui.json
- 08:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 08:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook: apply
- 08:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook: apply
- 08:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 08:44 marostegui@cumin1003: dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P84854 and previous config saved to /var/cache/conftool/dbconfig/20251105-084426-root.json
- 08:44 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 10%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84853 and previous config saved to /var/cache/conftool/dbconfig/20251105-084407-root.json
- 08:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2218.codfw.wmnet with reason: Maintenance
- 08:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 08:29 marostegui@cumin1003: dbctl commit (dc=all): 'db1174 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P84852 and previous config saved to /var/cache/conftool/dbconfig/20251105-082920-root.json
- 08:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T407997)', diff saved to https://phabricator.wikimedia.org/P84851 and previous config saved to /var/cache/conftool/dbconfig/20251105-082642-marostegui.json
- 08:25 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (T407997)', diff saved to https://phabricator.wikimedia.org/P84850 and previous config saved to /var/cache/conftool/dbconfig/20251105-082533-marostegui.json
- 08:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 08:22 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 5%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84849 and previous config saved to /var/cache/conftool/dbconfig/20251105-082209-root.json
- 08:21 Emperor: run gitlab-package-puller by hand on apt-staging2001
- 08:13 brouberol@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on an-launcher1002.eqiad.wmnet with reason: host is being decommissioned
- 08:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T407997)', diff saved to https://phabricator.wikimedia.org/P84848 and previous config saved to /var/cache/conftool/dbconfig/20251105-080849-marostegui.json
- 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 4%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84847 and previous config saved to /var/cache/conftool/dbconfig/20251105-080702-root.json
- 08:00 marostegui@cumin1003: dbctl commit (dc=all): 'db2212 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84846 and previous config saved to /var/cache/conftool/dbconfig/20251105-080027-root.json
- 07:59 eileen: civicrm upgraded from 1eeb1a46 to 8efb2be1
- 07:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P84845 and previous config saved to /var/cache/conftool/dbconfig/20251105-075341-marostegui.json
- 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 3%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84844 and previous config saved to /var/cache/conftool/dbconfig/20251105-075156-root.json
- 07:45 marostegui@cumin1003: dbctl commit (dc=all): 'db2212 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84843 and previous config saved to /var/cache/conftool/dbconfig/20251105-074521-root.json
- 07:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P84842 and previous config saved to /var/cache/conftool/dbconfig/20251105-073833-marostegui.json
- 07:36 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 2%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84841 and previous config saved to /var/cache/conftool/dbconfig/20251105-073651-root.json
- 07:33 marostegui@cumin1003: dbctl commit (dc=all): 'db1203 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84840 and previous config saved to /var/cache/conftool/dbconfig/20251105-073347-root.json
- 07:30 marostegui@cumin1003: dbctl commit (dc=all): 'db2215 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P84839 and previous config saved to /var/cache/conftool/dbconfig/20251105-073033-root.json
- 07:30 marostegui@cumin1003: dbctl commit (dc=all): 'db2212 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84838 and previous config saved to /var/cache/conftool/dbconfig/20251105-073016-root.json
- 07:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T407997)', diff saved to https://phabricator.wikimedia.org/P84837 and previous config saved to /var/cache/conftool/dbconfig/20251105-072326-marostegui.json
- 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'es1033 (re)pooling @ 1%: Testing Debian Trixie in es2', diff saved to https://phabricator.wikimedia.org/P84836 and previous config saved to /var/cache/conftool/dbconfig/20251105-072145-root.json
- 07:18 marostegui@cumin1003: dbctl commit (dc=all): 'db1203 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84835 and previous config saved to /var/cache/conftool/dbconfig/20251105-071841-root.json
- 07:16 marostegui@cumin1003: dbctl commit (dc=all): 'Add es1033 to es2 depooled T409257 T407472', diff saved to https://phabricator.wikimedia.org/P84834 and previous config saved to /var/cache/conftool/dbconfig/20251105-071605-marostegui.json
- 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'db2215 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P84833 and previous config saved to /var/cache/conftool/dbconfig/20251105-071527-root.json
- 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'db2212 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84832 and previous config saved to /var/cache/conftool/dbconfig/20251105-071510-root.json
- 07:07 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2212 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84831 and previous config saved to /var/cache/conftool/dbconfig/20251105-070707-marostegui.json
- 07:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2212.codfw.wmnet with reason: Maintenance
- 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (T407997)', diff saved to https://phabricator.wikimedia.org/P84830 and previous config saved to /var/cache/conftool/dbconfig/20251105-070540-marostegui.json
- 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T407997)', diff saved to https://phabricator.wikimedia.org/P84828 and previous config saved to /var/cache/conftool/dbconfig/20251105-070516-marostegui.json
- 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'db1203 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84827 and previous config saved to /var/cache/conftool/dbconfig/20251105-070335-root.json
- 07:00 marostegui@cumin1003: dbctl commit (dc=all): 'db2215 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P84826 and previous config saved to /var/cache/conftool/dbconfig/20251105-070021-root.json
- 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es1034.eqiad.wmnet
- 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es1034.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 06:51 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es1034.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P84825 and previous config saved to /var/cache/conftool/dbconfig/20251105-065008-marostegui.json
- 06:48 marostegui@cumin1003: dbctl commit (dc=all): 'db1203 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84824 and previous config saved to /var/cache/conftool/dbconfig/20251105-064829-root.json
- 06:48 marostegui@cumin1003: START - Cookbook sre.dns.netbox
- 06:47 eileen: civicrm upgraded from a7c697e9 to 1eeb1a46
- 06:45 marostegui@cumin1003: dbctl commit (dc=all): 'db2215 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P84823 and previous config saved to /var/cache/conftool/dbconfig/20251105-064515-root.json
- 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts es1034.eqiad.wmnet
- 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1203 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84822 and previous config saved to /var/cache/conftool/dbconfig/20251105-064028-marostegui.json
- 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1203.eqiad.wmnet with reason: Maintenance
- 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P84821 and previous config saved to /var/cache/conftool/dbconfig/20251105-063458-marostegui.json
- 06:32 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1:00:00 on 14 hosts with reason: Primary switchover x1 T409168
- 06:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: Primary switchover x1 T409168
- 06:30 marostegui@cumin1003: dbctl commit (dc=all): 'db2215 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P84820 and previous config saved to /var/cache/conftool/dbconfig/20251105-063009-root.json
- 06:29 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2215 T409168', diff saved to https://phabricator.wikimedia.org/P84819 and previous config saved to /var/cache/conftool/dbconfig/20251105-062920-marostegui.json
- 06:29 marostegui@dns1006: END - running authdns-update
- 06:28 marostegui@dns1006: START - running authdns-update
- 06:27 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2191 to x1 primary and set section read-write T409168', diff saved to https://phabricator.wikimedia.org/P84818 and previous config saved to /var/cache/conftool/dbconfig/20251105-062745-marostegui.json
- 06:27 marostegui@cumin1003: dbctl commit (dc=all): 'Set x1 codfw as read-only for maintenance - T409168', diff saved to https://phabricator.wikimedia.org/P84817 and previous config saved to /var/cache/conftool/dbconfig/20251105-062723-marostegui.json
- 06:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: Primary switchover x1 T409168
- 06:22 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2191 with weight 0 T409168', diff saved to https://phabricator.wikimedia.org/P84816 and previous config saved to /var/cache/conftool/dbconfig/20251105-062230-marostegui.json
- 06:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T407997)', diff saved to https://phabricator.wikimedia.org/P84815 and previous config saved to /var/cache/conftool/dbconfig/20251105-061950-marostegui.json
- 06:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (T407997)', diff saved to https://phabricator.wikimedia.org/P84814 and previous config saved to /var/cache/conftool/dbconfig/20251105-061737-marostegui.json
- 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 05:14 eileen: civicrm upgraded from 53b042e5 to a7c697e9
- 04:38 eileen: civicrm upgraded from 090cd474 to 53b042e5
- 03:21 eileen: config revision changed from 553c9c90 to 18e60944
- 03:01 eileen: civicrm upgraded from 3a637a8b to 090cd474
- 02:58 tstarling@deploy2002: Finished scap sync-world: Backport for recentchanges: Fix watchlistactivity=all, i.e. seen/unseen conflict (T408167) (duration: 10m 39s)
- 02:53 tstarling@deploy2002: tstarling: Continuing with sync
- 02:50 tstarling@deploy2002: tstarling: Backport for recentchanges: Fix watchlistactivity=all, i.e. seen/unseen conflict (T408167) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 02:47 tstarling@deploy2002: Started scap sync-world: Backport for recentchanges: Fix watchlistactivity=all, i.e. seen/unseen conflict (T408167)
- 00:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T407997)', diff saved to https://phabricator.wikimedia.org/P84810 and previous config saved to /var/cache/conftool/dbconfig/20251105-000151-marostegui.json
2025-11-04
- 23:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P84809 and previous config saved to /var/cache/conftool/dbconfig/20251104-234643-marostegui.json
- 23:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P84808 and previous config saved to /var/cache/conftool/dbconfig/20251104-233135-marostegui.json
- 23:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T407997)', diff saved to https://phabricator.wikimedia.org/P84807 and previous config saved to /var/cache/conftool/dbconfig/20251104-231628-marostegui.json
- 23:08 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 23:08 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 23:07 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 23:07 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 22:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T407997)', diff saved to https://phabricator.wikimedia.org/P84806 and previous config saved to /var/cache/conftool/dbconfig/20251104-225853-marostegui.json
- 22:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2222.codfw.wmnet with reason: Maintenance
- 22:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T407997)', diff saved to https://phabricator.wikimedia.org/P84805 and previous config saved to /var/cache/conftool/dbconfig/20251104-225829-marostegui.json
- 22:53 aaron@deploy2002: Finished scap sync-world: Backport for Add a wgRestSandboxSpecs entry for wikimedia.org (math) specs (T396805) (duration: 07m 48s)
- 22:49 aaron@deploy2002: aaron: Continuing with sync
- 22:47 aaron@deploy2002: aaron: Backport for Add a wgRestSandboxSpecs entry for wikimedia.org (math) specs (T396805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:45 aaron@deploy2002: Started scap sync-world: Backport for Add a wgRestSandboxSpecs entry for wikimedia.org (math) specs (T396805)
- 22:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P84804 and previous config saved to /var/cache/conftool/dbconfig/20251104-224321-marostegui.json
- 22:39 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
- 22:39 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
- 22:38 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
- 22:38 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
- 22:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P84803 and previous config saved to /var/cache/conftool/dbconfig/20251104-222814-marostegui.json
- 22:24 eileen: civicrm upgraded from ee0b5d3c to 3a637a8b
- 22:19 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1201731 T405808 (duration: 05m 39s)
- 22:14 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1201731 T405808
- 22:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T407997)', diff saved to https://phabricator.wikimedia.org/P84802 and previous config saved to /var/cache/conftool/dbconfig/20251104-221306-marostegui.json
- 21:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T407997)', diff saved to https://phabricator.wikimedia.org/P84801 and previous config saved to /var/cache/conftool/dbconfig/20251104-215649-marostegui.json
- 21:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2221.codfw.wmnet with reason: Maintenance
- 21:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T407997)', diff saved to https://phabricator.wikimedia.org/P84800 and previous config saved to /var/cache/conftool/dbconfig/20251104-215625-marostegui.json
- 21:48 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 21:48 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 21:43 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 21:43 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 21:43 eileen: civicrm upgraded from 2e7879c3 to ee0b5d3c
- 21:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P84799 and previous config saved to /var/cache/conftool/dbconfig/20251104-214117-marostegui.json
- 21:28 bvibber@deploy2002: Finished scap sync-world: Backport for cirrus: Start near match A/B test (T408154) (duration: 07m 53s)
- 21:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P84798 and previous config saved to /var/cache/conftool/dbconfig/20251104-212609-marostegui.json
- 21:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2009.codfw.wmnet with OS trixie
- 21:26 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 21:25 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 21:24 bvibber@deploy2002: bvibber, ebernhardson: Continuing with sync
- 21:24 bvibber@deploy2002: bvibber, ebernhardson: Backport for cirrus: Start near match A/B test (T408154) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:20 bvibber@deploy2002: Started scap sync-world: Backport for cirrus: Start near match A/B test (T408154)
- 21:18 bvibber@deploy2002: Finished scap sync-world: Backport for Guard against some null dereferences in CroppedImage (T409123 T409126), Guard against some null dereferences in CroppedImage (T409123 T409126) (duration: 11m 23s)
- 21:12 bvibber@deploy2002: bvibber: Continuing with sync
- 21:11 bvibber@deploy2002: bvibber: Backport for Guard against some null dereferences in CroppedImage (T409123 T409126), Guard against some null dereferences in CroppedImage (T409123 T409126) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T407997)', diff saved to https://phabricator.wikimedia.org/P84797 and previous config saved to /var/cache/conftool/dbconfig/20251104-211102-marostegui.json
- 21:08 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
- 21:07 bvibber@deploy2002: Started scap sync-world: Backport for Guard against some null dereferences in CroppedImage (T409123 T409126), Guard against some null dereferences in CroppedImage (T409123 T409126)
- 21:05 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
- 20:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2220 (T407997)', diff saved to https://phabricator.wikimedia.org/P84796 and previous config saved to /var/cache/conftool/dbconfig/20251104-205433-marostegui.json
- 20:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2220.codfw.wmnet with reason: Maintenance
- 20:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T407997)', diff saved to https://phabricator.wikimedia.org/P84795 and previous config saved to /var/cache/conftool/dbconfig/20251104-205420-marostegui.json
- 20:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
- 20:51 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2009.codfw.wmnet with OS trixie
- 20:43 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 20:43 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 20:41 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 20:41 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 20:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P84794 and previous config saved to /var/cache/conftool/dbconfig/20251104-203912-marostegui.json
- 20:34 eileen: civicrm upgraded from 77cad331 to 2e7879c3
- 20:29 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.1 refs T408271
- 20:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P84793 and previous config saved to /var/cache/conftool/dbconfig/20251104-202405-marostegui.json
- 20:17 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
- 20:16 eevans@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
- 20:13 jhuneidi@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.1 refs T408271 (duration: 12m 07s)
- 20:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T407997)', diff saved to https://phabricator.wikimedia.org/P84792 and previous config saved to /var/cache/conftool/dbconfig/20251104-200857-marostegui.json
- 20:01 jhuneidi@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.1 refs T408271
- 19:57 brett: import ncmonitor 3.0.0 into bookworm-wikimedia
- 19:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T407997)', diff saved to https://phabricator.wikimedia.org/P84791 and previous config saved to /var/cache/conftool/dbconfig/20251104-195203-marostegui.json
- 19:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2208.codfw.wmnet with reason: Maintenance
- 19:48 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ncmonitor1001.eqiad.wmnet
- 19:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance
- 19:34 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
- 19:27 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
- 19:24 brett: import ncmonitor 3.0.0 into bookworm-wikimedia
- 19:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2198.codfw.wmnet with reason: Maintenance
- 19:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T407997)', diff saved to https://phabricator.wikimedia.org/P84790 and previous config saved to /var/cache/conftool/dbconfig/20251104-192142-marostegui.json
- 19:13 jhuneidi@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.1 refs T408271
- 19:09 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
- 19:09 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1220 (T403362)', diff saved to https://phabricator.wikimedia.org/P84789 and previous config saved to /var/cache/conftool/dbconfig/20251104-190946-ladsgroup.json
- 19:09 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
- 19:09 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
- 19:08 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
- 19:08 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
- 19:07 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
- 19:06 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
- 19:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P84788 and previous config saved to /var/cache/conftool/dbconfig/20251104-190634-marostegui.json
- 19:06 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
- 19:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
- 19:05 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
- 19:05 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
- 19:04 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 19:04 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 19:03 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 19:03 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 18:55 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 18:55 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 18:52 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 18:51 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 18:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P84785 and previous config saved to /var/cache/conftool/dbconfig/20251104-185126-marostegui.json
- 18:51 dancy@deploy2002: Installation of scap version "4.222.0" completed for 2 hosts
- 18:51 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 18:50 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 18:49 dancy@deploy2002: Installing scap version "4.222.0" for 2 host(s)
- 18:48 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 18:48 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 18:40 swfrench@deploy2002: Finished scap sync-world: Fully migrate mw-(api-int|jobrunner) to 8.3 - T405955 (duration: 07m 49s)
- 18:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T407997)', diff saved to https://phabricator.wikimedia.org/P84784 and previous config saved to /var/cache/conftool/dbconfig/20251104-183619-marostegui.json
- 18:32 swfrench@deploy2002: Started scap sync-world: Fully migrate mw-(api-int|jobrunner) to 8.3 - T405955
- 18:21 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudweb2002-dev.wikimedia.org with OS trixie
- 18:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T407997)', diff saved to https://phabricator.wikimedia.org/P84783 and previous config saved to /var/cache/conftool/dbconfig/20251104-181648-marostegui.json
- 18:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 18:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T407997)', diff saved to https://phabricator.wikimedia.org/P84782 and previous config saved to /var/cache/conftool/dbconfig/20251104-181623-marostegui.json
- 18:12 fnegri@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database thwikimedia (T409201)
- 18:09 jhuneidi@deploy2002: sync-world aborted: testwikis to 1.46.0-wmf.1 refs T408271 (duration: 03m 00s)
- 18:06 jhuneidi@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.1 refs T408271
- 18:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P84781 and previous config saved to /var/cache/conftool/dbconfig/20251104-180116-marostegui.json
- 17:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P84780 and previous config saved to /var/cache/conftool/dbconfig/20251104-174608-marostegui.json
- 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T407997)', diff saved to https://phabricator.wikimedia.org/P84779 and previous config saved to /var/cache/conftool/dbconfig/20251104-173100-marostegui.json
- 17:26 fnegri@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database tokwiki (T404570)
- 17:26 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database tokwiki (T404570)
- 17:24 fnegri@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database tokwiki (T404566)
- 17:24 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database tokwiki (T404566)
- 17:24 fnegri@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database tokwiki (T404703)
- 17:23 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database tokwiki (T404703)
- 17:23 fnegri@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database mswikiquote (T404703)
- 17:23 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database mswikiquote (T404703)
- 17:16 wfan: donorwiki upgraded from 09caf170 to 8fe00530
- 17:15 wfan: payments-wiki upgraded from 0132998e to 8fe00530
- 17:15 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database thwikimedia (T409201)
- 17:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T407997)', diff saved to https://phabricator.wikimedia.org/P84778 and previous config saved to /var/cache/conftool/dbconfig/20251104-171333-marostegui.json
- 17:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 17:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T407997)', diff saved to https://phabricator.wikimedia.org/P84777 and previous config saved to /var/cache/conftool/dbconfig/20251104-171320-marostegui.json
- 16:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P84776 and previous config saved to /var/cache/conftool/dbconfig/20251104-165812-marostegui.json
- 16:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P84775 and previous config saved to /var/cache/conftool/dbconfig/20251104-164304-marostegui.json
- 16:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T407997)', diff saved to https://phabricator.wikimedia.org/P84774 and previous config saved to /var/cache/conftool/dbconfig/20251104-162754-marostegui.json
- 16:27 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-eqiad
- 16:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 16:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 16:18 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 16:17 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 16:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (T407997)', diff saved to https://phabricator.wikimedia.org/P84773 and previous config saved to /var/cache/conftool/dbconfig/20251104-161027-marostegui.json
- 16:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 16:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T407997)', diff saved to https://phabricator.wikimedia.org/P84772 and previous config saved to /var/cache/conftool/dbconfig/20251104-161003-marostegui.json
- 16:08 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-eqiad
- 16:06 brennen@deploy2002: Finished deploy [phabricator/deployment@e9011f3]: deploy phab1004 for T409193 (duration: 02m 29s)
- 16:04 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
- 16:03 brennen@deploy2002: Started deploy [phabricator/deployment@e9011f3]: deploy phab1004 for T409193
- 16:03 brennen@deploy2002: Finished deploy [phabricator/deployment@e9011f3]: deploy phab2002 for T409193 (duration: 00m 31s)
- 16:03 brennen@deploy2002: Started deploy [phabricator/deployment@e9011f3]: deploy phab2002 for T409193
- 16:02 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
- 15:59 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:56 jhancock@cumin1003: START - Cookbook sre.dns.netbox
- 15:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P84771 and previous config saved to /var/cache/conftool/dbconfig/20251104-155455-marostegui.json
- 15:49 topranks: upgrade lsw1-c3-eqiad and lsw1-d3-eqiad to SR-Linux v24.10.4
- 15:47 marostegui@cumin1003: dbctl commit (dc=all): 'db2213 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P84770 and previous config saved to /var/cache/conftool/dbconfig/20251104-154755-root.json
- 15:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P84768 and previous config saved to /var/cache/conftool/dbconfig/20251104-153948-marostegui.json
- 15:39 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be1088.eqiad.wmnet with OS trixie
- 15:39 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:36 jhancock@cumin1003: START - Cookbook sre.dns.netbox
- 15:33 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudweb2002-dev.wikimedia.org with reason: host reimage
- 15:32 marostegui@cumin1003: dbctl commit (dc=all): 'db2213 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P84767 and previous config saved to /var/cache/conftool/dbconfig/20251104-153249-root.json
- 15:30 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudweb2002-dev.wikimedia.org with reason: host reimage
- 15:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T407997)', diff saved to https://phabricator.wikimedia.org/P84766 and previous config saved to /var/cache/conftool/dbconfig/20251104-152440-marostegui.json
- 15:18 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-codfw
- 15:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'db2213 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P84765 and previous config saved to /var/cache/conftool/dbconfig/20251104-151744-root.json
- 15:13 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 15:13 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudweb2002-dev.wikimedia.org with OS trixie
- 15:06 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2150 (T407997)', diff saved to https://phabricator.wikimedia.org/P84764 and previous config saved to /var/cache/conftool/dbconfig/20251104-150623-marostegui.json
- 15:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 15:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2192.codfw.wmnet with reason: Maintenance
- 15:01 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-codfw
- 15:00 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS trixie
- 14:58 fceratto@cumin1002: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test2001.codfw.wmnet
- 14:58 fceratto@cumin1002: END (ERROR) - Cookbook sre.mysql.clone (exit_code=97) of db2230.codfw.wmnet onto db-test2001.codfw.wmnet
- 14:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P84763 and previous config saved to /var/cache/conftool/dbconfig/20251104-145506-marostegui.json
- 14:48 Lucas_WMDE: lucaswerkmeister-wmde@deploy2002 $ printf 'https://en.wikipedia.org/static/images/mobile/copyright/wiktionary-%s-az.svg\n' tagline wordmark | mwscript-k8s --comment='T408147' --attach -- purgeList enwiki
- 14:42 Lucas_WMDE: UTC afternoon backport+config window done
- 14:41 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for azwiktionary: use new wordmark and tagline (T408147), Remove wmgULSPosition for special wikis (T400067) (duration: 09m 33s)
- 14:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T407997)', diff saved to https://phabricator.wikimedia.org/P84762 and previous config saved to /var/cache/conftool/dbconfig/20251104-143958-marostegui.json
- 14:37 lucaswerkmeister-wmde@deploy2002: ekrem, lucaswerkmeister-wmde, abi: Continuing with sync
- 14:37 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2007-dev.codfw.wmnet with OS trixie
- 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2213 (T407997)', diff saved to https://phabricator.wikimedia.org/P84761 and previous config saved to /var/cache/conftool/dbconfig/20251104-143546-marostegui.json
- 14:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2213.codfw.wmnet with reason: Maintenance
- 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T407997)', diff saved to https://phabricator.wikimedia.org/P84760 and previous config saved to /var/cache/conftool/dbconfig/20251104-143519-marostegui.json
- 14:34 lucaswerkmeister-wmde@deploy2002: ekrem, lucaswerkmeister-wmde, abi: Backport for azwiktionary: use new wordmark and tagline (T408147), Remove wmgULSPosition for special wikis (T400067) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:32 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for azwiktionary: use new wordmark and tagline (T408147), Remove wmgULSPosition for special wikis (T400067)
- 14:31 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 14:31 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 14:31 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 14:30 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 14:30 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 14:30 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 14:27 fceratto@cumin1002: START - Cookbook sre.mysql.clone of db2230.codfw.wmnet onto db-test2001.codfw.wmnet
- 14:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P84759 and previous config saved to /var/cache/conftool/dbconfig/20251104-142010-marostegui.json
- 14:18 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2007-dev.codfw.wmnet with reason: host reimage
- 14:15 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2007-dev.codfw.wmnet with reason: host reimage
- 14:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P84758 and previous config saved to /var/cache/conftool/dbconfig/20251104-140503-marostegui.json
- 13:57 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudnet2007-dev.codfw.wmnet with OS trixie
- 13:53 topranks: downgrade lsw1-c3-eqiad to SR-Linux v24.7.2
- 13:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T407997)', diff saved to https://phabricator.wikimedia.org/P84757 and previous config saved to /var/cache/conftool/dbconfig/20251104-134955-marostegui.json
- 13:45 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T407997)', diff saved to https://phabricator.wikimedia.org/P84756 and previous config saved to /var/cache/conftool/dbconfig/20251104-134545-marostegui.json
- 13:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2211.codfw.wmnet with reason: Maintenance
- 13:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2201.codfw.wmnet with reason: Maintenance
- 13:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T407997)', diff saved to https://phabricator.wikimedia.org/P84755 and previous config saved to /var/cache/conftool/dbconfig/20251104-134314-marostegui.json
- 13:41 moritzm: installing tiff security updates
- 13:35 marostegui@cumin1003: dbctl commit (dc=all): 'db1220 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P84754 and previous config saved to /var/cache/conftool/dbconfig/20251104-133526-root.json
- 13:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P84753 and previous config saved to /var/cache/conftool/dbconfig/20251104-132804-marostegui.json
- 13:20 marostegui@cumin1003: dbctl commit (dc=all): 'db1220 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P84752 and previous config saved to /var/cache/conftool/dbconfig/20251104-132019-root.json
- 13:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P84750 and previous config saved to /var/cache/conftool/dbconfig/20251104-131254-marostegui.json
- 13:05 marostegui@cumin1003: dbctl commit (dc=all): 'db1220 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P84749 and previous config saved to /var/cache/conftool/dbconfig/20251104-130512-root.json
- 12:59 topranks: downgrade lsw1-d3-eqiad to SR-Linux v24.10.1
- 12:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T407997)', diff saved to https://phabricator.wikimedia.org/P84748 and previous config saved to /var/cache/conftool/dbconfig/20251104-125745-marostegui.json
- 12:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2178 (T407997)', diff saved to https://phabricator.wikimedia.org/P84747 and previous config saved to /var/cache/conftool/dbconfig/20251104-125359-marostegui.json
- 12:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 12:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T407997)', diff saved to https://phabricator.wikimedia.org/P84746 and previous config saved to /var/cache/conftool/dbconfig/20251104-125335-marostegui.json
- 12:50 marostegui@cumin1003: dbctl commit (dc=all): 'db1220 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P84745 and previous config saved to /var/cache/conftool/dbconfig/20251104-125005-root.json
- 12:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1220 T409167', diff saved to https://phabricator.wikimedia.org/P84744 and previous config saved to /var/cache/conftool/dbconfig/20251104-124836-marostegui.json
- 12:48 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1237 to x1 primary T409167', diff saved to https://phabricator.wikimedia.org/P84743 and previous config saved to /var/cache/conftool/dbconfig/20251104-124803-marostegui.json
- 12:47 marostegui: Starting x1 eqiad failover from db1220 to db1237 - T409167
- 12:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: Primary switchover x1 T409167
- 12:45 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1237 with weight 0 T409167', diff saved to https://phabricator.wikimedia.org/P84742 and previous config saved to /var/cache/conftool/dbconfig/20251104-124556-marostegui.json
- 12:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P84741 and previous config saved to /var/cache/conftool/dbconfig/20251104-123827-marostegui.json
- 12:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P84740 and previous config saved to /var/cache/conftool/dbconfig/20251104-122320-marostegui.json
- 12:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T407997)', diff saved to https://phabricator.wikimedia.org/P84739 and previous config saved to /var/cache/conftool/dbconfig/20251104-120812-marostegui.json
- 12:08 fabfur: re-enable puppet on A:cp (T408060)
- 12:04 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2171 (T407997)', diff saved to https://phabricator.wikimedia.org/P84737 and previous config saved to /var/cache/conftool/dbconfig/20251104-120401-marostegui.json
- 12:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 12:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T407997)', diff saved to https://phabricator.wikimedia.org/P84736 and previous config saved to /var/cache/conftool/dbconfig/20251104-120338-marostegui.json
- 12:00 topranks: upgrade lsw1-d3-eqiad to SR-Linux v24.10.3
- 11:57 fabfur: temporarily disable puppet on A:cp to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1199247 (T408060)
- 11:52 marostegui@cumin1003: dbctl commit (dc=all): 'db1192 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84735 and previous config saved to /var/cache/conftool/dbconfig/20251104-115217-root.json
- 11:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P84734 and previous config saved to /var/cache/conftool/dbconfig/20251104-114830-marostegui.json
- 11:47 marostegui@cumin1003: dbctl commit (dc=all): 'db2216 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84733 and previous config saved to /var/cache/conftool/dbconfig/20251104-114712-root.json
- 11:38 moritzm: installing Java 8 security updates on Bullseye
- 11:37 marostegui@cumin1003: dbctl commit (dc=all): 'db1192 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84732 and previous config saved to /var/cache/conftool/dbconfig/20251104-113711-root.json
- 11:33 hashar: Upgrading and restarting CI Jenkins | T404856
- 11:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P84731 and previous config saved to /var/cache/conftool/dbconfig/20251104-113322-marostegui.json
- 11:32 marostegui@cumin1003: dbctl commit (dc=all): 'db2216 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84730 and previous config saved to /var/cache/conftool/dbconfig/20251104-113205-root.json
- 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'db1192 (re)pooling @ 50%: 10', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20251104-112201-root.json
- 11:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T407997)', diff saved to https://phabricator.wikimedia.org/P84728 and previous config saved to /var/cache/conftool/dbconfig/20251104-111814-marostegui.json
- 11:16 marostegui@cumin1003: dbctl commit (dc=all): 'db2216 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84727 and previous config saved to /var/cache/conftool/dbconfig/20251104-111658-root.json
- 11:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T407997)', diff saved to https://phabricator.wikimedia.org/P84726 and previous config saved to /var/cache/conftool/dbconfig/20251104-111401-marostegui.json
- 11:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 11:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
- 11:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 11:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 11:06 marostegui@cumin1003: dbctl commit (dc=all): 'db1192 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84725 and previous config saved to /var/cache/conftool/dbconfig/20251104-110655-root.json
- 11:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T407997)', diff saved to https://phabricator.wikimedia.org/P84724 and previous config saved to /var/cache/conftool/dbconfig/20251104-110643-marostegui.json
- 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'db2216 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84723 and previous config saved to /var/cache/conftool/dbconfig/20251104-110152-root.json
- 10:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1192 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84722 and previous config saved to /var/cache/conftool/dbconfig/20251104-105851-marostegui.json
- 10:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1192.eqiad.wmnet with reason: Maintenance
- 10:54 moritzm: uploaded openjdk-8 8u472-ga-1~deb11u1 to apt.wikimedia.org (forward port of latest Java 8 security updates)
- 10:53 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2216 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84721 and previous config saved to /var/cache/conftool/dbconfig/20251104-105339-marostegui.json
- 10:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2216.codfw.wmnet with reason: Maintenance
- 10:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P84720 and previous config saved to /var/cache/conftool/dbconfig/20251104-105136-marostegui.json
- 10:42 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
- 10:40 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:40 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:40 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:40 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:39 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:39 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P84719 and previous config saved to /var/cache/conftool/dbconfig/20251104-103629-marostegui.json
- 10:25 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
- 10:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T407997)', diff saved to https://phabricator.wikimedia.org/P84718 and previous config saved to /var/cache/conftool/dbconfig/20251104-102121-marostegui.json
- 10:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1230 (T407997)', diff saved to https://phabricator.wikimedia.org/P84717 and previous config saved to /var/cache/conftool/dbconfig/20251104-101845-marostegui.json
- 10:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 10:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T407997)', diff saved to https://phabricator.wikimedia.org/P84716 and previous config saved to /var/cache/conftool/dbconfig/20251104-101713-marostegui.json
- 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P84715 and previous config saved to /var/cache/conftool/dbconfig/20251104-100206-marostegui.json
- 10:01 btullis@deploy2002: Finished deploy [analytics/hdfs-tools/deploy@bb26b34]: Deploying after updating targets (duration: 00m 24s)
- 10:01 btullis@deploy2002: Started deploy [analytics/hdfs-tools/deploy@bb26b34]: Deploying after updating targets
- 09:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P84714 and previous config saved to /var/cache/conftool/dbconfig/20251104-094658-marostegui.json
- 09:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T407997)', diff saved to https://phabricator.wikimedia.org/P84713 and previous config saved to /var/cache/conftool/dbconfig/20251104-093148-marostegui.json
- 09:29 ozge@deploy2002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1207 (T407997)', diff saved to https://phabricator.wikimedia.org/P84712 and previous config saved to /var/cache/conftool/dbconfig/20251104-092913-marostegui.json
- 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 09:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T407997)', diff saved to https://phabricator.wikimedia.org/P84711 and previous config saved to /var/cache/conftool/dbconfig/20251104-092850-marostegui.json
- 09:28 ozge@deploy2002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 09:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P84710 and previous config saved to /var/cache/conftool/dbconfig/20251104-091342-marostegui.json
- 08:59 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS bookworm
- 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P84709 and previous config saved to /var/cache/conftool/dbconfig/20251104-085834-marostegui.json
- 08:55 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS bookworm
- 08:54 moritzm: installing squid security updates
- 08:54 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:53 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:53 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:50 marostegui@cumin1003: dbctl commit (dc=all): 'db1178 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84708 and previous config saved to /var/cache/conftool/dbconfig/20251104-085043-root.json
- 08:47 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T407997)', diff saved to https://phabricator.wikimedia.org/P84707 and previous config saved to /var/cache/conftool/dbconfig/20251104-084327-marostegui.json
- 08:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1200 (T407997)', diff saved to https://phabricator.wikimedia.org/P84706 and previous config saved to /var/cache/conftool/dbconfig/20251104-084056-marostegui.json
- 08:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 08:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T407997)', diff saved to https://phabricator.wikimedia.org/P84705 and previous config saved to /var/cache/conftool/dbconfig/20251104-084032-marostegui.json
- 08:35 marostegui@cumin1003: dbctl commit (dc=all): 'db1178 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84704 and previous config saved to /var/cache/conftool/dbconfig/20251104-083538-root.json
- 08:29 dcausse: UTC morning backport window done
- 08:29 dcausse@deploy2002: Finished scap sync-world: Backport for Revert^3 "cirrus: enable completion search with defaultsort A/B test" (duration: 09m 20s)
- 08:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P84703 and previous config saved to /var/cache/conftool/dbconfig/20251104-082525-marostegui.json
- 08:24 dcausse@deploy2002: dcausse: Continuing with sync
- 08:22 marostegui@cumin1003: dbctl commit (dc=all): 'db2176 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84702 and previous config saved to /var/cache/conftool/dbconfig/20251104-082226-root.json
- 08:21 dcausse@deploy2002: dcausse: Backport for Revert^3 "cirrus: enable completion search with defaultsort A/B test" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:20 marostegui@cumin1003: dbctl commit (dc=all): 'db1178 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84701 and previous config saved to /var/cache/conftool/dbconfig/20251104-082031-root.json
- 08:19 dcausse@deploy2002: Started scap sync-world: Backport for Revert^3 "cirrus: enable completion search with defaultsort A/B test"
- 08:14 tchanders@deploy2002: Finished scap sync-world: Backport for Deploy temporary accounts to enwiki (T409079) (duration: 12m 22s)
- 08:10 ozge@deploy2002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 08:10 tchanders@deploy2002: tchanders, stran: Continuing with sync
- 08:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P84700 and previous config saved to /var/cache/conftool/dbconfig/20251104-081017-marostegui.json
- 08:08 ozge@deploy2002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'db2176 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84699 and previous config saved to /var/cache/conftool/dbconfig/20251104-080719-root.json
- 08:05 marostegui@cumin1003: dbctl commit (dc=all): 'db1178 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84698 and previous config saved to /var/cache/conftool/dbconfig/20251104-080522-root.json
- 08:04 tchanders@deploy2002: tchanders, stran: Backport for Deploy temporary accounts to enwiki (T409079) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:02 tchanders@deploy2002: Started scap sync-world: Backport for Deploy temporary accounts to enwiki (T409079)
- 08:02 ozge@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 08:00 ozge@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 07:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1178 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84697 and previous config saved to /var/cache/conftool/dbconfig/20251104-075718-marostegui.json
- 07:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1178.eqiad.wmnet with reason: Maintenance
- 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T407997)', diff saved to https://phabricator.wikimedia.org/P84696 and previous config saved to /var/cache/conftool/dbconfig/20251104-075510-marostegui.json
- 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1185 (T407997)', diff saved to https://phabricator.wikimedia.org/P84695 and previous config saved to /var/cache/conftool/dbconfig/20251104-075239-marostegui.json
- 07:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'db2176 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84694 and previous config saved to /var/cache/conftool/dbconfig/20251104-075213-root.json
- 07:48 ozge@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 07:47 ozge@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'db2176 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84693 and previous config saved to /var/cache/conftool/dbconfig/20251104-073707-root.json
- 07:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2176 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84692 and previous config saved to /var/cache/conftool/dbconfig/20251104-072854-marostegui.json
- 07:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 07:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P84691 and previous config saved to /var/cache/conftool/dbconfig/20251104-072201-marostegui.json
- 07:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T407997)', diff saved to https://phabricator.wikimedia.org/P84690 and previous config saved to /var/cache/conftool/dbconfig/20251104-070653-marostegui.json
- 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (T407997)', diff saved to https://phabricator.wikimedia.org/P84689 and previous config saved to /var/cache/conftool/dbconfig/20251104-070356-marostegui.json
- 07:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 07:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T407997)', diff saved to https://phabricator.wikimedia.org/P84688 and previous config saved to /var/cache/conftool/dbconfig/20251104-070311-marostegui.json
- 06:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P84687 and previous config saved to /var/cache/conftool/dbconfig/20251104-064803-marostegui.json
- 06:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P84686 and previous config saved to /var/cache/conftool/dbconfig/20251104-063253-marostegui.json
- 06:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T407997)', diff saved to https://phabricator.wikimedia.org/P84685 and previous config saved to /var/cache/conftool/dbconfig/20251104-061745-marostegui.json
- 06:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T407997)', diff saved to https://phabricator.wikimedia.org/P84684 and previous config saved to /var/cache/conftool/dbconfig/20251104-061449-marostegui.json
- 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1159.eqiad.wmnet with reason: Maintenance
- 06:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2204.codfw.wmnet with reason: Maintenance
- 05:02 mwpresync@deploy2002: Pruned MediaWiki: 1.45.0-wmf.23 (duration: 02m 28s)
- 04:51 eileen: civicrm upgraded from c9f9d2b5 to 77cad331
- 03:03 inflatador: bking@cumin2002 restart wdqs-blazegraph.service in CODFW to apply 1201326 T409132
- 02:30 eileen: civicrm upgraded from 1c0619b6 to c9f9d2b5
- 00:58 eileen: civicrm upgraded from 025f3ef3 to 1c0619b6
- 00:32 zabe@deploy2002: Finished scap sync-world: Backport for Using Hadoop for MostTranscludedPages on enwiki (T309738) (duration: 09m 05s)
- 00:26 zabe@deploy2002: zabe: Continuing with sync
- 00:25 zabe@deploy2002: zabe: Backport for Using Hadoop for MostTranscludedPages on enwiki (T309738) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:23 zabe@deploy2002: Started scap sync-world: Backport for Using Hadoop for MostTranscludedPages on enwiki (T309738)
- 00:10 cdanis@dns1004: END - running authdns-update
- 00:09 cdanis@dns1004: START - running authdns-update
- 00:05 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
- 00:04 dzahn@dns1004: END - running authdns-update
- 00:03 dzahn@dns1004: START - running authdns-update
2025-11-03
- 23:40 eileen: civicrm upgraded from b0c68b4a to 025f3ef3
- 23:01 inflatador: bking@cumin2002 repool wdqs2008 and 2012
- 22:56 inflatador: bking@cumin2002 depool wdqs2008 and 2012 so they can catch up on lag
- 22:54 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
- 22:54 ryankemper@cumin2002: END (ERROR) - Cookbook sre.wdqs.restart (exit_code=97)
- 22:54 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
- 22:51 ryankemper: [WDQS] Restarting all codfw wdqs-main hosts; we're getting slammed by increased triple count (same issue we've been seeing intermittently for a week or two)
- 22:28 eileen: civicrm upgraded from 29d3c24f to b0c68b4a
- 22:16 arlolra@deploy2002: Finished scap sync-world: Backport for Deploy Parsoid Read Views to 7 wikis (T408765) (duration: 08m 01s)
- 22:11 arlolra@deploy2002: arlolra: Continuing with sync
- 22:10 arlolra@deploy2002: arlolra: Backport for Deploy Parsoid Read Views to 7 wikis (T408765) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:08 arlolra@deploy2002: Started scap sync-world: Backport for Deploy Parsoid Read Views to 7 wikis (T408765)
- 22:07 inflatador: bking@cumin2002 suppress wdqs2009 alerts for next 90 days T409117
- 22:06 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 90 days, 0:00:00 on wdqs2009.codfw.wmnet with reason: no SLO for this endpoint
- 22:01 arlolra@deploy2002: Finished scap sync-world: Backport for [enwikivoyage] Enable block feature for AbuseFilter (T408885), zhwiki: Add SecurePoll Rights to CheckUser (T408902) (duration: 07m 05s)
- 21:56 arlolra@deploy2002: superpes, zhaofjx, arlolra: Continuing with sync
- 21:56 arlolra@deploy2002: superpes, zhaofjx, arlolra: Backport for [enwikivoyage] Enable block feature for AbuseFilter (T408885), zhwiki: Add SecurePoll Rights to CheckUser (T408902) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:54 arlolra@deploy2002: Started scap sync-world: Backport for [enwikivoyage] Enable block feature for AbuseFilter (T408885), zhwiki: Add SecurePoll Rights to CheckUser (T408902)
- 21:46 kemayo@deploy2002: Finished scap sync-world: Backport for Edit check: allow MWVE_FORCE_EDIT_CHECK_ENABLED to override ecenable (T408890) (duration: 09m 21s)
- 21:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T407997)', diff saved to https://phabricator.wikimedia.org/P84683 and previous config saved to /var/cache/conftool/dbconfig/20251103-214610-marostegui.json
- 21:42 kemayo@deploy2002: kemayo: Continuing with sync
- 21:39 kemayo@deploy2002: kemayo: Backport for Edit check: allow MWVE_FORCE_EDIT_CHECK_ENABLED to override ecenable (T408890) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:37 kemayo@deploy2002: Started scap sync-world: Backport for Edit check: allow MWVE_FORCE_EDIT_CHECK_ENABLED to override ecenable (T408890)
- 21:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P84682 and previous config saved to /var/cache/conftool/dbconfig/20251103-213102-marostegui.json
- 21:30 eileen: civicrm upgraded from 443ec62e to 29d3c24f
- 21:25 aaron@deploy2002: Finished scap sync-world: Backport for Set wgRestSandboxSpecs['wmf-restbase'] to use the static specs everywhere (T396805) (duration: 07m 31s)
- 21:21 aaron@deploy2002: aaron: Continuing with sync
- 21:20 aaron@deploy2002: aaron: Backport for Set wgRestSandboxSpecs['wmf-restbase'] to use the static specs everywhere (T396805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:18 aaron@deploy2002: Started scap sync-world: Backport for Set wgRestSandboxSpecs['wmf-restbase'] to use the static specs everywhere (T396805)
- 21:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P84681 and previous config saved to /var/cache/conftool/dbconfig/20251103-211552-marostegui.json
- 21:15 aaron@deploy2002: Finished scap sync-world: Backport for Set wgRestSandboxSpecs['wmf-restbase'] on testwiki to use the static specs (T396805) (duration: 07m 16s)
- 21:14 eileen: civicrm upgraded from 443ec62e to 29d3c24f
- 21:11 aaron@deploy2002: aaron: Continuing with sync
- 21:10 aaron@deploy2002: aaron: Backport for Set wgRestSandboxSpecs['wmf-restbase'] on testwiki to use the static specs (T396805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:08 aaron@deploy2002: Started scap sync-world: Backport for Set wgRestSandboxSpecs['wmf-restbase'] on testwiki to use the static specs (T396805)
- 21:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T407997)', diff saved to https://phabricator.wikimedia.org/P84680 and previous config saved to /var/cache/conftool/dbconfig/20251103-210044-marostegui.json
- 20:54 eileen: civicrm upgraded from 66c0e233 to 443ec62e
- 20:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2238 (T407997)', diff saved to https://phabricator.wikimedia.org/P84679 and previous config saved to /var/cache/conftool/dbconfig/20251103-204844-marostegui.json
- 20:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2238.codfw.wmnet with reason: Maintenance
- 20:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T407997)', diff saved to https://phabricator.wikimedia.org/P84678 and previous config saved to /var/cache/conftool/dbconfig/20251103-204820-marostegui.json
- 20:39 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 20:39 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 20:39 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 20:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 20:38 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 20:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 20:38 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 20:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 20:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P84677 and previous config saved to /var/cache/conftool/dbconfig/20251103-203312-marostegui.json
- 20:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 20:31 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 20:31 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 20:31 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 20:27 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 20:26 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 20:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P84676 and previous config saved to /var/cache/conftool/dbconfig/20251103-201803-marostegui.json
- 20:17 kharlan@deploy2002: Finished scap sync-world: Backport for Hooks: Fetch correct SimpleCaptcha instance in onEditPage__attemptSave_after (T408975) (duration: 07m 22s)
- 20:17 eileen: civicrm upgraded from ed25fa88 to 66c0e233
- 20:13 kharlan@deploy2002: kharlan: Continuing with sync
- 20:12 kharlan@deploy2002: kharlan: Backport for Hooks: Fetch correct SimpleCaptcha instance in onEditPage__attemptSave_after (T408975) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:10 kharlan@deploy2002: Started scap sync-world: Backport for Hooks: Fetch correct SimpleCaptcha instance in onEditPage__attemptSave_after (T408975)
- 20:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T407997)', diff saved to https://phabricator.wikimedia.org/P84675 and previous config saved to /var/cache/conftool/dbconfig/20251103-200255-marostegui.json
- 20:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2226 (T407997)', diff saved to https://phabricator.wikimedia.org/P84674 and previous config saved to /var/cache/conftool/dbconfig/20251103-200030-marostegui.json
- 20:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2226.codfw.wmnet with reason: Maintenance
- 20:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T407997)', diff saved to https://phabricator.wikimedia.org/P84673 and previous config saved to /var/cache/conftool/dbconfig/20251103-200006-marostegui.json
- 19:58 kharlan@deploy2002: Finished scap sync-world: Backport for SimpleCaptcha: Ensure correct instance is used on page creation (T408975) (duration: 07m 22s)
- 19:53 kharlan@deploy2002: kharlan: Continuing with sync
- 19:52 kharlan@deploy2002: kharlan: Backport for SimpleCaptcha: Ensure correct instance is used on page creation (T408975) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:50 kharlan@deploy2002: Started scap sync-world: Backport for SimpleCaptcha: Ensure correct instance is used on page creation (T408975)
- 19:45 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 19:45 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P84672 and previous config saved to /var/cache/conftool/dbconfig/20251103-194457-marostegui.json
- 19:37 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: use ve.newTarget hook to avoid globals (T408670) (duration: 07m 47s)
- 19:32 kharlan@deploy2002: kharlan: Continuing with sync
- 19:31 kharlan@deploy2002: kharlan: Backport for hCaptcha: use ve.newTarget hook to avoid globals (T408670) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P84670 and previous config saved to /var/cache/conftool/dbconfig/20251103-192950-marostegui.json
- 19:29 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: use ve.newTarget hook to avoid globals (T408670)
- 19:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T407997)', diff saved to https://phabricator.wikimedia.org/P84669 and previous config saved to /var/cache/conftool/dbconfig/20251103-191442-marostegui.json
- 19:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2225 (T407997)', diff saved to https://phabricator.wikimedia.org/P84668 and previous config saved to /var/cache/conftool/dbconfig/20251103-190237-marostegui.json
- 19:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2225.codfw.wmnet with reason: Maintenance
- 19:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T407997)', diff saved to https://phabricator.wikimedia.org/P84667 and previous config saved to /var/cache/conftool/dbconfig/20251103-190214-marostegui.json
- 18:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P84666 and previous config saved to /var/cache/conftool/dbconfig/20251103-184706-marostegui.json
- 18:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
- 18:37 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
- 18:36 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
- 18:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
- 18:36 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 18:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 18:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 18:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
- 18:32 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
- 18:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P84665 and previous config saved to /var/cache/conftool/dbconfig/20251103-183159-marostegui.json
- 18:31 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
- 18:30 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
- 18:30 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 18:30 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 18:29 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 18:29 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 18:22 swfrench@deploy2002: Finished scap sync-world: Backport for Enroll 100% of client sessions in PHP 8.3 (T405955) (duration: 07m 34s)
- 18:17 swfrench@deploy2002: swfrench: Continuing with sync
- 18:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T407997)', diff saved to https://phabricator.wikimedia.org/P84664 and previous config saved to /var/cache/conftool/dbconfig/20251103-181650-marostegui.json
- 18:16 swfrench@deploy2002: swfrench: Backport for Enroll 100% of client sessions in PHP 8.3 (T405955) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:14 swfrench@deploy2002: Started scap sync-world: Backport for Enroll 100% of client sessions in PHP 8.3 (T405955)
- 18:10 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 18:10 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 18:10 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 18:08 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 18:06 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 18:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 18:05 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 18:05 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 18:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2207 (T407997)', diff saved to https://phabricator.wikimedia.org/P84663 and previous config saved to /var/cache/conftool/dbconfig/20251103-180500-marostegui.json
- 18:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance
- 17:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2197.codfw.wmnet with reason: Maintenance
- 17:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T407997)', diff saved to https://phabricator.wikimedia.org/P84662 and previous config saved to /var/cache/conftool/dbconfig/20251103-175448-marostegui.json
- 17:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:47 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:47 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P84661 and previous config saved to /var/cache/conftool/dbconfig/20251103-173940-marostegui.json
- 17:39 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be1088.eqiad.wmnet with OS trixie
- 17:29 _joe_: ran reprepro cleanvanished on apt-staging to try to clean hanging deb file
- 17:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P84660 and previous config saved to /var/cache/conftool/dbconfig/20251103-172433-marostegui.json
- 17:23 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2203.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T407997)', diff saved to https://phabricator.wikimedia.org/P84659 and previous config saved to /var/cache/conftool/dbconfig/20251103-170924-marostegui.json
- 17:07 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker2203.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:00 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 16:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2189 (T407997)', diff saved to https://phabricator.wikimedia.org/P84658 and previous config saved to /var/cache/conftool/dbconfig/20251103-165733-marostegui.json
- 16:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2189.codfw.wmnet with reason: Maintenance
- 16:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T407997)', diff saved to https://phabricator.wikimedia.org/P84657 and previous config saved to /var/cache/conftool/dbconfig/20251103-165709-marostegui.json
- 16:56 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 16:51 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:51 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for CR interfaces eqiad row D vlans - cmooney@cumin1003"
- 16:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for CR interfaces eqiad row D vlans - cmooney@cumin1003"
- 16:49 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:48 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 16:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 16:43 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS trixie
- 16:42 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be1088.eqiad.wmnet with OS trixie
- 16:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P84656 and previous config saved to /var/cache/conftool/dbconfig/20251103-164200-marostegui.json
- 16:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2078.codfw.wmnet with OS bullseye
- 16:36 reedy@deploy2002: Synchronized wmf-config/CommonSettings.php: T404806 (duration: 06m 27s)
- 16:32 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Apply JVM upgrade to 11.0.29 - eevans@cumin1003
- 16:27 topranks: make cr2-eqiad active for row D vlan sub-interfaces on et-1/0/5 T409067
- 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P84655 and previous config saved to /var/cache/conftool/dbconfig/20251103-162649-marostegui.json
- 16:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 16:22 topranks: enable row D vlan sub-interfaces on cr2-eqiad et-1/0/5 T409067
- 16:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2078.codfw.wmnet with reason: host reimage
- 16:18 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 16:12 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2078.codfw.wmnet with reason: host reimage
- 16:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T407997)', diff saved to https://phabricator.wikimedia.org/P84653 and previous config saved to /var/cache/conftool/dbconfig/20251103-161142-marostegui.json
- 16:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS trixie
- 16:04 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be1088.eqiad.wmnet with OS trixie
- 15:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2175 (T407997)', diff saved to https://phabricator.wikimedia.org/P84652 and previous config saved to /var/cache/conftool/dbconfig/20251103-155902-marostegui.json
- 15:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 15:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T407997)', diff saved to https://phabricator.wikimedia.org/P84651 and previous config saved to /var/cache/conftool/dbconfig/20251103-155838-marostegui.json
- 15:57 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Apply JVM upgrade to 11.0.29 - eevans@cumin1003
- 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
- 15:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2078
- 15:54 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2078
- 15:54 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2078.codfw.wmnet with OS bullseye
- 15:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
- 15:51 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on ms-be[2085-2087].codfw.wmnet with reason: awaiting controller swap
- 15:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P84650 and previous config saved to /var/cache/conftool/dbconfig/20251103-154330-marostegui.json
- 15:31 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es2031.codfw.wmnet
- 15:31 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:31 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2031.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 15:31 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2031.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 15:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P84649 and previous config saved to /var/cache/conftool/dbconfig/20251103-152822-marostegui.json
- 15:26 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 15:21 fceratto@cumin1003: START - Cookbook sre.hosts.decommission for hosts es2031.codfw.wmnet
- 15:19 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 15:16 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 15:15 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es2030.codfw.wmnet
- 15:15 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:15 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2030.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 15:14 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2030.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 15:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T407997)', diff saved to https://phabricator.wikimedia.org/P84648 and previous config saved to /var/cache/conftool/dbconfig/20251103-151315-marostegui.json
- 15:05 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 15:05 topranks: enable link from asw2-d7-eqiad to ssw1-d8-eqiad T409067
- 15:03 Lucas_WMDE: UTC afternoon backport+config window done
- 15:03 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for i18n: all behavior switches should start/end with __ (part 2), i18n: Remove deprecated behavior switches without underscores in et/sh-latn/vep (T407289) (duration: 09m 45s)
- 15:02 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS trixie
- 15:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2148 (T407997)', diff saved to https://phabricator.wikimedia.org/P84647 and previous config saved to /var/cache/conftool/dbconfig/20251103-150029-marostegui.json
- 15:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 14:58 lucaswerkmeister-wmde@deploy2002: cscott, lucaswerkmeister-wmde: Continuing with sync
- 14:58 fceratto@cumin1003: START - Cookbook sre.hosts.decommission for hosts es2030.codfw.wmnet
- 14:57 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS bookworm
- 14:56 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es2029.codfw.wmnet
- 14:56 fceratto@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:56 fceratto@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2029.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 14:56 topranks: disable et-1/1/3 on cr2-eqiad connecting to asw2-d-eqiad T409067
- 14:56 lucaswerkmeister-wmde@deploy2002: cscott, lucaswerkmeister-wmde: Backport for i18n: all behavior switches should start/end with __ (part 2), i18n: Remove deprecated behavior switches without underscores in et/sh-latn/vep (T407289) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:55 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS bookworm
- 14:53 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for i18n: all behavior switches should start/end with __ (part 2), i18n: Remove deprecated behavior switches without underscores in et/sh-latn/vep (T407289)
- 14:50 fceratto@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2029.codfw.wmnet decommissioned, removing all IPs except the asset tag one - fceratto@cumin1003"
- 14:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repool db1259', diff saved to https://phabricator.wikimedia.org/P84646 and previous config saved to /var/cache/conftool/dbconfig/20251103-145018-marostegui.json
- 14:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 14:48 topranks: make cr1-eqiad VRRP primary for row D vlans T409067
- 14:47 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for upload: Remove stashed file in UploadFromStash when upload completed (T408610), recentchanges: Fix highlights where more than one action is defined (T409020) (duration: 12m 10s)
- 14:44 fceratto@cumin1003: START - Cookbook sre.dns.netbox
- 14:42 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, matmarex: Continuing with sync
- 14:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P84643 and previous config saved to /var/cache/conftool/dbconfig/20251103-144215-marostegui.json
- 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Cleanup T408408 T408409 T408410', diff saved to https://phabricator.wikimedia.org/P84642 and previous config saved to /var/cache/conftool/dbconfig/20251103-144204-fceratto.json
- 14:39 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
- 14:39 topranks: enable cr1-eqiad sub-interfaces for row D vlans T409067
- 14:39 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, matmarex: Backport for upload: Remove stashed file in UploadFromStash when upload completed (T408610), recentchanges: Fix highlights where more than one action is defined (T409020) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:37 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1007.eqiad.wmnet with reason: schema change
- 14:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 14:35 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for upload: Remove stashed file in UploadFromStash when upload completed (T408610), recentchanges: Fix highlights where more than one action is defined (T409020)
- 14:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1009.eqiad.wmnet with reason: schema change
- 14:34 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Enable pagination on Special:EditWatchlist everywhere (T41510) (duration: 12m 08s)
- 14:29 lucaswerkmeister-wmde@deploy2002: cparle, lucaswerkmeister-wmde: Continuing with sync
- 14:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:28 fceratto@cumin1003: START - Cookbook sre.hosts.decommission for hosts es2029.codfw.wmnet
- 14:26 lucaswerkmeister-wmde@deploy2002: cparle, lucaswerkmeister-wmde: Backport for Enable pagination on Special:EditWatchlist everywhere (T41510) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:22 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Enable pagination on Special:EditWatchlist everywhere (T41510)
- 14:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P84641 and previous config saved to /var/cache/conftool/dbconfig/20251103-142204-marostegui.json
- 14:20 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Revert "Adding Movepage-summary to wgForceUIMsgAsContentMsg to allow" (T183848), Freeze LiquidThreads on huwiki and svwikisource (T406026 T406227) (duration: 14m 16s)
- 14:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) es2031 - Depool es2031 T408410
- 14:16 fceratto@cumin1003: START - Cookbook sre.mysql.depool es2031 - Depool es2031 T408410
- 14:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) es2030 - Depool es2030 T408409
- 14:16 fceratto@cumin1003: START - Cookbook sre.mysql.depool es2030 - Depool es2030 T408409
- 14:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) es2030 - Depool es2030 T408409
- 14:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool es2030 - Depool es2030 T408409
- 14:12 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, esanders, func: Continuing with sync
- 14:10 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, esanders, func: Backport for Revert "Adding Movepage-summary to wgForceUIMsgAsContentMsg to allow" (T183848), Freeze LiquidThreads on huwiki and svwikisource (T406026 T406227) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T407997)', diff saved to https://phabricator.wikimedia.org/P84638 and previous config saved to /var/cache/conftool/dbconfig/20251103-140653-marostegui.json
- 14:05 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Revert "Adding Movepage-summary to wgForceUIMsgAsContentMsg to allow" (T183848), Freeze LiquidThreads on huwiki and svwikisource (T406026 T406227)
- 14:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 13:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 13:56 topranks: shut down cr1-eqiad link to asw2-d-eqiad to migrate traffic via Nokia spines T409067
- 13:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) es2029 - Depool es2029 T408408
- 13:55 fceratto@cumin1003: START - Cookbook sre.mysql.depool es2029 - Depool es2029 T408408
- 13:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1259 (T407997)', diff saved to https://phabricator.wikimedia.org/P84636 and previous config saved to /var/cache/conftool/dbconfig/20251103-135400-marostegui.json
- 13:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
- 13:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T407997)', diff saved to https://phabricator.wikimedia.org/P84635 and previous config saved to /var/cache/conftool/dbconfig/20251103-135336-marostegui.json
- 13:53 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on asw2-d-eqiad,cr[1-2]-eqiad with reason: moving uplinks from CRs to Nokia Spines on asw2-d-eqiad
- 13:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 13:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 13:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P84634 and previous config saved to /var/cache/conftool/dbconfig/20251103-133828-marostegui.json
- 13:33 fceratto@cumin1003: dbctl commit (dc=all): 'Update masters for T402859', diff saved to https://phabricator.wikimedia.org/P84633 and previous config saved to /var/cache/conftool/dbconfig/20251103-133342-fceratto.json
- 13:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P84632 and previous config saved to /var/cache/conftool/dbconfig/20251103-132320-marostegui.json
- 13:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T407997)', diff saved to https://phabricator.wikimedia.org/P84631 and previous config saved to /var/cache/conftool/dbconfig/20251103-130812-marostegui.json
- 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Update masters for T402859', diff saved to https://phabricator.wikimedia.org/P84630 and previous config saved to /var/cache/conftool/dbconfig/20251103-130011-fceratto.json
- 12:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1254 (T407997)', diff saved to https://phabricator.wikimedia.org/P84629 and previous config saved to /var/cache/conftool/dbconfig/20251103-125643-marostegui.json
- 12:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
- 12:55 fceratto@dns1004: END - running authdns-update
- 12:54 fceratto@dns1004: START - running authdns-update
- 12:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 12:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T407997)', diff saved to https://phabricator.wikimedia.org/P84628 and previous config saved to /var/cache/conftool/dbconfig/20251103-124632-marostegui.json
- 12:35 topranks: move analytics1-c-eqiad gateway IPs to new spine switch port cr2-eqiad T405579
- 12:33 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:33 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for analytics1-c-eqiad IPs cr1-eqiad - cmooney@cumin1003"
- 12:33 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for analytics1-c-eqiad IPs cr1-eqiad - cmooney@cumin1003"
- 12:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P84627 and previous config saved to /var/cache/conftool/dbconfig/20251103-123125-marostegui.json
- 12:27 topranks: adjust VRRP priority for analytics1-d-eqiad to make cr1-eqiad active gateway T405579
- 12:26 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 12:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P84626 and previous config saved to /var/cache/conftool/dbconfig/20251103-121617-marostegui.json
- 12:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T407997)', diff saved to https://phabricator.wikimedia.org/P84625 and previous config saved to /var/cache/conftool/dbconfig/20251103-120108-marostegui.json
- 11:58 topranks: move analytics1-c-eqiad gateway IPs to new spine switch ports eqiad T405579
- 11:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1233 (T407997)', diff saved to https://phabricator.wikimedia.org/P84624 and previous config saved to /var/cache/conftool/dbconfig/20251103-114913-marostegui.json
- 11:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 11:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T407997)', diff saved to https://phabricator.wikimedia.org/P84623 and previous config saved to /var/cache/conftool/dbconfig/20251103-114849-marostegui.json
- 11:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P84622 and previous config saved to /var/cache/conftool/dbconfig/20251103-113341-marostegui.json
- 11:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
- 11:28 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 11:27 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 11:18 brouberol@dns1004: END - running authdns-update
- 11:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P84621 and previous config saved to /var/cache/conftool/dbconfig/20251103-111834-marostegui.json
- 11:18 brouberol@dns1004: START - running authdns-update
- 11:10 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-test-worker1001.eqiad.wmnet with OS bullseye
- 11:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T407997)', diff saved to https://phabricator.wikimedia.org/P84620 and previous config saved to /var/cache/conftool/dbconfig/20251103-110326-marostegui.json
- 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1229 (T407997)', diff saved to https://phabricator.wikimedia.org/P84619 and previous config saved to /var/cache/conftool/dbconfig/20251103-110111-marostegui.json
- 11:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 10:52 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
- 10:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 10:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T407997)', diff saved to https://phabricator.wikimedia.org/P84618 and previous config saved to /var/cache/conftool/dbconfig/20251103-105038-marostegui.json
- 10:46 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
- 10:44 marostegui@dns1006: END - running authdns-update
- 10:44 marostegui: Switch m3 (phabricator) proxy to dbproxy1028 T408956
- 10:44 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-worker1001.eqiad.wmnet with reason: host reimage
- 10:44 marostegui@dns1006: START - running authdns-update
- 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84617 and previous config saved to /var/cache/conftool/dbconfig/20251103-104152-root.json
- 10:38 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-worker1001.eqiad.wmnet with reason: host reimage
- 10:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20251103-103527-marostegui.json
- 10:26 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84616 and previous config saved to /var/cache/conftool/dbconfig/20251103-102645-root.json
- 10:22 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host an-test-worker1001.eqiad.wmnet with OS bullseye
- 10:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P84614 and previous config saved to /var/cache/conftool/dbconfig/20251103-102018-marostegui.json
- 10:17 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 60%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84612 and previous config saved to /var/cache/conftool/dbconfig/20251103-101138-root.json
- 10:07 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
- 10:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T407997)', diff saved to https://phabricator.wikimedia.org/P84611 and previous config saved to /var/cache/conftool/dbconfig/20251103-100511-marostegui.json
- 10:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1197 (T407997)', diff saved to https://phabricator.wikimedia.org/P84610 and previous config saved to /var/cache/conftool/dbconfig/20251103-100257-marostegui.json
- 10:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T407997)', diff saved to https://phabricator.wikimedia.org/P84609 and previous config saved to /var/cache/conftool/dbconfig/20251103-100233-marostegui.json
- 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 50%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84608 and previous config saved to /var/cache/conftool/dbconfig/20251103-095632-root.json
- 09:50 moritzm: installing intel-microcode security updates
- 09:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P84607 and previous config saved to /var/cache/conftool/dbconfig/20251103-094726-marostegui.json
- 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 30%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84606 and previous config saved to /var/cache/conftool/dbconfig/20251103-094126-root.json
- 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
- 09:40 elukey@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
- 09:39 elukey@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
- 09:38 elukey@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
- 09:37 elukey@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
- 09:35 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P84605 and previous config saved to /var/cache/conftool/dbconfig/20251103-093218-marostegui.json
- 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 25%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84604 and previous config saved to /var/cache/conftool/dbconfig/20251103-092618-root.json
- 09:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T407997)', diff saved to https://phabricator.wikimedia.org/P84603 and previous config saved to /var/cache/conftool/dbconfig/20251103-091708-marostegui.json
- 09:15 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
- 09:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1188 (T407997)', diff saved to https://phabricator.wikimedia.org/P84602 and previous config saved to /var/cache/conftool/dbconfig/20251103-091452-marostegui.json
- 09:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T407997)', diff saved to https://phabricator.wikimedia.org/P84601 and previous config saved to /var/cache/conftool/dbconfig/20251103-091435-marostegui.json
- 09:11 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 09:11 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 15%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84600 and previous config saved to /var/cache/conftool/dbconfig/20251103-091109-root.json
- 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1174.eqiad.wmnet onto db1231.eqiad.wmnet
- 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1174 gradually with 4 steps - Pool db1174.eqiad.wmnet in after cloning
- 09:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1174 gradually with 4 steps - Pool db1174.eqiad.wmnet in after cloning
- 09:06 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) db1174 gradually with 4 steps - Pool db1174.eqiad.wmnet in after cloning
- 09:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:00 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix uncommitted changes for mwdebug2002 - elukey@cumin1003"
- 08:59 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix uncommitted changes for mwdebug2002 - elukey@cumin1003"
- 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P84599 and previous config saved to /var/cache/conftool/dbconfig/20251103-085925-marostegui.json
- 08:56 elukey@cumin1003: START - Cookbook sre.dns.netbox
- 08:56 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 10%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84598 and previous config saved to /var/cache/conftool/dbconfig/20251103-085600-root.json
- 08:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P84596 and previous config saved to /var/cache/conftool/dbconfig/20251103-084417-marostegui.json
- 08:41 godog: silence wikitech-static icinga alert for a couple of weeks - T409029
- 08:40 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 5%: After moving it to s7', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20251103-084049-root.json
- 08:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T407997)', diff saved to https://phabricator.wikimedia.org/P84594 and previous config saved to /var/cache/conftool/dbconfig/20251103-082909-marostegui.json
- 08:25 marostegui@cumin1003: dbctl commit (dc=all): 'db1231 (re)pooling @ 1%: After moving it to s7', diff saved to https://phabricator.wikimedia.org/P84593 and previous config saved to /var/cache/conftool/dbconfig/20251103-082543-root.json
- 08:20 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1174 gradually with 4 steps - Pool db1174.eqiad.wmnet in after cloning
- 08:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1182 (T407997)', diff saved to https://phabricator.wikimedia.org/P84591 and previous config saved to /var/cache/conftool/dbconfig/20251103-081238-marostegui.json
- 08:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 08:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T407997)', diff saved to https://phabricator.wikimedia.org/P84590 and previous config saved to /var/cache/conftool/dbconfig/20251103-081214-marostegui.json
- 07:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P84589 and previous config saved to /var/cache/conftool/dbconfig/20251103-075706-marostegui.json
- 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'db1177 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84588 and previous config saved to /var/cache/conftool/dbconfig/20251103-075130-root.json
- 07:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P84587 and previous config saved to /var/cache/conftool/dbconfig/20251103-074156-marostegui.json
- 07:36 marostegui@cumin1003: dbctl commit (dc=all): 'db1177 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84586 and previous config saved to /var/cache/conftool/dbconfig/20251103-073624-root.json
- 07:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T407997)', diff saved to https://phabricator.wikimedia.org/P84585 and previous config saved to /var/cache/conftool/dbconfig/20251103-072647-marostegui.json
- 07:25 marostegui@cumin1003: dbctl commit (dc=all): 'Remove es1034 from dbctl T409025', diff saved to https://phabricator.wikimedia.org/P84584 and previous config saved to /var/cache/conftool/dbconfig/20251103-072527-marostegui.json
- 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1162 (T407997)', diff saved to https://phabricator.wikimedia.org/P84583 and previous config saved to /var/cache/conftool/dbconfig/20251103-072431-marostegui.json
- 07:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T407997)', diff saved to https://phabricator.wikimedia.org/P84582 and previous config saved to /var/cache/conftool/dbconfig/20251103-072405-marostegui.json
- 07:23 marostegui@cumin1003: dbctl commit (dc=all): 'db2174 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P84581 and previous config saved to /var/cache/conftool/dbconfig/20251103-072303-root.json
- 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84580 and previous config saved to /var/cache/conftool/dbconfig/20251103-072118-root.json
- 07:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P84579 and previous config saved to /var/cache/conftool/dbconfig/20251103-070853-marostegui.json
- 07:07 marostegui@cumin1003: dbctl commit (dc=all): 'db2174 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P84578 and previous config saved to /var/cache/conftool/dbconfig/20251103-070753-root.json
- 07:06 marostegui@cumin1003: dbctl commit (dc=all): 'db1177 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84577 and previous config saved to /var/cache/conftool/dbconfig/20251103-070612-root.json
- 06:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1177 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84576 and previous config saved to /var/cache/conftool/dbconfig/20251103-065808-marostegui.json
- 06:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1177.eqiad.wmnet with reason: Maintenance
- 06:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P84575 and previous config saved to /var/cache/conftool/dbconfig/20251103-065346-marostegui.json
- 06:52 marostegui@cumin1003: dbctl commit (dc=all): 'db2174 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P84574 and previous config saved to /var/cache/conftool/dbconfig/20251103-065248-root.json
- 06:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T407997)', diff saved to https://phabricator.wikimedia.org/P84573 and previous config saved to /var/cache/conftool/dbconfig/20251103-063838-marostegui.json
- 06:38 marostegui: Drop afl_ip related triggers from s2 T408780
- 06:37 marostegui@cumin1003: dbctl commit (dc=all): 'db2174 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P84572 and previous config saved to /var/cache/conftool/dbconfig/20251103-063742-root.json
- 06:29 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2174 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P84571 and previous config saved to /var/cache/conftool/dbconfig/20251103-062919-marostegui.json
- 06:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 06:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1156 (T407997)', diff saved to https://phabricator.wikimedia.org/P84570 and previous config saved to /var/cache/conftool/dbconfig/20251103-062603-marostegui.json
- 06:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 06:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1174 - Depool db1174.eqiad.wmnet to then clone it to db1231.eqiad.wmnet - marostegui@cumin1003
- 06:20 marostegui@cumin1003: START - Cookbook sre.mysql.depool db1174 - Depool db1174.eqiad.wmnet to then clone it to db1231.eqiad.wmnet - marostegui@cumin1003
- 06:20 marostegui@cumin1003: START - Cookbook sre.mysql.clone of db1174.eqiad.wmnet onto db1231.eqiad.wmnet
- 06:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db[1174,1231].eqiad.wmnet with reason: Moving db1231 to s7
- 06:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1231 T408829', diff saved to https://phabricator.wikimedia.org/P84568 and previous config saved to /var/cache/conftool/dbconfig/20251103-061906-marostegui.json
- 06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
- 04:07 musikanimal@deploy2002: Finished scap sync-world: Backport for AbstractRenderer: ensure OutputPage::setDisplayTitle() gets passed safe HTML (duration: 39m 55s)
- 03:53 musikanimal@deploy2002: musikanimal: Continuing with sync
- 03:52 musikanimal@deploy2002: musikanimal: Backport for AbstractRenderer: ensure OutputPage::setDisplayTitle() gets passed safe HTML synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 03:27 musikanimal@deploy2002: Started scap sync-world: Backport for AbstractRenderer: ensure OutputPage::setDisplayTitle() gets passed safe HTML
- 01:15 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 15m 04s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-02
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 14m 01s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-11-01
- 23:50 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1002-dev.eqiad.wmnet with OS trixie
- 22:27 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage
- 22:22 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudbackup1002-dev.eqiad.wmnet with reason: host reimage
- 22:10 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudbackup1002-dev.eqiad.wmnet with OS trixie
- 11:01 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudbackup1001-dev.eqiad.wmnet with OS trixie
2000s
- Archive 1: 2004 Jun - 2004 Sep
- Archive 2: 2004 Oct - 2004 Nov
- Archive 3: 2004 Dec - 2005 Mar
- Archive 4: 2005 Apr - 2005 Jul
- Archive 5: 2005 Aug - 2005 Oct, with revision history 2004-06-23 to 2005-11-25
- Archive 6: 2005 Nov - 2006 Feb
- Archive 7: 2006 Mar - 2006 Jun
- Archive 8: 2006 Jul - 2006 Sep
- Archive 9: 2006 Oct - 2007 Jan, with revision history 2005-11-25 to 2007-02-21
- Archive 10: 2007 Feb - 2007 Jun
- Archive 11: 2007 Jul - 2007 Dec
- Archive 12: 2008 Jan - 2008 Jul
- Archive 12a: 2008 Aug
- Archive 12b: 2008 Sep
- Archive 13: 2008 Oct - 2009 Jun
- Archive 14: 2009 Jun - 2009 Dec
2010s
- Archive 15: 2010 Jan - 2010 Jun
- Archive 16: 2010 Jul - 2010 Oct
- Archive 17: 2010 Nov - 2010 Dec
- Archive 18: 2011 Jan - 2011 Jun
- Archive 19: 2011 Jul - 2011 Dec
- Archive 20: 2011 Dec - 2012 Jun, with revision history 2007-02-21 to 2012-03-27
- Archive 21: 2012 Jul - 2013 Jan
- Archive 22: 2013 Jan - 2013 Jul
- Archive 23: 2013 Aug - 2013 Dec
- Archive 24: 2014 Jan - 2014 Mar
- Archive 25: 2014 April - 2014 September
- Archive 26: 2014 October - 2014 December
- Archive 27: 2015 January - 2015 July
- Archive 28: 2015 August - 2015 December
- Archive 29: 2016 January - 2016 May
- Archive 30: 2016 June - 2016 August
- Archive 31: 2016 September - 2016 December
- Archive 32: 2017 January - 2017 July
- Archive 33: 2017 August - 2017 December
- Archive 34: 2018 January - 2018 April
- Archive 35: 2018 May - 2018 August
- Archive 36: 2018 September - 2018 December
- Archive 37: 2019 January - 2019 April
- Archive 38: 2019 May - 2019 August
- Archive 39: 2019 September - 2019 December
2020-2024
- Archive 40: 2020 January - 2020 April
- Archive 41: 2020 May - 2020 July
- Archive 42: 2020 August - 2020 November
- Archive 43: 2020 December
- Archive 44: 2021 January - 2021 April
- Archive 45: 2021 May - 2021 July
- Archive 46: 2021 August - 2021 October
- Archive 47: 2021 November - 2021 December
- Archive 48: 2022 January
- Archive 49: 2022 February
- Archive 50: 2022 March
- Archive 51: 2022 April 1-15
- Archive 52: 2022 April 16-30
- Archive 53: 2022 May
- Archive 54: 2022 June
- Archive 55: 2022 July
- Archive 56: 2022 August
- Archive 57: 2022 September
- Archive 58: 2022 October
- Archive 59: 2022 November 1-15
- Archive 60: 2022 November 16-30
- Archive 61: 2022 December
- Archive 62: 2023 January
- Archive 63: 2023 February
- Archive 64: 2023 March
- Archive 65: 2023 April
- Archive 66: 2023 May
- Archive 67: 2023 June
- Archive 68: 2023 July
- Archive 69: 2023 August 1-15
- Archive 70: 2023 August 16-31
- Archive 71: 2023 September
- Archive 72: 2023 October
- Archive 73: 2023 November
- Archive 74: 2023 December
- Archive 75: 2024 January
- Archive 76: 2024 February
- Archive 77: 2024 March
- Archive 78: 2024 April
- Archive 79: 2024 May 1-15
- Archive 80: 2024 May 16-31
- Archive 81: 2024 June 1-15
- Archive 82: 2024 June 16-30
- Archive 83: 2024 July
- Archive 84: 2024 August
- Archive 85: 2024 September
- Archive 86: 2024 October
- Archive 87: 2024 November
- Archive 88: 2024 December