08:48 andrewbogott: rebooting cloudvirt1023 to investigate some nova things
08:10 elukey: rm user.log.1 messages.1 daemon.log.1 kafkatee.log.1 syslog.1 in /var/log on netflow2001 to free space (logs spammed with the same error message over and over)
08:07 elukey: restart kafkatee-webrequest.service on netflow1001 (spamming logs about not being able to bind to address:port)
08:07 elukey: restart fastmon on netflow2001 as an attempt to stop it spamming logs (failed)
08:06 elukey: restart kafkatee-webrequest.service on netflow2001 (spamming logs about not being able to bind to address:port)
07:56 onimisionipe: depool maps1002 for postgres init - T239728
07:55 elukey: execute clear bfd session address fe80::ee38:7300:17e8:a04e on cr3-knams to restore BFD session with eqdfw (OSPF3 status ok on cr3-knams)
06:30 moritzm: installing libice security updates
00:32 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: GrowthExperiments: Begin "initiation test" for suggested edits (T238888) (duration: 00m 55s)
00:21 catrope@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/GrowthExperiments/: GrowthExperiments: record suggestededits pre-activation as a preference (T238888) (duration: 00m 55s)
00:10 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: GrowthExperiments: Align help panel new account enabling with homepage (T232396) (duration: 00m 56s)
2019-12-12
22:48 eileen: process-control config revision is d195531033 jobs temporarily disabled
21:23 arlolra@deploy1001: Started deploy [parsoid/deploy@75d72e8]: Updating Parsoid to 28d7c21
20:54 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Drop wgMediaInfoEnableOtherStatements, wgDepictsQualifierProperties, and wgDisableRollbackConfirmationFeature (duration: 00m 58s)
20:52 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Drop wgSpamBlacklistEventLogging, no longer read (duration: 00m 58s)
20:48 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T240546 Enable the Wikisource extension on all Wikisources except old Wikisource (duration: 00m 57s)
20:46 jforrester@deploy1001: Synchronized static/images/project-logos/wikimaniawiki.png: T240578 Change wikimaniawiki logo back to general version, 1x (duration: 00m 56s)
20:45 jforrester@deploy1001: Synchronized static/images/project-logos/wikimaniawiki-1.5x.png: T240578 Change wikimaniawiki logo back to general version, 1.5x (duration: 00m 55s)
20:43 jforrester@deploy1001: Synchronized static/images/project-logos/wikimaniawiki-2x.png: T240578 Change wikimaniawiki logo back to general version, 2x (duration: 00m 56s)
20:20 cdanis@cumin2001: dbctl commit (dc=all): 'T229686 add sections es1/es2/es3/x1 and their instances', diff saved to https://phabricator.wikimedia.org/P9866 and previous config saved to /var/cache/conftool/dbconfig/20191212-202023-cdanis.json
20:18 cdanis: T229686 adding sections es1/es2/es3/x1 to dbctl's section data
20:18 dduvall@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.10
20:17 cdanis: T229686 adding instances backing es1/es2/es3/x1 to dbctl's instance data
20:14 ejegg: updated fundraising internal dashboard from cc6d5cdde7 to 1105bf1796
20:02 onimisionipe: pool maps1001 - postgres re-init is complete - T239728
16:43 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/Wikibase/lib/includes/Store/Sql/SqlEntityInfoBuilder.php: Consider any type of empty value as uncached in SqlEntityInfoBuilder (T237984) (duration: 01m 03s)
16:39 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Wikibase/lib/includes/Store/Sql/SqlEntityInfoBuilder.php: Consider any type of empty value as uncached in SqlEntityInfoBuilder (T237984) (duration: 01m 03s)
06:27 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db2070 from config as it will be decommissioned T239684', diff saved to https://phabricator.wikimedia.org/P9848 and previous config saved to /var/cache/conftool/dbconfig/20191211-062700-marostegui.json
21:48 shdubsh: restart prometheus on prometheus2003 -- T238807
21:45 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@aa65057]: Update mobileapps to f9771ab
21:32 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@08cfd70]: Set location of ivy cache for spark (duration: 00m 24s)
21:32 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@08cfd70]: Set location of ivy cache for spark
19:18 Urbanecm: Morning SWAT done
19:18 Urbanecm: Purge several logo files (T150618)
19:18 Urbanecm: Run namespaceDupes.php for eswikisource (T240050)
19:16 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 32da89f: Upload HD logos for en, fi and nl arbcom wikis (2/2, T150618) (duration: 01m 00s)
19:14 urbanecm@deploy1001: Synchronized static/images/project-logos/: SWAT: 32da89f: Upload HD logos for en, fi and nl arbcom wikis (1/2, T150618) (duration: 01m 01s)
19:07 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: f984d18: Add aliases for Help and Project on eswikisource (T240050) (duration: 01m 00s)
19:01 onimisionipe: continue osm-import on maps1004 - T239728
18:37 herron: enabling lvs for kibana-next elk7 upgrade environment; if any alerts fire relating to this, please disregard them
23:44 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T235263 Turn off redirect on exact search match for Commons (duration: 01m 00s)
23:04 ebernhardson: [cloudelastic-chi] reduce indices.recovery.max_bytes_per_sec from 512mb->128mb
22:30 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
19:35 mutante: Icinga: delete all downtimes for mw2259. Scheduling Icinga downtimes is tricky business. If you add some for a hardware failure and they are too short, you cause Icinga spam; if they are too long and the dcops operator is amazingly fast like Papaul, then your server is back in production but not monitored, and you have to click a million times in the web UI to remove them to avoid that.
19:34 bblack: ns1.wikimedia.org: re-route authdns traffic from authdns2001 (to be reimaged) -> dns2001 temporarily - T239667
16:47 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@87b25f2]: initial airflow dags/plugins
16:40 brion: running `requeueTranscodes.php --error --throttle` on mwmaint1002 to clean up T239831-related broken video transcodes. will raise usage on video scalers for a while.
16:33 elukey: execute clear bfd session address fe80::5e5e:ab00:d3d:85ce on cr3-knams
16:32 elukey: execute clear bfd session address fe80::7a4f:9b00:d4e:8004 on cr1-eqiad
16:20 elukey: execute clear bfd session address 208.80.154.208 on cr2-eqord
08:09 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1062 from etcd T239188', diff saved to https://phabricator.wikimedia.org/P9821 and previous config saved to /var/cache/conftool/dbconfig/20191205-080909-marostegui.json
08:03 elukey: remove logstash_cleanup_indices_apifeatureusage-search.svc.codfw.wmnet and logstash_cleanup_indices_apifeatureusage-search.svc.eqiad.wmnet from logstash1025,logstash1024,logstash1023,logstash2024,logstash2025 to reduce cronspam - T234854
07:42 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1099:3311, db1099:3318', diff saved to https://phabricator.wikimedia.org/P9820 and previous config saved to /var/cache/conftool/dbconfig/20191205-074200-marostegui.json
07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1099:3311, db1099:3318', diff saved to https://phabricator.wikimedia.org/P9819 and previous config saved to /var/cache/conftool/dbconfig/20191205-073209-marostegui.json
07:29 _joe_: ran apt-get install manually on kubestagetcd1001 to fix broken packages
07:25 _joe_: manually running package_builder_Clean_up_build_directory.service on boron
07:23 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1099:3311, db1099:3318', diff saved to https://phabricator.wikimedia.org/P9818 and previous config saved to /var/cache/conftool/dbconfig/20191205-072314-marostegui.json
07:22 _joe_: umounting /proc,/sys,/dev from /var/cache/pbuilder/build/cow.6815 on boron to allow reaping it away
07:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1099:3311, db1099:3318', diff saved to https://phabricator.wikimedia.org/P9817 and previous config saved to /var/cache/conftool/dbconfig/20191205-071445-marostegui.json
07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3311, db1099:3318 for upgrade', diff saved to https://phabricator.wikimedia.org/P9816 and previous config saved to /var/cache/conftool/dbconfig/20191205-070631-marostegui.json
06:55 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1101:3317, db1101:3318', diff saved to https://phabricator.wikimedia.org/P9815 and previous config saved to /var/cache/conftool/dbconfig/20191205-065536-marostegui.json
06:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
06:48 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1101:3317, db1101:3318', diff saved to https://phabricator.wikimedia.org/P9814 and previous config saved to /var/cache/conftool/dbconfig/20191205-064845-marostegui.json
06:31 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1101:3317, db1101:3318', diff saved to https://phabricator.wikimedia.org/P9813 and previous config saved to /var/cache/conftool/dbconfig/20191205-063103-marostegui.json
06:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1101:3317, db1101:3318', diff saved to https://phabricator.wikimedia.org/P9812 and previous config saved to /var/cache/conftool/dbconfig/20191205-061453-marostegui.json
05:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101:3317 for upgrade', diff saved to https://phabricator.wikimedia.org/P9811 and previous config saved to /var/cache/conftool/dbconfig/20191205-055756-marostegui.json
03:37 twentyafterfour: leaving phabricator on phab1003 for tonight while phab1001 raid syncs, will pick it up tomorrow to decide where to go from here
00:43 cwhite@cumin1001: dbctl commit (dc=all): 'Depool db1062 T239874', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20191205-004256-cwhite.json
2019-12-04
23:38 brennen@deploy1001: rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.35.0-wmf.5"
23:35 brennen@deploy1001: Scap failed!: 9/11 canaries failed their endpoint checks(http://en.wikipedia.org)
22:39 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@e6afe36]: Update mobileapps to 9e9b042
22:39 bstorm_: powered off cloudstore1008, disabled sync from cloudstore1009, and downtimed both cloudstore1008 and cloudstore1009 for memory module replacement T239569
22:37 bstorm_: poweroff cloudstore1008 for memory module replacement
22:24 RoanKattouw: T208369 ran mwscript extensions/GrowthExperiments/maintenance/deleteOldSurveys.php kowiki --cutoff 350
22:21 RoanKattouw: T208369 ran mwscript extensions/GrowthExperiments/maintenance/deleteOldSurveys.php cswiki --cutoff 350
21:48 eileen: civicrm revision changed from 6812488f3a to 4a72ad4e63, config revision is 9f4db1edad (CiviCRM security patches)
18:24 arlolra@deploy1001: Started deploy [parsoid/deploy@0910e18]: Updating Parsoid to b81bbf4
18:08 bblack: dns1002: back to normal state
18:05 bblack: dns1002: stopping recursive dns to test failure theory (same method as pre-reimaging earlier, intended to not cause impact)
17:54 bblack: dns1001: back to normal state
17:51 bblack: dns1001: stopping recursive dns to test failure theory (same method as pre-reimaging earlier, intended to not cause impact)
17:50 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikibase/repo/includes/ParserOutput/FullEntityParserOutputGenerator.php: T229407, part III (duration: 01m 01s)
15:29 moritzm: installing mariadb 10.3 updates from Buster 10.2 point release (client libs/tools only)
15:28 mobrovac@deploy1001: Finished deploy [restbase/deploy@f4b752e]: Parsoid: Set title when sending html2html reqs; Mirror 6% of html2html reqs to Parsoid/PHP - T239768 T239643 (duration: 16m 02s)
15:26 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
15:15 ejegg: disabled debug logging for Ingenico on payments-wiki
15:12 mobrovac@deploy1001: Started deploy [restbase/deploy@f4b752e]: Parsoid: Set title when sending html2html reqs; Mirror 6% of html2html reqs to Parsoid/PHP - T239768 T239643
15:09 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikibase/repo/includes/ParserOutput/FullEntityParserOutputGenerator.php: T229407, part II (duration: 01m 02s)
15:07 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
14:53 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2135 as master for s10 in codfw', diff saved to https://phabricator.wikimedia.org/P9806 and previous config saved to /var/cache/conftool/dbconfig/20191204-145349-marostegui.json
14:51 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db2135 in m5 codfw', diff saved to https://phabricator.wikimedia.org/P9805 and previous config saved to /var/cache/conftool/dbconfig/20191204-145145-marostegui.json
06:13 mutante: phab1001 - running rsync of /srv/repos with --delete because it's larger than the source by about 5GB - deleting objects to match phab1003, former prod server. now both 50G (T238956)
06:04 marostegui: Depool labsdb1011
06:01 mutante: rsyncing /srv/repos data once again. pulling from phab1003 to phab1001 (T238956)
05:51 marostegui: Deploy schema change on s3 primary master (db1123)
04:59 mutante: removed downtime for phabricator.wikimedia.org meta service (paging)
04:58 mutante: phabricator maintenance ended for today - now running on phab1001 (buster)
04:58 mutante: install1002 - restarted isc-dhcpd
04:39 mutante: phab1001 - rebooting for BIOS config change
02:06 mutante: re-enabling puppet on phab1003 and phab1001.. switching active_server for puppet
2019-12-03
22:34 volans: disabled temporarily icinga meta-monitoring (disk full on the wikitech-static host)
22:34 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable the Wikisource extension on frwikisource T239731 (duration: 01m 00s)
22:22 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Read wmgDoNotRedirectOnSearchMatch to decide to enable auto-redirect search result change T235263 (duration: 01m 00s)
22:21 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set wmgDoNotRedirectOnSearchMatch, default off, on for Test Commons T235263 (duration: 01m 01s)
22:03 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set wgXmlDumpSchemaVersion to 0.1.0 everywhere T238921 T174031 (duration: 01m 03s)
21:40 eileen: civicrm revision changed from 26b788378e to 0f51030071, config revision is 17b6730a72 - includes 3 possible performance improvements - logging reduction, cache a query result & cache file existence
20:00 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@c21a1ca]: Bump preq version for better logging around MW API timeouts (duration: 05m 46s)
19:54 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@c21a1ca]: Bump preq version for better logging around MW API timeouts
19:53 ejegg: shifted 20 more sec / cycle from donations QC to thank you mailer
19:41 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
19:38 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
17:07 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@498c3d1]: repair bulk daemon swift listings
16:52 bblack: reimaging dns3002 + dns5002
16:30 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/MachineVision: Remove slow result randomization from the suggestions query (duration: 01m 03s)
16:02 ejegg: reduced donations queue consumer 10 sec per cycle and increased TY mail sender 10 sec per cycle
15:54 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
15:44 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
13:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1112 after schema change', diff saved to https://phabricator.wikimedia.org/P9802 and previous config saved to /var/cache/conftool/dbconfig/20191203-133231-marostegui.json
05:50 ema: cp3050: ats-be restart with proxy.config.http.server_session_sharing.pool=thread T238494
05:47 marostegui: Remove ar_comment triggers from s3 db1124:3313 - T234704
05:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1112 for schema change', diff saved to https://phabricator.wikimedia.org/P9798 and previous config saved to /var/cache/conftool/dbconfig/20191203-054528-marostegui.json
03:53 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@d00c6ad]: Fix: Apply language headers to zhwiki mobile-html responses (T239659) (duration: 05m 51s)
03:47 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@d00c6ad]: Fix: Apply language headers to zhwiki mobile-html responses (T239659)
02:54 mutante: mw1269 restarted nginx, php
02:48 mutante: mw1320, mw1321 restarted php-fpm
02:32 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T78711 Display 'twice a month' or 'once a month' on cached reports (duration: 01m 19s)
02:25 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Stop setting testwiki => true for wmgUseCentralAuth, already implied by default (duration: 01m 24s)
02:19 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T237698 Set wmgUseDynamicPageList, less cryptic form of wmgUseDPL (duration: 01m 16s)
02:16 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Stop setting wgTorLoadNodes, not read for a while (duration: 01m 14s)
02:13 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Stop setting wgGEHelpPanelSearchEnabled, no longer used (duration: 01m 08s)
02:04 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T239091 Enable Translate extension on sewikimedia, second try (duration: 01m 24s)
01:58 jforrester@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/VisualEditor/: T239209 Sanitize HTML on paste (duration: 01m 33s)
01:55 jforrester@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/VisualEditor/: T239209 Sanitize HTML on paste (duration: 01m 24s)
01:42 dzahn@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
00:25 jforrester@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Echo/includes/DiscussionParser.php: T239275 Fix type hint fatal from getUserLinks() (duration: 01m 16s)
00:11 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
00:09 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
2019-12-02
22:44 bblack: reimaging dns4002 to buster - T239667
22:07 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/MachineVision: Update text for no personal uploads message (T238873) (duration: 01m 03s)
22:05 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
22:03 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
22:01 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
21:25 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
21:23 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
21:22 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
20:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1078 after schema change', diff saved to https://phabricator.wikimedia.org/P9796 and previous config saved to /var/cache/conftool/dbconfig/20191202-205904-marostegui.json
19:22 mobrovac@deploy1001: Finished deploy [restbase/deploy@e69e2e5] (dev-cluster): Switch everything but enwiki to Parsoid/PHP (duration: 06m 38s)
19:16 mobrovac@deploy1001: Started deploy [restbase/deploy@e69e2e5] (dev-cluster): Switch everything but enwiki to Parsoid/PHP
19:04 mobrovac@deploy1001: Finished deploy [restbase/deploy@6a24685]: Parsoid Proxy: Direct html2html traffic to JS; Stop honouring the variant header; Switch sr and zh wikis to PHP - T229015 (duration: 14m 11s)
18:58 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
18:50 mobrovac@deploy1001: Started deploy [restbase/deploy@6a24685]: Parsoid Proxy: Direct html2html traffic to JS; Stop honouring the variant header; Switch sr and zh wikis to PHP - T229015
15:30 mobrovac@deploy1001: Finished deploy [restbase/deploy@d6d5a6e]: Parsoid Proxy: Do not use the fall-back for linting transforms - T239607 (duration: 14m 51s)
15:26 effie: Rolling restart mw1345-1348
15:15 mobrovac@deploy1001: Started deploy [restbase/deploy@d6d5a6e]: Parsoid Proxy: Do not use the fall-back for linting transforms - T239607
14:46 ema: cp-ats: set server_session_sharing.match=2 everywhere (puppet re-enable and run) T238494
13:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1078 for schema change', diff saved to https://phabricator.wikimedia.org/P9794 and previous config saved to /var/cache/conftool/dbconfig/20191202-135643-marostegui.json
13:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1075 after schema change', diff saved to https://phabricator.wikimedia.org/P9793 and previous config saved to /var/cache/conftool/dbconfig/20191202-135543-marostegui.json
05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1075 for schema change', diff saved to https://phabricator.wikimedia.org/P9791 and previous config saved to /var/cache/conftool/dbconfig/20191202-055546-marostegui.json
05:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3318 for compression', diff saved to https://phabricator.wikimedia.org/P9790 and previous config saved to /var/cache/conftool/dbconfig/20191202-055245-marostegui.json
2019-12-01
23:27 ladsgroup@deploy1001: Started restart [mobileapps/deploy@70154b4]: Rolling restart of mobileapps
23:20 bblack: restarting AQS services in eqiad
23:15 eileen: process-control config revision is 9750c318a0 - jobs disabled
21:39 andrewbogott: restarted nova conductor and api on cloudcontrol1003 and 1004 to free up db connections (T239168)
2019-11-30
15:47 Urbanecm: Reset email of SUL user Hayk.arabaget (T239462)
10:48 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
10:47 elukey@deploy1001: Finished deploy [analytics/refinery@97015e4] (thin): Deploy thin Analytics Refinery (no jars/git-fat-obj) to notebook and labstore hosts (duration: 00m 08s)
10:47 elukey@deploy1001: Started deploy [analytics/refinery@97015e4] (thin): Deploy thin Analytics Refinery (no jars/git-fat-obj) to notebook and labstore hosts
05:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1134 after schema change', diff saved to https://phabricator.wikimedia.org/P9781 and previous config saved to /var/cache/conftool/dbconfig/20191129-055845-marostegui.json
2019-11-28
18:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1134 for schema change', diff saved to https://phabricator.wikimedia.org/P9780 and previous config saved to /var/cache/conftool/dbconfig/20191128-183918-marostegui.json
18:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1106 after schema change', diff saved to https://phabricator.wikimedia.org/P9779 and previous config saved to /var/cache/conftool/dbconfig/20191128-180517-marostegui.json
17:54 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
13:31 marostegui: Remove ar_comment triggers from db1124:3311 for enwiki.archive - T234704
13:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1106 for schema change, temporarily pool db1080 as vslow,dump', diff saved to https://phabricator.wikimedia.org/P9778 and previous config saved to /var/cache/conftool/dbconfig/20191128-133013-marostegui.json
13:28 volans: cleanup root's crontab entries on netmon hosts from netbox/postgres stuff - T238919
13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1118 after schema change', diff saved to https://phabricator.wikimedia.org/P9777 and previous config saved to /var/cache/conftool/dbconfig/20191128-132647-marostegui.json
06:02 marostegui: Remove db2067 from tendril and zarcillo T233185
05:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1118 for schema change', diff saved to https://phabricator.wikimedia.org/P9776 and previous config saved to /var/cache/conftool/dbconfig/20191128-055212-marostegui.json
05:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1119 after schema change', diff saved to https://phabricator.wikimedia.org/P9775 and previous config saved to /var/cache/conftool/dbconfig/20191128-055025-marostegui.json
03:03 vgutierrez: restarting keyholder on acmechief[12]001
01:41 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
01:38 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
2019-11-27
19:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119 for schema change', diff saved to https://phabricator.wikimedia.org/P9773 and previous config saved to /var/cache/conftool/dbconfig/20191127-193528-marostegui.json
19:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1080 after schema change', diff saved to https://phabricator.wikimedia.org/P9772 and previous config saved to /var/cache/conftool/dbconfig/20191127-193227-marostegui.json
19:32 ebernhardson@deploy1001: Finished deploy [search/airflow@45b7790]: Allow airflow virtualenv to import system site packages to facilitate libmysqlclient (duration: 00m 45s)
19:31 ebernhardson@deploy1001: Started deploy [search/airflow@45b7790]: Allow airflow virtualenv to import system site packages to facilitate libmysqlclient
19:08 ebernhardson@deploy1001: Started deploy [search/airflow@57f4caa]: Install mysqlclient to airflow instance
19:00 mutante: an-airflow1001: cd /etc/ ; chown airflow airflow; systemctl start airflow-webserver to let airflow write unittests.cfg (it tries to write this on first start and did not have permissions to do so) T236180
18:58 mutante: an-airflow1001: cd /etc/ ; chown airflow airflow; systemctl start airflow-webserver to let airflow write unittests.cfg
18:57 eileen: process-control config revision is b95355c0c0 - repair omnirecipient job off
16:32 cdanis@deploy1001: Synchronized wmf-config/PoolCounterSettings.php: dd4c76d3d SpecialContributions: max concurrency 3 (instead of 10) T234450 (duration: 01m 17s)
16:22 ejegg: shifted daily silverpop export start time one hour earlier
16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1080 for schema change', diff saved to https://phabricator.wikimedia.org/P9768 and previous config saved to /var/cache/conftool/dbconfig/20191127-161525-marostegui.json
16:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1089 after schema change', diff saved to https://phabricator.wikimedia.org/P9767 and previous config saved to /var/cache/conftool/dbconfig/20191127-161450-marostegui.json
16:06 ema: cp3050: set proxy.config.http.server_session_sharing.match to "ip" T238494
15:57 _joe_: restarting pybal on lvs1015
15:56 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
15:04 ema: cp-ats: rolling ats-{tls,backend} restart to enable lua reload T233274
15:02 moritzm: remove trapperkeeper-webserver-jetty9-clojure debs from apt.wikimedia.org/buster-wikimedia (these were needed to unbreak TLS on Puppetdb in Buster, but an update landed in Buster 10.2, which replaces our custom hotfix)
14:56 marostegui: Add new grants for nova_cell0 database on m5 - T239170
14:50 marostegui: Create nova_cell0 database on m5 master - T239170
14:43 effie: reimage mw1346, mw1336, mw1326
14:35 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
13:57 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set all of testwikidatawiki to read from the new term store for items (T225057) (duration: 00m 56s)
13:44 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
13:44 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
13:42 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
13:42 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
13:42 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
13:42 ema: cp1075: repool with tslua reloads enabled T233274
13:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1089 for schema change', diff saved to https://phabricator.wikimedia.org/P9766 and previous config saved to /var/cache/conftool/dbconfig/20191127-132359-marostegui.json
13:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3311 after schema change', diff saved to https://phabricator.wikimedia.org/P9765 and previous config saved to /var/cache/conftool/dbconfig/20191127-132220-marostegui.json
06:25 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
06:05 marostegui: Promote db2135 to codfw m5 master T238183
06:02 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Add db2135 to the config T238183 (duration: 00m 59s)
06:01 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Add db2135 to the config T238183 (duration: 01m 11s)
05:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2125 T239042', diff saved to https://phabricator.wikimedia.org/P9759 and previous config saved to /var/cache/conftool/dbconfig/20191127-054809-marostegui.json
05:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3311 for schema change', diff saved to https://phabricator.wikimedia.org/P9758 and previous config saved to /var/cache/conftool/dbconfig/20191127-054056-marostegui.json
2019-11-26
20:05 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@2b713d6]: Partition CirrusSearchElasticaWrite jobs T230495
19:59 Pchelolo: create partitioned topics for cirrusSearchElasticaWrite on kafka-main T239135
19:57 Urbanecm: Reset email of TheklanBot (T239233)
19:46 brennen@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.35.0-wmf.8
19:39 brennen@deploy1001: Finished scap: testwiki to php-1.35.0-wmf.8 and rebuild l10n cache (duration: 32m 52s)
19:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1099:3311 after schema change', diff saved to https://phabricator.wikimedia.org/P9753 and previous config saved to /var/cache/conftool/dbconfig/20191126-192724-marostegui.json
19:22 shdubsh: restore codfw logstash to baseline - T215904
19:09 shdubsh: stop logstash codfw, generate some consumer lag, and set batch size to 2000 - T215904
19:07 ebernhardson@deploy1001: Finished deploy [search/airflow@6ab2cd1]: Align deploy groups in scap.cfg and checks.yaml (duration: 00m 29s)
19:07 ebernhardson@deploy1001: Started deploy [search/airflow@6ab2cd1]: Align deploy groups in scap.cfg and checks.yaml
19:06 brennen@deploy1001: Started scap: testwiki to php-1.35.0-wmf.8 and rebuild l10n cache
16:36 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Drop Scribunto special-case for HHVM, never reached T235142 (duration: 00m 52s)
12:07 XioNoX: power down mr1-esams for replacement - T238174
11:36 elukey: reboot stat1007
11:35 marostegui: Deploy schema change on db1139:3311
11:35 effie: enable puppet on mw canary servers, and restart apaches
10:50 hashar: Updated jenkins job operations-puppet-tests-stretch-docker to use latest Docker container
10:30 godog: swift eqiad-prod: add ms-be105[7-9] - T237438
10:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3311 for schema change', diff saved to https://phabricator.wikimedia.org/P9749 and previous config saved to /var/cache/conftool/dbconfig/20191126-102442-marostegui.json
10:08 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1087 into s8 vslow,dump', diff saved to https://phabricator.wikimedia.org/P9748 and previous config saved to /var/cache/conftool/dbconfig/20191126-092409-marostegui.json
09:18 marostegui: Run maintain-views for wikidatawiki.protected_title view on labsdb hosts T233135
07:53 mobrovac@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Parsoid: Switch Flow to Parsoid/PHP on mw.org -- T229015 (duration: 00m 52s)
07:43 mobrovac@deploy1001: Finished deploy [restbase/deploy@378f504]: Do not use duplicate filter definitions T234266 (duration: 14m 24s)
07:29 mobrovac@deploy1001: Started deploy [restbase/deploy@378f504]: Do not use duplicate filter definitions T234266
07:28 mobrovac@deploy1001: Finished deploy [restbase/deploy@378f504] (dev-cluster): Do not use duplicate filter definitions (duration: 07m 36s)
07:21 mobrovac@deploy1001: Started deploy [restbase/deploy@378f504] (dev-cluster): Do not use duplicate filter definitions
07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1061 from config - T238624', diff saved to https://phabricator.wikimedia.org/P9745 and previous config saved to /var/cache/conftool/dbconfig/20191126-071746-marostegui.json
06:51 marostegui: Run compare.py for db2125 - T239042
06:44 marostegui: Remove triggers for ar_comment on db1124:3318 T234704
06:43 marostegui: Deploy schema change on db1087 with replication, lag will be generated on s8 for labsdb hosts
06:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1087 from vslow, and pool db1092 temporarily as vslow,dump for s8, for a schema change on db1087', diff saved to https://phabricator.wikimedia.org/P9744 and previous config saved to /var/cache/conftool/dbconfig/20191126-064200-marostegui.json
06:34 XioNoX: Rename cr2-knams to cr3-knams - T237030
06:01 marostegui@cumin2001: dbctl commit (dc=all): 'Promote db1086 on s7 master and remove read-only from s7 T238044', diff saved to https://phabricator.wikimedia.org/P9743 and previous config saved to /var/cache/conftool/dbconfig/20191126-060108-marostegui.json
06:00 marostegui@cumin2001: dbctl commit (dc=all): 'Set s7 as read-only for maintenance T238044', diff saved to https://phabricator.wikimedia.org/P9742 and previous config saved to /var/cache/conftool/dbconfig/20191126-060023-marostegui.json
06:00 marostegui: Starting s7 failover from db1062 to db1086 - T238044
05:49 marostegui: Deploy schema change on dbstore1003:3311
05:10 marostegui@cumin1001: dbctl commit (dc=all): 'Set weight 0 to db1086 as it will be the new s7 master - T238044', diff saved to https://phabricator.wikimedia.org/P9741 and previous config saved to /var/cache/conftool/dbconfig/20191126-051034-marostegui.json
05:08 marostegui: Start pre-steps for s7 failover - T238044
2019-11-25
14:38 marostegui: Remove triggers from archive table on s1 codfw sanitarium T234704
14:37 marostegui: Deploy schema change on s1 codfw (this will generate lag on codfw) - T234066 T233135
14:23 moritzm: upgrading OpenJDK 11 on an-conf*
14:04 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
13:27 elukey: set global read_only=1 on db1108's log database - T159170
13:16 XioNoX: cleanup config on cr3-esams - T237031
13:15 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
13:11 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
13:06 XioNoX: cleanup config on cr2-esams - T237031
13:02 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
12:59 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
12:48 XioNoX: bundle esams-knams links on knams side - T237031
12:42 XioNoX: bundle esams-knams links on esams side - T237031
12:27 XioNoX: disable BGP to knams transits - T237031
11:48 marostegui@cumin1001: dbctl commit (dc=all): 'Increase main traffic weight for db1126', diff saved to https://phabricator.wikimedia.org/P9735 and previous config saved to /var/cache/conftool/dbconfig/20191125-114821-marostegui.json
11:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1126 after schema change', diff saved to https://phabricator.wikimedia.org/P9734 and previous config saved to /var/cache/conftool/dbconfig/20191125-114733-marostegui.json
11:40 effie: cumin -b 2 -s 10 restart php on API servers
11:31 effie: restart php-fpm on mw1314
11:16 Urbanecm: EU SWAT done
11:16 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/AbuseFilter/extension.json: SWAT: 29a16bd: Restrict viewing Special:Log/AbuseFilter, and remove from recent changes (T34959) (duration: 01m 04s)
11:06 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 9394f1f: Allow enwikiversity interface admins to remove their own interface administratorship (T238967) (duration: 00m 57s)
09:45 moritzm: installing cron updates from buster point release
09:32 moritzm: installing systemd security/bugfix updates on buster
09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126 - schema change', diff saved to https://phabricator.wikimedia.org/P9732 and previous config saved to /var/cache/conftool/dbconfig/20191125-093157-marostegui.json
09:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1104 after schema change', diff saved to https://phabricator.wikimedia.org/P9731 and previous config saved to /var/cache/conftool/dbconfig/20191125-093038-marostegui.json
2019-11-22
17:30 shdubsh: clean tombstones on prometheus1004 - T238807
17:09 shdubsh: restart prometheus on prometheus1004 - T238807
16:22 shdubsh: clean tombstones on prometheus1003 - T238807
15:40 XioNoX: renumber AS17639 sessions in eqsin
15:16 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikibase/repo/: Stop outputting anything in case of 304 responses in Special:EntityData (T238901) (duration: 00m 57s)
14:49 _joe_: disabling puppet on restbase2018, testing envoy upgrade T238050
14:48 _joe_: uploaded envoyproxy 1.12.1 to {buster,stretch} T237235
13:11 Amir1: start of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https (T238119 T238524 T237375 T238120)
09:27 gehel: depool wdqs1007 to allow to catch up on lag - T238229
09:23 reedy@deploy1001: Synchronized php-1.35.0-wmf.5/includes/specials/pagers/ContribsPager.php: Remove live hack of limit for T234450 (duration: 00m 54s)
09:04 gehel: remove blazegraph 2.1.5-wmf.11 from archiva, broken upload
08:54 gehel: restarting blazegraph and updater on wdqs1007
08:54 gehel: restarting blazegraph and updater on edqs1007
08:49 ema: depool cp1081 and reimage as text_ats T227432
06:31 marostegui@cumin1001: dbctl commit (dc=all): 'Rebalance weights on s7 in preparation for s7 failover on Tuesday T238044', diff saved to https://phabricator.wikimedia.org/P9722 and previous config saved to /var/cache/conftool/dbconfig/20191122-063145-marostegui.json
03:49 shdubsh: restart prometheus@ops on prometheus1003 T238807
00:46 mutante: xhgui1001/xhgui2001 - rsyncing /srv/mongod from tungsten to /srv/tungsten/mongod/ on both new machines (T158837)
00:37 mutante: tungsten - starting ferm service
00:20 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Move newcomer tasks JSON config from mw.org to local wikis (T237301) (duration: 00m 52s)
00:18 catrope@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/GrowthExperiments/: Make non-remote titles work in RemotePageConfigurationLoader (T237301) (duration: 00m 54s)
2019-11-21
11:08 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: e4861ec: Set correct language for shywiktionary (T238105) (duration: 00m 52s)
11:06 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 68d2003: Restrict editing CNBanner namespace to autoconfirmed on metawiki (T238723) (duration: 00m 54s)
11:05 effie: disable puppet on mw[1-2]*
10:49 volans: restarting tcpircbot-logmsgbot on icinga1001, has failed to log some messages, no useful log on the host
10:22 ema: pool cp2023 with Varnish backend T238817 T227432
10:18 arturo: update buster-wikimedia thirdparty/kubeadm-k8s packages (newer version will be used to handle T238654)
09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1090:331{2,7} after upgrade', diff saved to https://phabricator.wikimedia.org/P9714 and previous config saved to /var/cache/conftool/dbconfig/20191121-095401-marostegui.json
09:39 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1090:331{2,7} after upgrade', diff saved to https://phabricator.wikimedia.org/P9713 and previous config saved to /var/cache/conftool/dbconfig/20191121-093958-marostegui.json
09:39 ema: depool cp2023 and reimage back as varnish-be T238817 T227432
09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1090:3312, db1090:3317 for upgrade', diff saved to https://phabricator.wikimedia.org/P9712 and previous config saved to /var/cache/conftool/dbconfig/20191121-092554-marostegui.json
09:08 akosiaris@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
09:06 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1079 after upgrade', diff saved to https://phabricator.wikimedia.org/P9711 and previous config saved to /var/cache/conftool/dbconfig/20191121-090623-marostegui.json
09:03 akosiaris@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
08:58 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
08:56 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1079 after upgrade', diff saved to https://phabricator.wikimedia.org/P9710 and previous config saved to /var/cache/conftool/dbconfig/20191121-085644-marostegui.json
08:53 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
08:45 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1079 after upgrade', diff saved to https://phabricator.wikimedia.org/P9709 and previous config saved to /var/cache/conftool/dbconfig/20191121-084500-marostegui.json
08:33 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079 after upgrade', diff saved to https://phabricator.wikimedia.org/P9708 and previous config saved to /var/cache/conftool/dbconfig/20191121-083322-marostegui.json
08:21 marostegui: Upgrade db1079
08:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1079 for upgrade', diff saved to https://phabricator.wikimedia.org/P9707 and previous config saved to /var/cache/conftool/dbconfig/20191121-082108-marostegui.json
07:56 marostegui: Promote db2133 to codfw m2 master - T238183
07:25 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1086 after upgrade', diff saved to https://phabricator.wikimedia.org/P9706 and previous config saved to /var/cache/conftool/dbconfig/20191121-072543-marostegui.json
07:18 marostegui: Upgrade db1125 (sanitarium)
07:17 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1086 after upgrade', diff saved to https://phabricator.wikimedia.org/P9705 and previous config saved to /var/cache/conftool/dbconfig/20191121-071758-marostegui.json
06:24 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1086 after upgrade', diff saved to https://phabricator.wikimedia.org/P9704 and previous config saved to /var/cache/conftool/dbconfig/20191121-062412-marostegui.json
06:17 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1086 after upgrade', diff saved to https://phabricator.wikimedia.org/P9703 and previous config saved to /var/cache/conftool/dbconfig/20191121-061711-marostegui.json
05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1086 for upgrade', diff saved to https://phabricator.wikimedia.org/P9702 and previous config saved to /var/cache/conftool/dbconfig/20191121-055557-marostegui.json
05:53 marostegui: Compress db2073
00:41 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config does not seem to be applying on half the app servers, resyncing (duration: 00m 52s)
2019-11-20
16:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1103:3314 after compression', diff saved to https://phabricator.wikimedia.org/P9695 and previous config saved to /var/cache/conftool/dbconfig/20191120-160813-marostegui.json
16:03 gehel: depool wdqs1004 to allow catching up on lag - T238229
15:42 mobrovac@deploy1001: Synchronized wmf-config/LabsServices.php: [BETA-ONLY] Switch Flow to use Parsoid/PHP - T229078 (duration: 00m 52s)
15:40 ema@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
11:38 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: f847380: Set namespace alias for Index: (NS 102/103) for elwikisource (T237253) (duration: 00m 54s)
10:34 ema: depool cp2010 and reimage as text_ats T227432
10:30 marostegui: Upgrade db1116
10:22 mobrovac@deploy1001: Started deploy [restbase/deploy@daa7808]: Revert switching test2.wp to Parsoid/JS - T238716
10:17 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1094', diff saved to https://phabricator.wikimedia.org/P9694 and previous config saved to /var/cache/conftool/dbconfig/20191120-101727-marostegui.json
10:14 marostegui: Compress db2095:3314
10:07 mobrovac@deploy1001: Finished deploy [restbase/deploy@c677063]: Switch test2.wp back to Parsoid/JS temporarily - T238716 (duration: 14m 54s)
09:56 marostegui: Compress db2106
09:52 mobrovac@deploy1001: Started deploy [restbase/deploy@c677063]: Switch test2.wp back to Parsoid/JS temporarily - T238716
09:48 marostegui: Compress dbstore1005:3318
09:47 marostegui: Compress dbstore1004:3314
09:33 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1094 after upgrade', diff saved to https://phabricator.wikimedia.org/P9693 and previous config saved to /var/cache/conftool/dbconfig/20191120-093308-marostegui.json
09:23 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1094 after upgrade', diff saved to https://phabricator.wikimedia.org/P9692 and previous config saved to /var/cache/conftool/dbconfig/20191120-092337-marostegui.json
09:07 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1094 after upgrade', diff saved to https://phabricator.wikimedia.org/P9691 and previous config saved to /var/cache/conftool/dbconfig/20191120-090739-marostegui.json
08:55 marostegui: Upgrade db1094
08:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1094 for upgrade', diff saved to https://phabricator.wikimedia.org/P9690 and previous config saved to /var/cache/conftool/dbconfig/20191120-085448-marostegui.json
08:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
07:43 marostegui: Promote db2132 as m1-codfw master - T238183
07:19 marostegui: Upgrade db2062
07:19 marostegui: Upgrade db2078
07:14 marostegui: Deploy schema change on s3 (testwikidatawiki) directly on s3 primary master T237120
07:05 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1136', diff saved to https://phabricator.wikimedia.org/P9688 and previous config saved to /var/cache/conftool/dbconfig/20191120-070511-marostegui.json
06:57 marostegui@cumin1001: dbctl commit (dc=all): 'Give more traffic to db1136', diff saved to https://phabricator.wikimedia.org/P9687 and previous config saved to /var/cache/conftool/dbconfig/20191120-065718-marostegui.json
06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1136 into s7 api', diff saved to https://phabricator.wikimedia.org/P9686 and previous config saved to /var/cache/conftool/dbconfig/20191120-064022-marostegui.json
06:36 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1136 after upgrade', diff saved to https://phabricator.wikimedia.org/P9685 and previous config saved to /var/cache/conftool/dbconfig/20191120-063628-marostegui.json
06:28 marostegui: Upgrade db1136
06:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1136 for upgrade', diff saved to https://phabricator.wikimedia.org/P9684 and previous config saved to /var/cache/conftool/dbconfig/20191120-062749-marostegui.json
06:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1101:3317 after upgrade', diff saved to https://phabricator.wikimedia.org/P9683 and previous config saved to /var/cache/conftool/dbconfig/20191120-062029-marostegui.json
06:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3314 for compression', diff saved to https://phabricator.wikimedia.org/P9682 and previous config saved to /var/cache/conftool/dbconfig/20191120-061938-marostegui.json
05:58 marostegui: Stop MySQL on db1101:3317, db1101:3318 for upgrade and schema change
05:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101:3317 and db1101:3318 for upgrade and schema change', diff saved to https://phabricator.wikimedia.org/P9681 and previous config saved to /var/cache/conftool/dbconfig/20191120-055732-marostegui.json
05:55 marostegui: Depool labsdb1011 for upgrade
05:54 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db1105:3311 db1097:3314 db1098:3316 db1098:3317 after compression', diff saved to https://phabricator.wikimedia.org/P9680 and previous config saved to /var/cache/conftool/dbconfig/20191120-055426-marostegui.json
05:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1092 after schema change', diff saved to https://phabricator.wikimedia.org/P9679 and previous config saved to /var/cache/conftool/dbconfig/20191120-054840-marostegui.json
03:16 tgr: T208369 ran mwscript extensions/GrowthExperiments/maintenance/deleteOldSurveys.php kowiki --cutoff 350
00:10 mutante: phab2001 - restart ssh-phab service after repooling it after buster reinstall, it wasn't listening on the IPv6 IP, causing LVS/pybal alerts
00:06 catrope@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/GrowthExperiments/: Pass token as editing_session_id for suggested edits (T238249) (duration: 00m 53s)
2019-11-19
20:09 mutante: phab1003 after merging gerrit:551910 puppet now also stopped the actual aphlict service and removed the systemd unit file. had to manually run 'systemctl reset-failed' though to clean systemd status and avoid icinga alert (T238593)
20:07 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
19:08 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@6e6bd42]: Prevent expensive content transforms from blocking the event loop (T229286) (duration: 06m 49s)
19:01 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@6e6bd42]: Prevent expensive content transforms from blocking the event loop (T229286)
19:00 elukey: regenerate TLS cert for yarn.wikimedia.org (containing SANs for all analytics UIs) to add datasets.w.o SAN (site was failing due to ATS not being able to contact thorium)
18:59 rlazarus: restarted php7.2-fpm on wtp2001, wtp2002
18:56 rlazarus: restarted php7.2-fpm on wtp1025, wtp1026
16:21 mutante: phab1003 - puppet restarts aphlict service even with "phabricator_aphlict_enabled: false" in Hiera. But it does properly remove the proxy config lines from apache. so service is running but not used. (T238593)
16:17 mutante: phab1003 - systemctl stop aphlict (proxy config in apache is disabled as well as disabled in ATS) (T238593)
16:15 gehel: reloading data from wdqs1007 to wdqs1004 - after failed test of merging updater - T212826
13:57 marostegui: Deploy schema change on metawiki directly on s7 master T238370
13:57 marostegui: Deploy schema change on mediawikiwiki directly on s7 master T238370
13:55 marostegui: Deploy schema change on mediawikiwiki directly on s3 master T238370
13:50 marostegui: Deploy schema change on foundationwiki directly on s3 master - T238370
13:46 marostegui: Deploy schema change on labswiki (wikitech) - T238370
13:39 marostegui: Deploy schema change on db1092
13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1092 for schema change', diff saved to https://phabricator.wikimedia.org/P9673 and previous config saved to /var/cache/conftool/dbconfig/20191119-133850-marostegui.json
13:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1101:3318 after schema change', diff saved to https://phabricator.wikimedia.org/P9672 and previous config saved to /var/cache/conftool/dbconfig/20191119-133704-marostegui.json
13:34 ema@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
11:16 gehel: restarting wdqs updater on wdqs1004 - T231411
10:36 marostegui: Compress and upgrade db1098:3316
10:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 for upgrade and compression', diff saved to https://phabricator.wikimedia.org/P9671 and previous config saved to /var/cache/conftool/dbconfig/20191119-103540-marostegui.json
10:34 marostegui: Compress and upgrade db1098:3317
10:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3317 for upgrade and compression', diff saved to https://phabricator.wikimedia.org/P9670 and previous config saved to /var/cache/conftool/dbconfig/20191119-103426-marostegui.json
10:29 marostegui: Upgrade db2077
10:24 marostegui: Upgrade db2120 db2121 db2122
10:10 marostegui: Upgrade MySQL on db2086 db2087 db2100
10:06 godog: repool centrallog2001
09:40 effie: disable puppet on P:mediawiki::php - T229792
09:08 marostegui: Deploy schema change on db1101:3318
09:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101:3318 for schema change', diff saved to https://phabricator.wikimedia.org/P9669 and previous config saved to /var/cache/conftool/dbconfig/20191119-090823-marostegui.json
09:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1099:3318 after schema change', diff saved to https://phabricator.wikimedia.org/P9668 and previous config saved to /var/cache/conftool/dbconfig/20191119-090745-marostegui.json
09:05 marostegui: Repool labsdb1010
07:33 mobrovac@deploy1001: Synchronized wmf-config/CommonSettings-labs.php: Enable math links in Beta - T208758 (duration: 00m 53s)
06:26 vgutierrez: Move cp1089 from nginx to ats-tls - T231627
06:20 marostegui: Depool labsdb1010 for upgrade
06:02 marostegui@cumin2001: dbctl commit (dc=all): 'Promote db1131 to s6 master and remove read-only from s6 T235469', diff saved to https://phabricator.wikimedia.org/P9667 and previous config saved to /var/cache/conftool/dbconfig/20191119-060203-marostegui.json
06:01 marostegui@cumin2001: dbctl commit (dc=all): 'Set s6 as read-only for maintenance T235469', diff saved to https://phabricator.wikimedia.org/P9666 and previous config saved to /var/cache/conftool/dbconfig/20191119-060122-marostegui.json
06:01 marostegui: Starting s6 failover from db1061 to db1131 - T235469
05:37 eileen: process control - I reverted the above to check some stuff first
05:36 vgutierrez: Move cp1087 from nginx to ats-tls - T231627
05:26 marostegui: Deploy schema change on db1099:3318
05:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3318 for schema change', diff saved to https://phabricator.wikimedia.org/P9665 and previous config saved to /var/cache/conftool/dbconfig/20191119-052632-marostegui.json
05:25 marostegui: Compress db1097:3314
05:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3314 for compression', diff saved to https://phabricator.wikimedia.org/P9664 and previous config saved to /var/cache/conftool/dbconfig/20191119-052412-marostegui.json
05:17 vgutierrez: Move cp1085 from nginx to ats-tls - T231627
05:14 marostegui: Compress tables on db1105:3311
05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3311 for compression', diff saved to https://phabricator.wikimedia.org/P9663 and previous config saved to /var/cache/conftool/dbconfig/20191119-051344-marostegui.json
05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3312 after compression', diff saved to https://phabricator.wikimedia.org/P9662 and previous config saved to /var/cache/conftool/dbconfig/20191119-051259-marostegui.json
05:12 eileen: process-control config revision is 9fbfc79988 - change gap on repair job to 16 hours to reflect the with-daylight-savings ones
05:07 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1131 with weight 0 T235469 ', diff saved to https://phabricator.wikimedia.org/P9661 and previous config saved to /var/cache/conftool/dbconfig/20191119-050748-marostegui.json
00:39 mutante: phab2001 - rsyncing /srv/repos data from phab1003 (T190568)
00:30 mutante: rebooting phab2001
2019-11-18
23:52 catrope@deploy1001: Finished scap: Update GrowthExperiments to master in wmf.5 (includes i18n) (duration: 19m 57s)
23:37 mutante: phab2001 - restart ssh-phab service after reimaging (some race condition binding to the IP before getting it on the interface after fresh install), reschedule pybal checks (T190568)
23:32 catrope@deploy1001: Started scap: Update GrowthExperiments to master in wmf.5 (includes i18n)
22:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
21:39 arlolra@deploy1001: Started deploy [parsoid/deploy@c6a457f]: Updating Parsoid to 2245b8f
20:59 mutante: phab1003 - re-enabling puppet after merging gerrit::551271 - making sure aphlict stays disabled incl. the apache config ProxyPass lines using mod_proxy_wstunnel (T238593)
20:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1098:3316 after some compression', diff saved to https://phabricator.wikimedia.org/P9659 and previous config saved to /var/cache/conftool/dbconfig/20191118-202259-marostegui.json
18:21 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@582d394]: New WDQS build with merging updater (duration: 13m 27s)
18:07 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@582d394]: New WDQS build with merging updater
17:44 cdanis: rebooting grafana1002 (currently test host not used in prod)
17:08 marostegui: Deploy schema change on db1116:3318
16:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3316 for compression', diff saved to https://phabricator.wikimedia.org/P9658 and previous config saved to /var/cache/conftool/dbconfig/20191118-165410-marostegui.json
16:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1096:3316 after compression', diff saved to https://phabricator.wikimedia.org/P9656 and previous config saved to /var/cache/conftool/dbconfig/20191118-164923-marostegui.json
15:13 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set MCR migration stage to NEW on remaining wikis for T198312 (duration: 00m 53s)
06:12 vgutierrez: Move cp2012 from nginx to ats-tls - T231627
06:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 for compression', diff saved to https://phabricator.wikimedia.org/P9652 and previous config saved to /var/cache/conftool/dbconfig/20191118-060508-marostegui.json
06:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3312 for compression', diff saved to https://phabricator.wikimedia.org/P9651 and previous config saved to /var/cache/conftool/dbconfig/20191118-060207-marostegui.json
06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2072, db2088:3311, db2087:3316, db2086:3317 after maintenances and schema changes', diff saved to https://phabricator.wikimedia.org/P9650 and previous config saved to /var/cache/conftool/dbconfig/20191118-060114-marostegui.json
2019-11-15
11:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1113:3315 into vslow,dump after schema change', diff saved to https://phabricator.wikimedia.org/P9645 and previous config saved to /var/cache/conftool/dbconfig/20191115-112520-marostegui.json
11:19 marostegui: Reboot dbproxy2002
11:15 marostegui: Reboot dbproxy2004
11:12 marostegui: Reboot dbproxy2001
10:45 marostegui: Run maintain-views for s5 on labsdb1011 T233135
09:50 ema: depool cp3062 and reimage as text_ats T227432
09:47 vgutierrez: Use a synthetic warning for 1% of TLSv1/TLSv1.1 pageviews - T238038
09:18 vgutierrez: Move cp1079 from nginx to ats-tls - T231627
09:13 gehel@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)
09:02 vgutierrez: Move cp1077 from nginx to ats-tls - T231627
08:42 vgutierrez: Move cp2006 from nginx to ats-tls - T231627
08:30 vgutierrez: Move cp2004 from nginx to ats-tls - T231627
06:41 marostegui: Stop MySQL on db2065 to clone db2134 (this will trigger an haproxy irc alert) - T238183
06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1113:3315 for schema change and temporary pool db1082 into vslow,dump', diff saved to https://phabricator.wikimedia.org/P9643 and previous config saved to /var/cache/conftool/dbconfig/20191115-060807-marostegui.json
06:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2088:3311 for compression', diff saved to https://phabricator.wikimedia.org/P9642 and previous config saved to /var/cache/conftool/dbconfig/20191115-060425-marostegui.json
06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1103:3312 db1082 after schema changes', diff saved to https://phabricator.wikimedia.org/P9641 and previous config saved to /var/cache/conftool/dbconfig/20191115-060300-marostegui.json
05:57 marostegui: Run maintain-views for s5 on labsdb1009, labsdb1010, labsdb1012 (pending labsdb1011 as it is still running the schema change) T233135
05:07 vgutierrez: Move cp3064 from nginx to ats-tls - T231627
04:38 volker-e@deploy1001: Started deploy [design/style-guide@2ad7b1a]: Deploy design/style-guide:
04:17 vgutierrez: Move cp3062 from nginx to ats-tls - T231627
04:00 vgutierrez: Move cp3060 from nginx to ats-tls - T231627
01:35 tstarling@deploy1001: Synchronized php-1.35.0-wmf.5/includes/Rest/Handler/CompareHandler.php: deploying REST compare section feature because iOS team need it for a beta release due very soon (duration: 00m 53s)
01:33 tstarling@deploy1001: Synchronized php-1.35.0-wmf.5/includes/Rest/coreRoutes.json: deploying REST compare section feature because iOS team need it for a beta release due very soon (duration: 00m 52s)
01:32 tstarling@deploy1001: Synchronized php-1.35.0-wmf.5/includes/parser/Parser.php: deploying REST compare section feature because iOS team need it for a beta release due very soon (duration: 00m 54s)
2019-11-14
23:03 mutante: restarting gerrit to increase defaultThreadPoolSize to 2
18:31 mutante: phabricator (phab1003, prod server) - upgrade PHP version to 7.2.24 (T237239)
18:17 cdanis@cumin2001: dbctl commit (dc=all): 'alias wikitech section to new s10 section T233236', diff saved to https://phabricator.wikimedia.org/P9638 and previous config saved to /var/cache/conftool/dbconfig/20191114-181732-cdanis.json
17:46 robh: running dell epsa tool on cp3056 per T236497
17:35 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
17:35 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'coredns' .
17:35 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
17:22 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
17:22 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'coredns' .
17:22 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
14:53 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/MachineVision: Fix bug when looking up entity for an unknown ID (duration: 00m 53s)
14:48 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set MCR migration stage to NEW on group1 for T198312 (duration: 00m 53s)
14:27 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
09:55 gehel: depool wdqs (public) eqiad - high lag - T238229
09:34 ema: depool cp3058 and reimage as text_ats T227432
09:31 marostegui: Compare wikidatawiki.pagelinks between labsdb1011 and labsdb1010 - T233986
09:25 moritzm: installing ghostscript updates on thumbor1001
09:24 marostegui: Stop mysql on db2067 to clone db2133 - T238183
09:20 marostegui@cumin1001: dbctl commit (dc=all): 'Full weight to db1089 on special groups for s1 T223151', diff saved to https://phabricator.wikimedia.org/P9635 and previous config saved to /var/cache/conftool/dbconfig/20191114-092006-marostegui.json
09:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
09:05 marostegui: Compare wikidatawiki.pagelinks between db1124:3318 and labsdb1010 - T233986
08:42 marostegui: Remove ar_comment from triggers on db1124:3315 - T234704
08:41 marostegui: Deploy schema change with replication on db1082, this will generate lag on s5 labs - T233135 T234066
08:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1082 for schema change', diff saved to https://phabricator.wikimedia.org/P9634 and previous config saved to /var/cache/conftool/dbconfig/20191114-084043-marostegui.json
08:38 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
08:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1110 after schema change', diff saved to https://phabricator.wikimedia.org/P9633 and previous config saved to /var/cache/conftool/dbconfig/20191114-083729-marostegui.json
08:03 eileen: process-control config revision is 6adc66a20b re-enable backfill
08:00 marostegui@cumin1001: dbctl commit (dc=all): 'Pool a non partitioned slave db1089 on special groups for s1 T223151', diff saved to https://phabricator.wikimedia.org/P9632 and previous config saved to /var/cache/conftool/dbconfig/20191114-080038-marostegui.json
07:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3312 T235599', diff saved to https://phabricator.wikimedia.org/P9631 and previous config saved to /var/cache/conftool/dbconfig/20191114-075449-marostegui.json
07:41 eileen: process-control config revision is b7c2cf7227 - disabled backfill again - some error?
07:29 eileen: process-control config revision is 909108622d re-enable omnirecipient date repair job
07:25 eileen: process-control config revision is d3ebeddcc1 (I re-enabled the old backfill job)
07:12 moritzm: installing intel-microcode updates
06:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1067', diff saved to https://phabricator.wikimedia.org/P9630 and previous config saved to /var/cache/conftool/dbconfig/20191114-065309-marostegui.json
06:16 marostegui: Stop replication on db1067
06:01 marostegui@cumin2001: dbctl commit (dc=all): 'Promote db1083 to s1 master and remove read-only from s1 T234800', diff saved to https://phabricator.wikimedia.org/P9629 and previous config saved to /var/cache/conftool/dbconfig/20191114-060138-marostegui.json
06:00 marostegui@cumin2001: dbctl commit (dc=all): 'Set s1 as read-only for maintenance T234800', diff saved to https://phabricator.wikimedia.org/P9628 and previous config saved to /var/cache/conftool/dbconfig/20191114-060026-marostegui.json
06:00 marostegui: Starting s1 failover from db1067 to db1083 - T234800
05:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110 for schema change', diff saved to https://phabricator.wikimedia.org/P9627 and previous config saved to /var/cache/conftool/dbconfig/20191114-052400-marostegui.json
05:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1130 after schema change', diff saved to https://phabricator.wikimedia.org/P9626 and previous config saved to /var/cache/conftool/dbconfig/20191114-052303-marostegui.json
05:13 marostegui: Move replicas from db1067 to db1083 T234800
05:09 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1083 with weight 0 T234800', diff saved to https://phabricator.wikimedia.org/P9625 and previous config saved to /var/cache/conftool/dbconfig/20191114-050940-marostegui.json
2019-11-13
15:29 gehel: configuration of new elasticsearch servers completed, all working and pooled - T230746
14:55 jynus@cumin1001: dbctl commit (dc=all): 'Depool db2072', diff saved to https://phabricator.wikimedia.org/P9620 and previous config saved to /var/cache/conftool/dbconfig/20191113-145541-jynus.json
13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1130 for schema change', diff saved to https://phabricator.wikimedia.org/P9619 and previous config saved to /var/cache/conftool/dbconfig/20191113-134938-marostegui.json
13:46 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1089 after upgrade', diff saved to https://phabricator.wikimedia.org/P9618 and previous config saved to /var/cache/conftool/dbconfig/20191113-134625-marostegui.json
13:34 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089 after upgrade', diff saved to https://phabricator.wikimedia.org/P9617 and previous config saved to /var/cache/conftool/dbconfig/20191113-133410-marostegui.json
13:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1089 for upgrade', diff saved to https://phabricator.wikimedia.org/P9616 and previous config saved to /var/cache/conftool/dbconfig/20191113-132216-marostegui.json
13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1096:3315', diff saved to https://phabricator.wikimedia.org/P9615 and previous config saved to /var/cache/conftool/dbconfig/20191113-131530-marostegui.json
11:56 effie: Upgrade to php 7.2.24-1 mediawiki eqiad hosts and restart php-fpm - T237239
11:55 ema: cp-ats: rolling trafficserver (8.0.5-1wm10) and fifo-log-demux (0.6) upgrade and restart
11:46 moritzm: rebooting cloudcontrol2001-dev for microcode debugging
11:45 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
11:04 urbanecm@deploy1001: Synchronized static/images/project-logos/: SWAT: 0a90ef9: Update localized logos for the Fula Wikipedia (T238191) (duration: 00m 54s)
10:53 vgutierrez: Testing ats-tls-restart on cp5007 - T237425
10:43 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1083', diff saved to https://phabricator.wikimedia.org/P9613 and previous config saved to /var/cache/conftool/dbconfig/20191113-104326-marostegui.json
10:32 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1083', diff saved to https://phabricator.wikimedia.org/P9612 and previous config saved to /var/cache/conftool/dbconfig/20191113-103225-marostegui.json
10:27 gehel: start configuration of new elasticsearch servers - T230746
10:20 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1083 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P9610 and previous config saved to /var/cache/conftool/dbconfig/20191113-102054-marostegui.json
10:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1083 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P9609 and previous config saved to /var/cache/conftool/dbconfig/20191113-101127-marostegui.json
09:51 jynus: upgraded wmf-mariadb101-client on cumin hosts
09:50 mobrovac@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'mathoid' for release 'production' .
09:43 mobrovac@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'mathoid' for release 'production' .
09:41 mobrovac@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'mathoid' for release 'staging' .
07:09 marostegui: Fix replication on labsdb1010 - T233986
07:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3315 for schema change', diff saved to https://phabricator.wikimedia.org/P9607 and previous config saved to /var/cache/conftool/dbconfig/20191113-070339-marostegui.json
07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2086:3317 for compression', diff saved to https://phabricator.wikimedia.org/P9606 and previous config saved to /var/cache/conftool/dbconfig/20191113-070055-marostegui.json
06:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2087:3317 after compression', diff saved to https://phabricator.wikimedia.org/P9605 and previous config saved to /var/cache/conftool/dbconfig/20191113-065952-marostegui.json
06:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1097:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P9604 and previous config saved to /var/cache/conftool/dbconfig/20191113-065823-marostegui.json
2019-11-12
23:57 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/MachineVision: Fix: Do not return after inserting a single suggestion (duration: 00m 52s)
19:43 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Sync a previously undeployed change to InitialiseSettings-labs.php that someone forgot to deploy (as a no-op) in production (duration: 00m 52s)
19:41 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set MCR migration stage to NEW on group0 for T198312 (duration: 00m 52s)
16:09 ema: depool cp3052 and observe performance impact T238085 before reimaging as text_ats T227432
15:49 marostegui: Deploy schema change on db1102:3315 T233135 T234066
15:45 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/MachineVision: Fixes and tweaks for initial rollout (duration: 00m 53s)
15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3315 for a schema change T233135 T234066', diff saved to https://phabricator.wikimedia.org/P9600 and previous config saved to /var/cache/conftool/dbconfig/20191112-154127-marostegui.json
14:16 Jeff_Green: authdns-update to deploy fundraising-read.wmnet service cname adjustment
14:01 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Revert "Set all of wikidata for write both for term store" (duration: 00m 52s)
11:05 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SECURITY: Don't allow Wikimedia sysops to see who had 2FA disabled (duration: 00m 53s)
10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1083', diff saved to https://phabricator.wikimedia.org/P9599 and previous config saved to /var/cache/conftool/dbconfig/20191112-104400-marostegui.json
10:36 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1083', diff saved to https://phabricator.wikimedia.org/P9598 and previous config saved to /var/cache/conftool/dbconfig/20191112-103641-marostegui.json
10:35 onimisionipe: resetting cronfile on wdqs hosts
10:33 marostegui: Drop labtestwiki database from m5 master db1133 - T236010
10:30 marostegui: Deploy schema change on dbstore1003:3315
10:07 ema: repool cp3065, nothing interesting in kern.log and SEL T238032
09:52 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1083', diff saved to https://phabricator.wikimedia.org/P9596 and previous config saved to /var/cache/conftool/dbconfig/20191112-095221-marostegui.json
09:42 marostegui: Remove privileges for labtestwiki on m5 - T236010
09:27 gehel: restarting blazegraph on wdqs1004
09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1083', diff saved to https://phabricator.wikimedia.org/P9595 and previous config saved to /var/cache/conftool/dbconfig/20191112-091706-marostegui.json
09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1083 for mariadb upgrade to 10.1.39 - T234800', diff saved to https://phabricator.wikimedia.org/P9594 and previous config saved to /var/cache/conftool/dbconfig/20191112-091158-marostegui.json
09:11 marostegui: Upgrade mariadb to 10.1.39 on db1083 (candidate master for s1)
08:56 moritzm: restarting archiva to pick up Java security updates
08:24 volker-e@deploy1001: Started deploy [design/style-guide@b926b95]: Deploy design/style-guide:
08:15 moritzm: installing curl security updates
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic to db1083', diff saved to https://phabricator.wikimedia.org/P9592 and previous config saved to /var/cache/conftool/dbconfig/20191112-081322-marostegui.json
07:40 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic to db1083', diff saved to https://phabricator.wikimedia.org/P9591 and previous config saved to /var/cache/conftool/dbconfig/20191112-074006-marostegui.json
07:36 elukey: remove /etc/logrotate.d/wdqs_autodeployment_log from wdqs1009 (not in puppet anymore and causing cronspam)
07:28 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1083 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P9590 and previous config saved to /var/cache/conftool/dbconfig/20191112-072823-marostegui.json
07:10 marostegui: Upgrade kernel on db1083 (s1 candidate master)
07:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1083 for kernel upgrade - T234800', diff saved to https://phabricator.wikimedia.org/P9589 and previous config saved to /var/cache/conftool/dbconfig/20191112-070436-marostegui.json
06:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
01:06 volker-e@deploy1001: Started deploy [design/style-guide@97fb3ee]: Deploy design/style-guide:
2019-11-08
20:26 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Delay annotation request jobs by 5 mins for testing (duration: 00m 52s)
16:54 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
16:15 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Revert "MachineVision: Enable testers-only mode on testcommonswiki for debugging" (duration: 00m 54s)
15:57 jynus@cumin1001: dbctl commit (dc=all): 'Pool db1118, db1106 at 100%', diff saved to https://phabricator.wikimedia.org/P9582 and previous config saved to /var/cache/conftool/dbconfig/20191108-155700-jynus.json
15:37 herron: beginning rolling service restarts on logstash hosts for java security updates
15:13 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Enable testers-only mode on testcommonswiki for debugging (duration: 00m 52s)
14:56 volans@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
14:50 jynus@cumin1001: dbctl commit (dc=all): 'Pool db1118 at 50%', diff saved to https://phabricator.wikimedia.org/P9581 and previous config saved to /var/cache/conftool/dbconfig/20191108-145028-jynus.json
14:42 volans@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
13:40 jynus: stop and upgrade percona-server on test host db1114
13:27 elukey@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)
13:12 jynus@cumin1001: dbctl commit (dc=all): 'Pool db1118 at 20%', diff saved to https://phabricator.wikimedia.org/P9580 and previous config saved to /var/cache/conftool/dbconfig/20191108-131257-jynus.json
13:09 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: ee2027c: Change the language of Votewiki back to English (en) (T230614) (duration: 00m 54s)
12:14 jynus@cumin1001: dbctl commit (dc=all): 'Pool db1118 at 10%', diff saved to https://phabricator.wikimedia.org/P9578 and previous config saved to /var/cache/conftool/dbconfig/20191108-121444-jynus.json
12:02 jynus: update and restart db1118
12:01 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1118 fully', diff saved to https://phabricator.wikimedia.org/P9577 and previous config saved to /var/cache/conftool/dbconfig/20191108-120138-jynus.json
11:55 jynus@cumin1001: dbctl commit (dc=all): 'Pool db1118 at 20%', diff saved to https://phabricator.wikimedia.org/P9576 and previous config saved to /var/cache/conftool/dbconfig/20191108-115553-jynus.json
11:27 jynus@cumin1001: dbctl commit (dc=all): 'Pool db1118 at 50%', diff saved to https://phabricator.wikimedia.org/P9575 and previous config saved to /var/cache/conftool/dbconfig/20191108-112733-jynus.json
11:25 jynus@cumin1001: dbctl commit (dc=all): 'repool db2130', diff saved to https://phabricator.wikimedia.org/P9574 and previous config saved to /var/cache/conftool/dbconfig/20191108-112503-jynus.json
11:12 jynus: update and restart db2130
11:11 jynus@cumin1001: dbctl commit (dc=all): 'Repool db2116, depool db2130', diff saved to https://phabricator.wikimedia.org/P9573 and previous config saved to /var/cache/conftool/dbconfig/20191108-111125-jynus.json
10:58 Amir1: running rebuildItemTerms on 8028 items (T234329)
10:51 jynus: update and restart db2116
10:50 jynus@cumin1001: dbctl commit (dc=all): 'Repool db2103, depool db2116', diff saved to https://phabricator.wikimedia.org/P9572 and previous config saved to /var/cache/conftool/dbconfig/20191108-105013-jynus.json
10:38 jynus: update and restart db2103
10:34 jeh: enable IPMI `racadm set iDRAC.IPMILan.Enable 1` on cloudcephmon[1-3] T228102
10:33 jeh: enable IPMI `racadm set iDRAC.IPMILan.Enable 1` on cloudcephosd[1-3] T224188
10:32 jynus@cumin1001: dbctl commit (dc=all): 'Repool db2092, depool db2103', diff saved to https://phabricator.wikimedia.org/P9571 and previous config saved to /var/cache/conftool/dbconfig/20191108-103218-jynus.json
10:19 jynus: update and restart db2092
10:18 jynus@cumin1001: dbctl commit (dc=all): 'Repool db2071, depool db2092', diff saved to https://phabricator.wikimedia.org/P9570 and previous config saved to /var/cache/conftool/dbconfig/20191108-101759-jynus.json
10:09 elukey: restart jvm-based hadoop daemons on an-master100[1,2] to pick up the new openjdk version
10:06 jynus: update and restart db2071
10:03 jynus@cumin1001: dbctl commit (dc=all): 'Depool db2071', diff saved to https://phabricator.wikimedia.org/P9569 and previous config saved to /var/cache/conftool/dbconfig/20191108-100310-jynus.json
10:01 jynus@cumin1001: dbctl commit (dc=all): 'Repool db2072', diff saved to https://phabricator.wikimedia.org/P9568 and previous config saved to /var/cache/conftool/dbconfig/20191108-100128-jynus.json
09:50 moritzm: uploaded openjdk 8u232-b09-1~deb10u1 to component/jdk8 for apt.wikimedia.org/buster-wikimedia
09:41 jynus: update and restart db2072
09:41 jynus@cumin1001: dbctl commit (dc=all): 'Depool db2072', diff saved to https://phabricator.wikimedia.org/P9567 and previous config saved to /var/cache/conftool/dbconfig/20191108-094100-jynus.json
09:39 jynus@cumin1001: dbctl commit (dc=all): 'Repool db1106 at 50%', diff saved to https://phabricator.wikimedia.org/P9566 and previous config saved to /var/cache/conftool/dbconfig/20191108-093958-jynus.json
09:35 elukey@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0)
09:29 jynus: update and restart db2094
09:27 jynus@cumin1001: dbctl commit (dc=all): 'Repool db1106 at 10%', diff saved to https://phabricator.wikimedia.org/P9565 and previous config saved to /var/cache/conftool/dbconfig/20191108-092735-jynus.json
09:10 jynus: update and restart db1106
09:08 moritzm: installing Java security updates on kafka-jumbo
09:07 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1106 fully', diff saved to https://phabricator.wikimedia.org/P9564 and previous config saved to /var/cache/conftool/dbconfig/20191108-090746-jynus.json
09:04 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1106 at 10%', diff saved to https://phabricator.wikimedia.org/P9563 and previous config saved to /var/cache/conftool/dbconfig/20191108-090451-jynus.json
09:00 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1106 at 50%', diff saved to https://phabricator.wikimedia.org/P9562 and previous config saved to /var/cache/conftool/dbconfig/20191108-090012-jynus.json
08:57 elukey@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0)
08:52 jynus: stop and upgrade db1124 (may create temporary lag on wikireplicas)
01:09 twentyafterfour@deploy1001: Finished deploy [releng/phatality@11d4ad8]: deploying one more time, hopefully without killing elastic (duration: 03m 04s)
01:06 twentyafterfour@deploy1001: Started deploy [releng/phatality@11d4ad8]: deploying one more time, hopefully without killing elastic
2019-11-07
16:42 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: some alphasorted config (duration: 01m 00s)
16:34 XioNoX: Homer push on cr2-knams: Sampling (disabled), enhanced-hash-key, ospf interfaces re-ordering (noop), policy-statement BGP_from_LVS (unused), lo0 term allow_vmhost
16:32 jynus@cumin1001: dbctl commit (dc=all): 'Repool db1080 at 100%', diff saved to https://phabricator.wikimedia.org/P9553 and previous config saved to /var/cache/conftool/dbconfig/20191107-163235-jynus.json
16:20 XioNoX: add BGP sessions to AS64050 in eqiad
16:15 XioNoX: add BGP sessions to AS57695 in esams and eqiad
16:12 XioNoX: clear v4 BGP sessions to AS7713 in eqsin (hit max prefix limit)
13:50 jynus@cumin1001: dbctl commit (dc=all): 'Repool db1080 at 50%', diff saved to https://phabricator.wikimedia.org/P9551 and previous config saved to /var/cache/conftool/dbconfig/20191107-135018-jynus.json
11:54 jbond42: update puppet_version used by CI 545289
11:50 jbond42: rebuilding compiler1002
11:36 jynus@cumin1001: dbctl commit (dc=all): 'Repool db1080 at 10%', diff saved to https://phabricator.wikimedia.org/P9550 and previous config saved to /var/cache/conftool/dbconfig/20191107-113611-jynus.json
11:16 jynus: stop and upgrade db1080
10:58 moritzm: installing Java security updates on kafka-main/logstash
10:50 moritzm: installing Java security updates on wdqs/maps
10:46 jynus@cumin1001: dbctl commit (dc=all): 'Fully depool db1080', diff saved to https://phabricator.wikimedia.org/P9549 and previous config saved to /var/cache/conftool/dbconfig/20191107-104618-jynus.json
10:28 moritzm: upgrading mw1277-1279 servers to PHP 7.2.24 T237239
10:27 jynus@cumin1001: dbctl commit (dc=all): 'Reduce db1080 weight', diff saved to https://phabricator.wikimedia.org/P9548 and previous config saved to /var/cache/conftool/dbconfig/20191107-102747-jynus.json
2019-11-06
22:24 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Delay annotation jobs on commonswiki only (duration: 01m 01s)
22:17 mdholloway: created MachineVision extension tables on commonswiki
22:13 XioNoX: push standard forwarding-options to cr3/4-ulsfo
17:22 jynus@cumin1001: dbctl commit (dc=all): 'Reduce db1126 weight, too much backlog', diff saved to https://phabricator.wikimedia.org/P9542 and previous config saved to /var/cache/conftool/dbconfig/20191106-172235-jynus.json
17:21 ejegg: turned off donation queue consumer for financial_trxn record fix
17:11 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Enable WebAuthn extension if wmgUseWebAuthn is set (false in all of production) T227242 (duration: 01m 00s)
17:09 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set wmgUseWebAuthn false in all of production T227242 (duration: 01m 01s)
17:08 jynus@cumin1001: dbctl commit (dc=all): 'Repool db1074 fully', diff saved to https://phabricator.wikimedia.org/P9541 and previous config saved to /var/cache/conftool/dbconfig/20191106-170852-jynus.json
16:11 mdholloway: MachineVision: Imported Freebase to Wikidata ID mappings on testcommonswiki (T227349)
15:58 mdholloway: created MachineVision tables on testcommonswiki (T227349)
15:52 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Configure MachineVision and enable on testcommonswiki (T227349) (duration: 01m 00s)
15:47 mholloway-shell@deploy1001: Synchronized wmf-config/CommonSettings.php: MachineVision: Use an HTTP proxy in production (T236843) (duration: 01m 01s)
15:42 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: MachineVision: Do not restrict to testing users on Beta (duration: 01m 00s)
15:31 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: MachineVision: Fix Beta config with updated service name (duration: 01m 02s)
14:07 jynus@cumin1001: dbctl commit (dc=all): 'Repool db1074 at 50%', diff saved to https://phabricator.wikimedia.org/P9539 and previous config saved to /var/cache/conftool/dbconfig/20191106-140702-jynus.json
12:38 Urbanecm: EU SWAT done
12:38 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: a239b14: Allow certain users to create account at closed wikis (T222117; 2/2) (duration: 01m 00s)
12:36 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: a239b14: Allow certain users to create account at closed wikis (T222117; 1/2) (duration: 00m 59s)
12:34 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 3e9ede0: Add 104 (Cookbook) to $wgContentNamespaces for bnwikibooks (T236840) (duration: 01m 00s)
11:35 jynus@cumin1001: dbctl commit (dc=all): 'Repool db1074 at 10%', diff saved to https://phabricator.wikimedia.org/P9537 and previous config saved to /var/cache/conftool/dbconfig/20191106-113510-jynus.json
11:14 jynus: stopping db1074 for maintenance (will create temporary s2 lag on wikireplicas)
11:06 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1074', diff saved to https://phabricator.wikimedia.org/P9536 and previous config saved to /var/cache/conftool/dbconfig/20191106-110603-jynus.json
09:46 moritzm: upgrading mw1262-mw1265,mw1276 servers to PHP 7.2.24 T237239
23:01 twentyafterfour@deploy1001: Started scap: testwikis wikis to 1.35.0-wmf.5 refs T233853
22:51 twentyafterfour@deploy1001: scap failed: CalledProcessError Command 'cp -r "/tmp/scap_l10n_2905573311"/* "/srv/mediawiki-staging/php-1.35.0-wmf.5/cache/l10n"' returned non-zero exit status 1 (duration: 01m 26s)
22:50 twentyafterfour@deploy1001: Started scap: testwikis wikis to 1.35.0-wmf.5 refs T233853
22:39 twentyafterfour@deploy1001: scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_2076118383" --threads=30 --lang en --quiet' returned non-zero exit status 1 (duration: 01m 26s)
22:38 twentyafterfour@deploy1001: Started scap: testwikis wikis to 1.35.0-wmf.5 refs T233853
22:17 twentyafterfour: scap failed with error: A copy of your installation's LocalSettings.php must exist and be readable in the source directory. Use --conf to specify it. refs T233853
22:09 twentyafterfour@deploy1001: scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_840646293" --threads=30 --lang en --quiet' returned non-zero exit status 1 (duration: 04m 54s)
22:04 twentyafterfour@deploy1001: Started scap: testwikis wikis to 1.35.0-wmf.5 refs T233853
22:03 XioNoX: remove 127.0.0.1/32 and ::1/128 from cr2-esams:lo0.0
21:58 XioNoX: remove 127.0.0.1/32 and ::1/128 from cr3-esams:lo0.0
20:45 mutante: shutting down cobalt (formerly gerrit server)
20:44 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
12:11 urbanecm@deploy1001: Synchronized static/images/mobile/copyright/: SWAT: 7c1c64c: Add localized Minerva wordmark for Sindhi Wikipedia (T200870; 1/2) (duration: 00m 53s)
12:08 urbanecm@deploy1001: Synchronized static/images/project-logos/: SWAT: a6d64b1: Update logo for zh-classical Wikipedia (T236905) (duration: 00m 53s)
03:03 andrew@deploy1001: Started deploy [horizon/deploy@9972ed2]: deploying fix for puppet prefix creation
2019-11-02
00:58 mutante: gerrit-replica - created missing /var/lib/gerrit2/review_site/tmp and restarted service - service back up on buster (T176774)
00:34 mutante: gerrit-replica - fixing permissions of files in /srv/gerrit and restarting
00:27 mutante: gerrit2001 - copy mysql-connector-java.jar into /usr/share/java/ and link it into /var/lib/gerrit2/review_site/lib (T176774)
00:05 mutante: rsyncing gerrit plugin dir from gerrit1001 to gerrit2001 (T176774)
2019-11-01
23:45 mutante: rsyncing gerrit git data from gerrit1001 to gerrit2001 (using --delete too!) T176774
22:00 mutante: gerrit - repo sync between gerrit and gerrit-replica in progress .. if you can't clone from replica you can use main gerrit and replica will come back
21:20 jforrester@deploy1001: Synchronized php-1.35.0-wmf.4/extensions/UploadWizard/resources/mw.UploadWizardUploadInterface.js: T237126 Fixing DOM in upload interface of UploadWizard (duration: 00m 56s)
21:06 mutante: scp /usr/share/java/mysql-connector-java.jar from gerrit1001 to gerrit2001 (T176774)
20:46 cdanis: add to bot_blocked_nets the IPs of several EC2 instances sending expensive requests to ORES T237134
19:54 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
2019-10-31
23:11 urbanecm@deploy1001: Synchronized static/images/project-logos/: SWAT: 54ee973: Change bawiki logo to an anniversary one (T237035) (duration: 00m 53s)
22:04 twentyafterfour@deploy1001: Started deploy [design/style-guide@4d8d085]: testing deploy_design
21:59 mutante: deploy1001 - recreating deploy_design deployment key as ED25519 and with the correct comment (the comment matters and must match path to the file for keyholder) (T235677)
21:49 mutante: deploy1001 keyholder restart, keyholder arm ...
21:46 mutante: deploy1001 - move apach2modsec deployment key out of keyholder dir, keyholder arm to reload all other deployment keys including the new one for design (T235677)
21:25 robh: setting up ps1-b8-eqiad per T227543. it will reboot twice in the next 15 minutes, and then should start to clear up in icinga
21:18 ppchelko@deploy1001: Started deploy [restbase/deploy@9cac9ac]: Bump Parsoid-PHP traffic mirroring to 50% T235902
20:35 XioNoX: Homer push to all cr2-eqdfw - new NTP servers, remove border-in4 term unused-ips, add (unused) BGP_Wikimedia_pops, re-order ospf interfaces
20:27 shdubsh: restarting logstash on logstash1008 to test level->severity filter selector
20:12 XioNoX: Homer push to all msw* - new NTP servers - T237011
20:07 XioNoX: Homer push to all asw* - new NTP servers - T237011
19:49 XioNoX: Homer push to eqsin
19:49 mutante: rsyncing home dirs from previous gerrit server cobalt to gerrit1001
18:00 fdans@deploy1001: Started deploy [analytics/refinery@8ca04df]: deploying refinery
16:23 bd808: Our @wikimediatech Twitter account is soft blocked pending phone number verification. bd808 trying to figure out a good way to do that verification for a bot account.
16:14 jynus: restart dbprov2002 after upgrade T236924
16:09 jynus@cumin1001: dbctl commit (dc=all): 'Repool db1119, db1113 at 100%', diff saved to https://phabricator.wikimedia.org/P9513 and previous config saved to /var/cache/conftool/dbconfig/20191031-160925-jynus.json
14:56 Urbanecm: Password reset for SUL user `Darth AK`
14:50 jynus@cumin1001: dbctl commit (dc=all): 'Repool db1119 at 10%', diff saved to https://phabricator.wikimedia.org/P9512 and previous config saved to /var/cache/conftool/dbconfig/20191031-145010-jynus.json
14:28 jynus: reloading ferm on db1119
14:24 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1119', diff saved to https://phabricator.wikimedia.org/P9511 and previous config saved to /var/cache/conftool/dbconfig/20191031-142455-jynus.json
13:40 effie: upload xdebug 2.7.0-1+wmf2 to component/php72 - T234418
11:37 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1119, db1113 T227543', diff saved to https://phabricator.wikimedia.org/P9507 and previous config saved to /var/cache/conftool/dbconfig/20191031-113659-jynus.json
11:24 Urbanecm: EU SWAT done
11:23 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.4/extensions/ProofreadPage/: SWAT: e0d5ce9: Add page navigation tabs in correct order skin-side and remove js requirement for Vector tab icons (T231250); ed17da2: Makes sure that Vector default background does not override the navigation arrows (T236969) (duration: 01m 02s)
09:37 godog: temporarily stop logstash on logstash2005 to test performance with two ingesters only - T215904
09:23 godog: temporarily stop logstash on logstash2006 to test performance with two ingesters only - T215904
09:10 ema: depool cp5009 and reimage as text_ats T227432
08:25 ariel@deploy1001: Finished deploy [dumps/dumps@f2b6d78]: couple of fixup scripts, bug fix for incr dumps index.html generation (duration: 00m 03s)
08:25 ariel@deploy1001: Started deploy [dumps/dumps@f2b6d78]: couple of fixup scripts, bug fix for incr dumps index.html generation
06:37 elukey: upgrade cergen to 0.2.5 on puppetmaster1001
03:44 vgutierrez: switch from nginx to ats-tls on cp4032 - T231627
03:09 vgutierrez: switch from nginx to ats-tls on cp4031 - T231627
02:51 vgutierrez: switch from nginx to ats-tls on cp4030 - T231627
01:41 eileen: civicrm revision changed from 0547c84f73 to d2045c6b98, config revision is 1a709a61aa (looks like patch was still hung in gerrit last time)
01:34 eileen: civicrm revision is 0547c84f73, config revision is 1a709a61aa - that should stop those failmails
2019-10-30
11:49 moritzm: temporarily disabling puppet on LDAP servers for a schema change
11:42 ema: depool cp5008 and reimage as text_ats T227432
11:37 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=0)
11:31 mlitn@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Increase rate limits for newbie non-ip users on Commons (duration: 01m 01s)
11:13 Urbanecm: EU SWAT done
11:12 Urbanecm: Synchronized wmf-config/InitialiseSettings.php: SWAT: 61cb77c: Re-apply: MCR: Set testwiki to use the new MCR-only schema (T198558) (duration: 00m 59s)
10:07 jynus: restarting bacula-dir, bacula-sd on backup1001 T236406
09:46 vgutierrez: Switch from nginx to ats-tls on cp4029 - T231627
09:34 vgutierrez: Switch from nginx to ats-tls on cp4028 - T231627
09:25 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
19:30 andrew@deploy1001: Started deploy [horizon/deploy@bab5d37]: (no justification provided)
19:25 brennen@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.35.0-wmf.4
19:14 brennen@deploy1001: Finished scap: testwiki to php-1.35.0-wmf.4 and rebuild l10n cache (duration: 21m 11s)
18:54 jynus@cumin1001: dbctl commit (dc=all): 'Revert state to before overload+maintenance', diff saved to https://phabricator.wikimedia.org/P9501 and previous config saved to /var/cache/conftool/dbconfig/20191029-185438-jynus.json
18:53 brennen@deploy1001: Started scap: testwiki to php-1.35.0-wmf.4 and rebuild l10n cache
18:53 Trey314159: reindexing Slovak wikis on elastic@eqiad and elastic@codfw (T235654)
17:06 jynus@cumin1001: dbctl commit (dc=all): 'repool db1099 both instances fully to increase redundancy', diff saved to https://phabricator.wikimedia.org/P9499 and previous config saved to /var/cache/conftool/dbconfig/20191029-170648-jynus.json
16:56 jynus@cumin1001: dbctl commit (dc=all): 'depool fully db1105:3311, stability/lag issues', diff saved to https://phabricator.wikimedia.org/P9498 and previous config saved to /var/cache/conftool/dbconfig/20191029-165633-jynus.json
16:46 jynus@cumin1001: dbctl commit (dc=all): 'pool db1106 into s1 rcs', diff saved to https://phabricator.wikimedia.org/P9497 and previous config saved to /var/cache/conftool/dbconfig/20191029-164640-jynus.json
16:43 ssastry@deploy1001: Started deploy [parsoid/deploy@aa59ce3]: Update parsoid to 089bf28d
16:39 gehel@cumin2001: END (ERROR) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=97)
19:38 ssastry@deploy1001: Started deploy [parsoid/deploy@d932d6a]: Update parsoid to 089bf28d
18:53 moritzm: updating PHP on people1001
18:52 Urbanecm: Morning SWAT done
18:42 urbanecm@deploy1001: Synchronized wmf-config/logging.php: SWAT: 1a09e2a: Direct Parsoid/PHP logs to a parsoid-php log "type" (T235899) (duration: 00m 52s)
18:41 rlazarus: restarted memcached on mc1020 T235188
18:32 mutante: moscovium - rename all files in /etc/request-tracker4/RT_SiteConfig.d to have a .pm extension - this fixed RT - login works again - puppet patch coming up (T180641)
18:28 mutante: moscovium - deleting /etc/request-tracker4/RT_SiteConfig.d/ 50-debconf.pm and 51-dbconfig-common.pm which duplicate the same files without .pm extension with wrong values, probably due to some package change (T180641)
13:05 effie: enable puppet on mw2* servers, depool and repool to reload apache - T229792
13:01 jynus: stop db1114 for testing
12:30 ema: depool cp5007 and reimage as text_ats T227432
12:22 effie: depool mw2150
11:56 twentyafterfour@deploy1001: Finished deploy [phabricator/deployment@e4e2b22]: testing deployment of phabricator to phab1001 (duration: 00m 05s)
11:56 twentyafterfour@deploy1001: Started deploy [phabricator/deployment@e4e2b22]: testing deployment of phabricator to phab1001
11:34 Urbanecm: EU SWAT done
11:33 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.3/extensions/VisualEditor/includes/ApiVisualEditorEdit.php: SWAT: 8caf681: Dont log missing ETags when creating a new page, thats normal (T233320) (duration: 00m 54s)
11:33 effie: Disable puppet on mw* for 545652 - T229792
11:12 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: dd2f06c: Add Translate channel for the Translate extension (T221119) (duration: 00m 53s)
11:07 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: ff17666: Adjust wgUploadNavigationUrl for azwiki to point to commons UpWiz (T236307) (duration: 00m 53s)
08:16 mobrovac@deploy1001: Finished deploy [restbase/deploy@447981b]: Parsoid: Shim content-language and vary headers only for the PHP variant - T230791 (duration: 13m 42s)
08:15 godog: swift eqiad-prod: final weight to ms-be105[1-6] - T232367
08:02 mobrovac@deploy1001: Started deploy [restbase/deploy@447981b]: Parsoid: Shim content-language and vary headers only for the PHP variant - T230791
07:40 mobrovac@deploy1001: Finished deploy [restbase/deploy@c500d7a]: Add the Parsoid proxy for JS/PHP variants, add top mediarequests end point and add mnwwiki and ge.wm.org - T230791 T235744 T236389 (duration: 13m 44s)
07:40 elukey@deploy1001: Finished deploy [eventlogging/analytics@0f1ad6d]: Move codebase to Python3 - second attempt (duration: 00m 05s)
07:40 elukey@deploy1001: Started deploy [eventlogging/analytics@0f1ad6d]: Move codebase to Python3 - second attempt
07:37 elukey: upload archiva 2.2.4-1 to wikimedia-stretch (fix to avoid overriding archiva.xml upon install)
07:27 mobrovac@deploy1001: Started deploy [restbase/deploy@c500d7a]: Add the Parsoid proxy for JS/PHP variants, add top mediarequests end point and add mnwwiki and ge.wm.org - T230791 T235744 T236389
07:25 mobrovac@deploy1001: Finished deploy [restbase/deploy@c500d7a] (dev-cluster): Add the Parsoid proxy for JS/PHP variants, add top mediarequests end point and add mnwwiki and ge.wm.org (duration: 02m 37s)
07:22 mobrovac@deploy1001: Started deploy [restbase/deploy@c500d7a] (dev-cluster): Add the Parsoid proxy for JS/PHP variants, add top mediarequests end point and add mnwwiki and ge.wm.org
22:55 mutante: moscovium rm /dev/shm/envoy_shared_memory_0 to revive envoy which failed to run after changing ports and reinstalling it (T180641)
22:42 mutante: moscovium - manually deleting envoy listener on 1443 and letting puppet recreate config because it's not removed if you change the port (T180641)
21:55 mutante: running puppet on ulsfo cp-ats servers to pick up config change for RT backend
22:01 mutante: mw1270 - was alerting in Icinga as degraded systemd state - reason was "hhvm.service not-found". systemctl reset-failed cleared it. could cause monitoring spam on more servers (T229792)
20:58 bblack: cr3-esams switch high-traffic1 static fallback routes from lvs3001 to lvs3005
20:58 bblack: cr2-esams switch high-traffic1 static fallback routes from lvs3001 to lvs3005
20:40 bblack: esams lvs: high-traffic1 - change 3005's med to 0 (becomes new primary, permanently)
20:36 bblack: esams lvs: high-traffic1 - change 3003's med to 200, 3001's med to 50, 3005 remains 100 (traffic will blip to 3005 then back to 3001 again)
12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2092 after analyze table', diff saved to https://phabricator.wikimedia.org/P9468 and previous config saved to /var/cache/conftool/dbconfig/20191024-120812-marostegui.json
12:06 XioNoX: shutdown cr1-esams - cr2-knams link
12:00 XioNoX: shutdown transit BGP sessions on cr2-knams
11:40 Urbanecm: EU SWAT done
11:35 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 3a5cb68: Permission changes of move-rootuserpages assignment at commonswiki (T236359) (duration: 01m 00s)
11:33 ema@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
11:26 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: e079956: Add CAT as alias for NS_CATEGORY at commonswiki (T236352) (duration: 01m 00s)
11:15 mlitn@deploy1001: Synchronized php-1.35.0-wmf.3/extensions/WikibaseMediaInfo: Also use custom PrefetchingTermLookup in SingleEntitySourceServices (duration: 01m 01s)
10:18 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust s6 weights for db1093 and db1085', diff saved to https://phabricator.wikimedia.org/P9466 and previous config saved to /var/cache/conftool/dbconfig/20191024-101810-marostegui.json
08:36 godog: roll restart rsyslog in codfw/eqiad to pick up new kafka partitions
08:18 godog: roll restart rsyslog in ulsfo/esams/eqsin to pick up new kafka partitions
08:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2092 for analyze table', diff saved to https://phabricator.wikimedia.org/P9465 and previous config saved to /var/cache/conftool/dbconfig/20191024-081519-marostegui.json
07:57 XioNoX: reboot mr1-esams
07:42 godog: bump rsyslog- topics partitions to 6 and roll-restart logstash frontends
07:24 vgutierrez@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
05:20 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1097:3315 after compression', diff saved to https://phabricator.wikimedia.org/P9463 and previous config saved to /var/cache/conftool/dbconfig/20191024-052002-marostegui.json
05:18 marostegui: Run analyze enwiki.revision on db2092 T223151
04:59 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1097:3315 after compression', diff saved to https://phabricator.wikimedia.org/P9462 and previous config saved to /var/cache/conftool/dbconfig/20191024-045954-marostegui.json
04:59 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1089 from special slaves group and leave it with its original pooling options T223151', diff saved to https://phabricator.wikimedia.org/P9461 and previous config saved to /var/cache/conftool/dbconfig/20191024-045924-marostegui.json
04:55 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1097:3315 after compression', diff saved to https://phabricator.wikimedia.org/P9460 and previous config saved to /var/cache/conftool/dbconfig/20191024-045544-marostegui.json
04:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
12:31 vgutierrez: restarting ats-tls on cache text nodes - T233274
12:27 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1130 from the special slaves group on s5 and leave it back with its original pooling options T223151', diff saved to https://phabricator.wikimedia.org/P9454 and previous config saved to /var/cache/conftool/dbconfig/20191023-122708-marostegui.json
11:06 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: cf8e2f1: Set $wgArticleCountMethod to any for frwikiquote (T236212) (duration: 01m 12s)
08:42 godog: roll restart rsyslog on cirrus and wdqs hosts to pick up changes to logback topic partitions
08:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2091:3312 after table compression', diff saved to https://phabricator.wikimedia.org/P9452 and previous config saved to /var/cache/conftool/dbconfig/20191023-082826-marostegui.json
08:23 godog: roll restart logstash in codfw/eqiad to pick up new kafka partitions
08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights to x100 on s8 eqiad - T231018', diff saved to https://phabricator.wikimedia.org/P9451 and previous config saved to /var/cache/conftool/dbconfig/20191023-082246-marostegui.json
08:11 godog: kafka-logging eqiad set 12 partitions for ^mwlog- ^logback- and eqiad.client.error topics
08:09 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights to x100 on s8 codfw - T231018', diff saved to https://phabricator.wikimedia.org/P9450 and previous config saved to /var/cache/conftool/dbconfig/20191023-080857-marostegui.json
07:51 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights to x100 on s7 eqiad - T231018', diff saved to https://phabricator.wikimedia.org/P9449 and previous config saved to /var/cache/conftool/dbconfig/20191023-075106-marostegui.json
07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights to x100 on s7 codfw - T231018', diff saved to https://phabricator.wikimedia.org/P9448 and previous config saved to /var/cache/conftool/dbconfig/20191023-074828-marostegui.json
07:46 XioNoX: powering down cr2-esams for relocation (for real this time)
07:38 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights to x100 on s6 eqiad - T231018', diff saved to https://phabricator.wikimedia.org/P9447 and previous config saved to /var/cache/conftool/dbconfig/20191023-073831-marostegui.json
07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights to x100 on s6 codfw - T231018', diff saved to https://phabricator.wikimedia.org/P9446 and previous config saved to /var/cache/conftool/dbconfig/20191023-073556-marostegui.json
07:30 XioNoX: powering down cr2-esams for relocation
06:38 marostegui: Compress tables on db1097:3315 T235599
06:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3315 for compression T235599', diff saved to https://phabricator.wikimedia.org/P9445 and previous config saved to /var/cache/conftool/dbconfig/20191023-063800-marostegui.json
05:29 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1096:3315 after maintenance', diff saved to https://phabricator.wikimedia.org/P9444 and previous config saved to /var/cache/conftool/dbconfig/20191023-052940-marostegui.json
05:08 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1096:3315 after maintenance', diff saved to https://phabricator.wikimedia.org/P9443 and previous config saved to /var/cache/conftool/dbconfig/20191023-050812-marostegui.json
04:57 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1096:3315 after maintenance', diff saved to https://phabricator.wikimedia.org/P9442 and previous config saved to /var/cache/conftool/dbconfig/20191023-045722-marostegui.json
17:20 bblack: geodns: re-pooling esams (at this point, we're entirely back in our "normal" state of affairs)
17:19 arlolra@deploy1001: Started deploy [parsoid/deploy@4c64c9c]: Updating Parsoid to cf01d91
16:51 bblack: geodns: moving all "normal" eqiad traffic back to eqiad (in addition to the esams-diverted traffic which is still pointed mostly at eqiad right now)
16:21 mutante: running puppet on deployment servers
16:20 thcipriani: restarting gerrit
16:14 thcipriani: stopping gerrit to run a fix for T222391
15:58 bblack: depooling esams temporarily to test traffic scenario on lvs1014
15:47 bblack: enable pybal+puppet on rebooted lvs1014
15:40 bblack: rebooting lvs1014
15:28 liw@deploy1001: Finished scap: testwiki to php-1.35.0-wmf.3 and rebuild l10n cache (duration: 37m 39s)
15:26 XioNoX: repool esams
15:20 XioNoX: rollback ns2 redirect
15:13 bblack: re-disabling lvs1014 ...
15:10 bblack: re-enabling lvs1014 pybal/puppet
15:03 moritzm: rebooting kafka-main1005 for microcode debugging
15:01 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
13:06 XioNoX: depool esams for onsite work - T235805
13:05 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1096:3316 db1105:3311 db1105:3312 after PDU and on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9434 and previous config saved to /var/cache/conftool/dbconfig/20191022-130556-marostegui.json
12:54 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1096:3316 db1105:3311 instance db1105:3312 after PDU and on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9433 and previous config saved to /var/cache/conftool/dbconfig/20191022-125435-marostegui.json
12:46 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1096:3316 db1105:3311 instance db1105:3312 after PDU and on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9432 and previous config saved to /var/cache/conftool/dbconfig/20191022-124607-marostegui.json
12:37 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1096:3316 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9431 and previous config saved to /var/cache/conftool/dbconfig/20191022-123757-marostegui.json
12:32 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1105:3312 and db1105:3311 after on-site maintenance T235877', diff saved to https://phabricator.wikimedia.org/P9430 and previous config saved to /var/cache/conftool/dbconfig/20191022-123257-marostegui.json
12:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2089:3315', diff saved to https://phabricator.wikimedia.org/P9429 and previous config saved to /var/cache/conftool/dbconfig/20191022-123032-marostegui.json
12:29 moritzm: rebooting miscweb2001 for some microcode tests
12:28 marostegui: Compress db1096:3315
12:27 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
11:10 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 0593f34: Change the language of Votewiki to Persian (fa) temporarily for the annual ArbCom elections (T230614) (duration: 00m 54s)
10:55 moritzm: rebooting rpki2001 for some microcode tests
10:54 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
09:54 ema: lvs2006: restart pybal to add new service kibana-ssl T210411
09:54 ema: lvs1016: restart pybal to add new service kibana-ssl T210411
09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights to x100 on s4 eqiad - T231018', diff saved to https://phabricator.wikimedia.org/P9425 and previous config saved to /var/cache/conftool/dbconfig/20191022-091327-marostegui.json
09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights to x100 on s4 codfw - T231018', diff saved to https://phabricator.wikimedia.org/P9424 and previous config saved to /var/cache/conftool/dbconfig/20191022-091051-marostegui.json
08:05 marostegui: Stop MySQL on labsdb1012 for PDU work T227142
07:53 marostegui: Stop MySQL on db1116 pc1007 db1096:3315, db1096:3316 for PDU maintenance T227142
06:32 vgutierrez: rolling restart of ats-tls - T233274 T234803
05:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096 for PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9423 and previous config saved to /var/cache/conftool/dbconfig/20191022-055151-marostegui.json
05:48 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1070 from config T235464', diff saved to https://phabricator.wikimedia.org/P9422 and previous config saved to /var/cache/conftool/dbconfig/20191022-054759-marostegui.json
05:08 vgutierrez: Switch from nginx to ats-tls on cp1088 - T231433
05:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2089:3315 for compression T235599', diff saved to https://phabricator.wikimedia.org/P9421 and previous config saved to /var/cache/conftool/dbconfig/20191022-050204-marostegui.json
05:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2084:3314 after compression', diff saved to https://phabricator.wikimedia.org/P9420 and previous config saved to /var/cache/conftool/dbconfig/20191022-050048-marostegui.json
04:58 vgutierrez: Switch from nginx to ats-tls on cp2026 - T231433
04:30 vgutierrez: Switch from nginx to ats-tls on cp2024 - T231433
04:18 vgutierrez: Switch from nginx to ats-tls on cp3049 - T231433
03:44 vgutierrez: Switch from nginx to ats-tls on cp3047 - T231433
17:08 jforrester@deploy1001: Synchronized php-1.35.0-wmf.1/extensions/VisualEditor/: Update VisualEditor for set of back-ports in wmf.1 T233320, T234564, T235959 (duration: 00m 56s)
17:01 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@75c0577]: GUI Updates
14:16 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.2 refs T233850
13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2084:3314 and db2091:3312 for table compression', diff saved to https://phabricator.wikimedia.org/P9412 and previous config saved to /var/cache/conftool/dbconfig/20191021-132633-marostegui.json
13:24 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights 1/2 to 100/200 on s2 codfw - T231018', diff saved to https://phabricator.wikimedia.org/P9411 and previous config saved to /var/cache/conftool/dbconfig/20191021-132440-marostegui.json
13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights 1/2 to 100/200 on s2 eqiad - T231018', diff saved to https://phabricator.wikimedia.org/P9410 and previous config saved to /var/cache/conftool/dbconfig/20191021-132145-marostegui.json
13:07 ema: lvs1015: restart pybal to add new service wdqs-ssl T210411
13:04 ema: lvs2003: restart pybal to add new service wdqs-ssl T210411
13:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1090:3312 after schema change and remove db1129 from vslow and dump as it was there temporarily', diff saved to https://phabricator.wikimedia.org/P9409 and previous config saved to /var/cache/conftool/dbconfig/20191021-130355-marostegui.json
13:02 ema: lvs1016: restart pybal to add new service wdqs-ssl T210411
07:32 ema: depool cp4029 and reimage as text_ats T227432
07:15 vgutierrez: Switch from nginx to ats-tls on cp1075 - T231627
07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Pool non partitioned db1089 into s1 special slaves to check for slow queries T223151', diff saved to https://phabricator.wikimedia.org/P9406 and previous config saved to /var/cache/conftool/dbconfig/20191021-070655-marostegui.json
07:03 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights from 1 to 100 on s1 eqiad - T231018', diff saved to https://phabricator.wikimedia.org/P9405 and previous config saved to /var/cache/conftool/dbconfig/20191021-070352-marostegui.json
07:01 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights from 1 to 100 on s1 codfw - T231018', diff saved to https://phabricator.wikimedia.org/P9404 and previous config saved to /var/cache/conftool/dbconfig/20191021-070119-marostegui.json
06:59 vgutierrez: Switch from nginx to ats-tls on cp2001 - T231627
06:46 vgutierrez: Switch from nginx to ats-tls on cp3030 - T231627
06:28 vgutierrez: Install python3-cryptography-2.6.1-3+deb10u2 on acme-chief hosts - T234131
06:15 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P9403 and previous config saved to /var/cache/conftool/dbconfig/20191021-061518-marostegui.json
06:12 vgutierrez: Switch cp1086 from nginx to ats-tls - T231433
06:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
05:58 marostegui@cumin1001: dbctl commit (dc=all): 'Give weight 100 to db1130 on s5 to check for slow queries T223151', diff saved to https://phabricator.wikimedia.org/P9402 and previous config saved to /var/cache/conftool/dbconfig/20191021-055843-marostegui.json
05:54 vgutierrez: Switch cp2017 from nginx to ats-tls - T231433
05:50 marostegui@cumin1001: dbctl commit (dc=all): 'Give more traffic to db1105:3311', diff saved to https://phabricator.wikimedia.org/P9401 and previous config saved to /var/cache/conftool/dbconfig/20191021-055017-marostegui.json
05:43 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db2048 and db2061, those hosts will be decommissioned T228258', diff saved to https://phabricator.wikimedia.org/P9400 and previous config saved to /var/cache/conftool/dbconfig/20191021-054340-marostegui.json
05:42 _joe_: slowly removing service objects from production etcd T233973
05:38 vgutierrez: Switch cp3044 from nginx to ats-tls - T231433
05:37 marostegui@cumin1001: dbctl commit (dc=all): 'Give more traffic to db1105:3311', diff saved to https://phabricator.wikimedia.org/P9399 and previous config saved to /var/cache/conftool/dbconfig/20191021-053737-marostegui.json
05:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
05:28 marostegui: Compress tables on db2084:3314 db2091:3312 - T235599
05:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3312', diff saved to https://phabricator.wikimedia.org/P9398 and previous config saved to /var/cache/conftool/dbconfig/20191021-052643-marostegui.json
05:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2088:3312 db2084:3315 - T235599', diff saved to https://phabricator.wikimedia.org/P9397 and previous config saved to /var/cache/conftool/dbconfig/20191021-052527-marostegui.json
05:20 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P9396 and previous config saved to /var/cache/conftool/dbconfig/20191021-052035-marostegui.json
05:19 vgutierrez: Switch cp4026 from nginx to ats-tls - T231433
05:14 marostegui: Deploy schema change on db1090:3312 T234066 T233135
05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1090:3312 for schema change and pool db1129 temporarily in vslow, dump', diff saved to https://phabricator.wikimedia.org/P9395 and previous config saved to /var/cache/conftool/dbconfig/20191021-051356-marostegui.json
18:27 mutante: temp. disabled puppet on all wtp* servers, adding mediawiki appserver roles on them incrementally by re-enabling puppet, starting with wtp1026, scheduled icinga downtime for wtp* all services (T233654)
18:19 mutante: temp. disabling puppet on all wtp* servers
15:40 Urbanecm: Reassign edits from DannyS712 (T235446) to DannyS712 at banwiki (T235446)
15:38 Urbanecm: Run extensions/CentralAuth/maintenance/createLocalAccount.php --wiki=banwiki DannyS712 (T235446)
15:38 Urbanecm: Rename DannyS712@banwiki to DannyS712 (T235446) locally (T235446)
15:07 Urbanecm: Reattach DannyS712@banwiki to DannyS712@SUL (T235446)
14:19 _joe_: uploading cassandra 3.11.4 to stretch-wikimedia
14:10 marostegui: Run compare.py on db1105 - T235877
13:48 jynus: disabled notifications on db1105
13:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3311 and db1105:3312 host rebooted itself', diff saved to https://phabricator.wikimedia.org/P9392 and previous config saved to /var/cache/conftool/dbconfig/20191018-134517-marostegui.json
13:29 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db2059 from config, host decommissioned', diff saved to https://phabricator.wikimedia.org/P9391 and previous config saved to /var/cache/conftool/dbconfig/20191018-132934-marostegui.json
13:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2084:3315 for tables compression T235599', diff saved to https://phabricator.wikimedia.org/P9390 and previous config saved to /var/cache/conftool/dbconfig/20191018-130253-marostegui.json
12:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1076 after schema change', diff saved to https://phabricator.wikimedia.org/P9389 and previous config saved to /var/cache/conftool/dbconfig/20191018-123930-marostegui.json
12:20 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
08:03 ema: depool cp4028 and reimage as text_ats T227432
07:58 marostegui: Deploy schema change on db1076
07:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1076 for schema change', diff saved to https://phabricator.wikimedia.org/P9388 and previous config saved to /var/cache/conftool/dbconfig/20191018-075709-marostegui.json
07:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1129 after schema change', diff saved to https://phabricator.wikimedia.org/P9387 and previous config saved to /var/cache/conftool/dbconfig/20191018-075529-marostegui.json
07:21 moritzm: installing unbound security updates on buster
07:20 moritzm: installing libdatetime-timezone-perl updates (time zone updates)
05:53 vgutierrez: switch cp1084 from nginx to ats-tls - T231433
05:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
05:15 marostegui: Deploy schema change on db1129 T233135 T234066
05:15 marostegui: Compress tables on db2091:3314 T235599
05:14 vgutierrez: switch cp3039 from nginx to ats-tls - T231433
05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1129 for schema change', diff saved to https://phabricator.wikimedia.org/P9386 and previous config saved to /var/cache/conftool/dbconfig/20191018-051355-marostegui.json
05:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1099:3311 and db2086:3318 after table compression', diff saved to https://phabricator.wikimedia.org/P9385 and previous config saved to /var/cache/conftool/dbconfig/20191018-050831-marostegui.json
04:57 vgutierrez: switch cp4025 from nginx to ats-tls - T231433
04:34 vgutierrez: switch cp5005 from nginx to ats-tls - T231433
04:31 vgutierrez: restarting nagios-nrpe-server on stat1007
18:01 _joe_: depooled wtp1025 from parsoid, parsoid-php to allow running benchmarks there
18:01 elukey: update librdkafka on eventlog1002 and restart eventlogging
15:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1090:3317 and remove db1136 from its temporary vslow,dump role', diff saved to https://phabricator.wikimedia.org/P9382 and previous config saved to /var/cache/conftool/dbconfig/20191017-151952-marostegui.json
15:07 dcausse: unbanning elastic1050:psi
15:01 dcausse: dumping jvm heap on elastic1050:psi to investigate gc issues
14:46 moritzm: installing 4.9.189 Linux update on jessie hosts (no reboots, deploying the package only at this point)
14:37 dcausse: banning elastic1050:psi to investigate gc issues
14:32 moritzm: uploaded linux-meta 1.22 for jessie-wikimedia
14:32 bblack: disable puppet on cache fleet (cp*) ahead of cert deployment refactoring - T234803
13:41 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1129 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9381 and previous config saved to /var/cache/conftool/dbconfig/20191017-134112-marostegui.json
13:30 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1129 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9380 and previous config saved to /var/cache/conftool/dbconfig/20191017-133047-marostegui.json
13:06 XioNoX: rollback failover vrrp from cr2-eqiad to cr1-eqiad - T227133
12:56 XioNoX: restart mr1-eqiad
12:54 XioNoX: downtiming all mgmt host for 30min (mr1-eqiad needs to be rebooted)
12:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2088:3312 for compression T235599', diff saved to https://phabricator.wikimedia.org/P9379 and previous config saved to /var/cache/conftool/dbconfig/20191017-125248-marostegui.json
12:51 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1129 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9378 and previous config saved to /var/cache/conftool/dbconfig/20191017-125154-marostegui.json
12:50 marostegui: Compress tables on db2088:3312 - T235599
12:45 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1129 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9377 and previous config saved to /var/cache/conftool/dbconfig/20191017-124503-marostegui.json
12:13 marostegui@cumin1001: dbctl commit (dc=all): 'Restore db1090:3312 original weight', diff saved to https://phabricator.wikimedia.org/P9376 and previous config saved to /var/cache/conftool/dbconfig/20191017-121330-marostegui.json
12:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3312 after schema change', diff saved to https://phabricator.wikimedia.org/P9375 and previous config saved to /var/cache/conftool/dbconfig/20191017-121106-marostegui.json
11:16 vgutierrez: upgrading ATS on esams nodes to 8.0.5-1wm9 - T234011
11:11 Urbanecm: EU SWAT done
11:11 XioNoX: failover vrrp from cr2-eqiad to cr1-eqiad - T227133
11:11 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 36d4612: Allow sysops to add transwiki on nnwiki, and add import sources (T231761) (duration: 00m 59s)
11:09 vgutierrez: upgrading ATS on ulsfo nodes to 8.0.5-1wm9 - T234011
11:08 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.2/extensions/WikibaseMediaInfo: SWAT: 5a67011: Keep track of assigned nodes in both old & new DOM (T235236) (duration: 01m 03s)
10:58 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
10:17 elukey@deploy1001: Started deploy [eventlogging/analytics@0f0a1aa]: Move codebase to Python3
09:57 godog: swift codfw-prod: more weight to ms-be205[1-6] - T233638
09:39 godog: swift eqiad-prod: add weight to ms-be105[1-6] - T232367
09:38 marostegui: Stop MySQL on db1129 for PDU work
09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1129 for PDU work, give some traffic to db1090:3312 meanwhile T227133', diff saved to https://phabricator.wikimedia.org/P9374 and previous config saved to /var/cache/conftool/dbconfig/20191017-093753-marostegui.json
09:27 elukey: upload archiva 2.2.4-1 to stretch-wikimedia - T222595
09:26 marostegui: Stop MySQL on db1117 this will generate some haproxy alerts - T227133
08:28 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
08:05 vgutierrez: upgrading ATS on eqsin nodes to 8.0.5-1wm9 - T234011
08:03 marostegui: Deploy schema change on db1090:3317
08:01 marostegui@cumin1001: dbctl commit (dc=all): 'Fix db1136 weight', diff saved to https://phabricator.wikimedia.org/P9373 and previous config saved to /var/cache/conftool/dbconfig/20191017-080157-marostegui.json
08:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1090:3317 pool db1136 temporarily into vslow,dump', diff saved to https://phabricator.wikimedia.org/P9372 and previous config saved to /var/cache/conftool/dbconfig/20191017-080026-marostegui.json
07:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1136', diff saved to https://phabricator.wikimedia.org/P9371 and previous config saved to /var/cache/conftool/dbconfig/20191017-074658-marostegui.json
07:13 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1130 (non partitioned host) into s5 special group with low weight - T223151', diff saved to https://phabricator.wikimedia.org/P9370 and previous config saved to /var/cache/conftool/dbconfig/20191017-071308-marostegui.json
06:06 elukey: upgrade archiva on archiva1001 to 2.2.4 - T222595
06:02 marostegui@cumin1001: dbctl commit (dc=all): 'Change special weights from x to x100 on s5 - T231018', diff saved to https://phabricator.wikimedia.org/P9369 and previous config saved to /var/cache/conftool/dbconfig/20191017-060251-marostegui.json
05:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
05:30 marostegui: Deploy schema change on labtestwiki and labswiki
05:12 marostegui: Deploy schema change on db1095:3312
05:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3312 and db1136 for schema change', diff saved to https://phabricator.wikimedia.org/P9368 and previous config saved to /var/cache/conftool/dbconfig/20191017-051055-marostegui.json
05:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1103:3312 and db1094', diff saved to https://phabricator.wikimedia.org/P9367 and previous config saved to /var/cache/conftool/dbconfig/20191017-050614-marostegui.json
05:01 vgutierrez: upgrading ATS to 8.0.5-1wm9 on cp5001 - T234011
05:00 vgutierrez: uploaded trafficserver 8.0.5-1wm9 to apt.wikimedia.org (stretch) - T234011
02:04 bblack: repooling eqsin
00:55 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
2019-10-16
13:46 XioNoX: rollback failover VRRP from cr1-eqiad to cr2-eqiad - T226782
13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3312 and db1094 for schema change', diff saved to https://phabricator.wikimedia.org/P9364 and previous config saved to /var/cache/conftool/dbconfig/20191016-132620-marostegui.json
13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1074 after schema change', diff saved to https://phabricator.wikimedia.org/P9363 and previous config saved to /var/cache/conftool/dbconfig/20191016-131010-marostegui.json
13:10 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
12:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1079 after schema change', diff saved to https://phabricator.wikimedia.org/P9362 and previous config saved to /var/cache/conftool/dbconfig/20191016-125102-marostegui.json
12:38 effie: remove tex* and math related packages from appserver canaries - T195847
05:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3311 for schema change', diff saved to https://phabricator.wikimedia.org/P9360 and previous config saved to /var/cache/conftool/dbconfig/20191016-052104-marostegui.json
05:18 marostegui: Deploy schema change on s2 sanitarium master (db1074) this will create lag on s2 labsdb T233135 T234066
05:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1074 for schema change', diff saved to https://phabricator.wikimedia.org/P9359 and previous config saved to /var/cache/conftool/dbconfig/20191016-051812-marostegui.json
05:14 marostegui: Change s7 triggers for archive table from db1125:3317 T234704
05:11 marostegui: Change s2 triggers for archive table from db1125:3312 T234704
05:08 marostegui: Deploy schema change on s7 sanitarium master (db1079) this will create lag on s7 labsdb T233135 T234066
05:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1079 for schema change', diff saved to https://phabricator.wikimedia.org/P9358 and previous config saved to /var/cache/conftool/dbconfig/20191016-050627-marostegui.json
03:49 mobrovac@deploy1001: Finished deploy [restbase/deploy@320f3a5]: Parsoid: Use the ETag for retrieving stashed content - T235465 (duration: 13m 37s)
03:35 mobrovac@deploy1001: Started deploy [restbase/deploy@320f3a5]: Parsoid: Use the ETag for retrieving stashed content - T235465
2019-10-15
21:26 jforrester@deploy1001: Synchronized multiversion/MWConfigCacheGenerator.php: Provide getCachableMWConfig() which doesn't rely on wgConf (duration: 01m 00s)
21:24 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@cdfa545]: Media: Fix TypeError when processing pages with only Mathoid images (T235408) (duration: 05m 35s)
21:18 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@cdfa545]: Media: Fix TypeError when processing pages with only Mathoid images (T235408)
21:16 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: InitialiseSettings: Stop writing wmgScoreFileBackend and wmgScorePath, never read (duration: 00m 59s)
21:15 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: CommonSettings: Stop using wmg variables for Score extension (duration: 01m 01s)
21:13 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Write wgScoreFileBackend and wgScorePath directly, not via CommonSettings (duration: 01m 00s)
20:10 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.35.0-wmf.2 refs T233850
19:55 urandom: upgrade restbase2011-{a,b,c} to cassandra 3.11.4 -- T200803
19:52 urandom: upgrade restbase1016-c to cassandra 3.11.4 -- T200803
14:43 elukey: start a root tmux containing a bash script on conf1004 to clean up znodes under /yarn-rmstore/analytics-hadoop/ZKRMStateRoot/RMAppRoot slowly - T217057
14:40 papaul: power down puppetmaster2002 for HW maintenance
14:38 moritzm: installing usbutils update from stretch point release
14:34 elukey: executed 'rmr' in zookeeper on conf1004 for znodes /yarn-leader-election /hadoop-ha /hive_zookeeper_namespace
13:53 moritzm: installing 4.9.189 Linux update from last stretch point releases (no reboots, deploying the package only at this point)
13:03 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1126 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9353 and previous config saved to /var/cache/conftool/dbconfig/20191015-130356-marostegui.json
12:49 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1126 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9352 and previous config saved to /var/cache/conftool/dbconfig/20191015-124942-marostegui.json
12:46 elukey: Hadoop maintenance over
12:33 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1126 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9351 and previous config saved to /var/cache/conftool/dbconfig/20191015-123356-marostegui.json
12:24 mobrovac: restbase add parsoidphp tables in prod - T230792
12:18 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1126 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9350 and previous config saved to /var/cache/conftool/dbconfig/20191015-121840-marostegui.json
12:17 marostegui: Repool labsdb1009 after PDU maintenance
12:17 elukey: Hadoop maintenance start - migration to the new Zookeeper cluster
12:16 moritzm: installing sudo security updates on buster/stretch
12:13 arturo: add copy of python-pykube and python3-pykube from stretch-wikimedia to buster-wikimedia (T230961)
12:06 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
12:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1097:3314', diff saved to https://phabricator.wikimedia.org/P9348 and previous config saved to /var/cache/conftool/dbconfig/20191015-120359-marostegui.json
12:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1086 for schema change', diff saved to https://phabricator.wikimedia.org/P9347 and previous config saved to /var/cache/conftool/dbconfig/20191015-120133-marostegui.json
11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1098:3317', diff saved to https://phabricator.wikimedia.org/P9346 and previous config saved to /var/cache/conftool/dbconfig/20191015-115922-marostegui.json
11:12 Urbanecm: EU SWAT done
11:12 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: ac37540: Add `autopatrol` to translation administrators on mediawiki (duration: 00m 51s)
11:12 jbond42: move puppetmaster_ca_server back to puppetmaster1001
10:54 moritzm: mark ruby-safe-yaml as manually installed using apt-mark on jessie/stretch, prevents accidental removal of ruby-safe-yaml after puppet 4->5 migration
10:07 moritzm: installing openssl updates for buster (some ciphers we don't use were not enabled due to an upstream change related to the selection of ASM-optimised implementations over generic C)
08:07 marostegui: Stop MySQL on db1126 and labsdb1009 for PDU maintenance - T226782
08:06 elukey: upload new version of memkeys (adding a patch to be merged upstream to avoid segfaults on stretch/buster) to stretch|buster wikimedia apt repos - T223863
07:52 Urbanecm: Set email for `Martin Urbanec (test 10)` to test@wikimedia.cz (debug, no ticket)
07:48 Urbanecm: Password reset for Xaris333 #2 (T235441)
07:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
07:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126 for PDU maintenance T226782', diff saved to https://phabricator.wikimedia.org/P9345 and previous config saved to /var/cache/conftool/dbconfig/20191015-071338-marostegui.json
07:10 XioNoX: failover VRRP from cr1-eqiad to cr2-eqiad in anticipation of the PDU work - T226782
06:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2086:3318 T232446', diff saved to https://phabricator.wikimedia.org/P9344 and previous config saved to /var/cache/conftool/dbconfig/20191015-064419-marostegui.json
06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1070 T235464', diff saved to https://phabricator.wikimedia.org/P9343 and previous config saved to /var/cache/conftool/dbconfig/20191015-064005-marostegui.json
05:38 marostegui: Depool labsdb1009 for PDU maintenance T226782
05:28 marostegui: Deploy schema change on db1098:3317 T234066 T233135
05:28 marostegui: Deploy schema change on db1097:3314 T233625
05:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3314', diff saved to https://phabricator.wikimedia.org/P9342 and previous config saved to /var/cache/conftool/dbconfig/20191015-052621-marostegui.json
05:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3317', diff saved to https://phabricator.wikimedia.org/P9341 and previous config saved to /var/cache/conftool/dbconfig/20191015-052220-marostegui.json
05:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2085:3318', diff saved to https://phabricator.wikimedia.org/P9340 and previous config saved to /var/cache/conftool/dbconfig/20191015-051924-marostegui.json
05:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1103:3314', diff saved to https://phabricator.wikimedia.org/P9339 and previous config saved to /var/cache/conftool/dbconfig/20191015-051400-marostegui.json
05:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P9338 and previous config saved to /var/cache/conftool/dbconfig/20191015-051236-marostegui.json
05:00 marostegui@cumin2001: dbctl commit (dc=all): 'Promote db1100 to s5 master and remove read-only from s5 T234300', diff saved to https://phabricator.wikimedia.org/P9337 and previous config saved to /var/cache/conftool/dbconfig/20191015-050042-marostegui.json
05:00 marostegui@cumin2001: dbctl commit (dc=all): 'Set s5 as read-only for maintenance T234300', diff saved to https://phabricator.wikimedia.org/P9336 and previous config saved to /var/cache/conftool/dbconfig/20191015-050016-marostegui.json
05:00 marostegui: Starting s5 failover from db1070 to db1100 - T234300
04:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1101:3317 after schema change', diff saved to https://phabricator.wikimedia.org/P9335 and previous config saved to /var/cache/conftool/dbconfig/20191015-043403-marostegui.json
2019-10-14
23:27 Krinkle: Delete 2019-09-01–2019-09-10 arclamp trace logs from webperf1002, and decompress the rest of 2019-09 (this will trigger svg re-generation), T235425
10:08 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1100 with weight 0 in preparation for tomorrow's failover T234300', diff saved to https://phabricator.wikimedia.org/P9326 and previous config saved to /var/cache/conftool/dbconfig/20191014-100758-marostegui.json
09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1130 into s5 api, db1100 will be removed later in preparation for tomorrow's failover T234300', diff saved to https://phabricator.wikimedia.org/P9325 and previous config saved to /var/cache/conftool/dbconfig/20191014-094809-marostegui.json
09:34 hashar: Upgraded CI jobs to Quibble 0.0.38
09:14 marostegui: Deploy schema change on dbstore1003:3317
08:56 @: helmfile [STAGING] Ran 'apply' command on namespace 'restrouter' for release 'staging' .
08:55 @: helmfile [CODFW] Ran 'apply' command on namespace 'restrouter' for release 'production' .
08:52 @: helmfile [EQIAD] Ran 'apply' command on namespace 'restrouter' for release 'production' .
08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1074 and db2126 after changing sanitarium to replicate from db1074 T231638', diff saved to https://phabricator.wikimedia.org/P9322 and previous config saved to /var/cache/conftool/dbconfig/20191014-085143-marostegui.json
08:46 mobrovac: restbase drop metadata keyspaces from cassandra - T235173
07:54 marostegui: Stop db1074 and db2126 in sync to change sanitarium's master for s2 - T231638
07:49 mobrovac@deploy1001: Finished deploy [restbase/deploy@4d469a1] (dev-cluster): Remove VE logging and stop using storage for /page/metadata (duration: 03m 58s)
07:45 mobrovac@deploy1001: Started deploy [restbase/deploy@4d469a1] (dev-cluster): Remove VE logging and stop using storage for /page/metadata
07:41 mobrovac@deploy1001: Finished deploy [restbase/deploy@e0d071f]: Remove VE logging and stop using storage for /page/metadata - T234928 T235173 (duration: 13m 37s)
07:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1074 and db2126 to change sanitarium to replicate from db1074 T231638', diff saved to https://phabricator.wikimedia.org/P9320 and previous config saved to /var/cache/conftool/dbconfig/20191014-073319-marostegui.json
07:28 mobrovac@deploy1001: Started deploy [restbase/deploy@e0d071f]: Remove VE logging and stop using storage for /page/metadata - T234928 T235173
07:28 mobrovac@deploy1001: Finished deploy [changeprop/deploy@c25a1c2]: Do not pre-generate /page/metadata - T235173 (duration: 01m 25s)
07:26 mobrovac@deploy1001: Started deploy [changeprop/deploy@c25a1c2]: Do not pre-generate /page/metadata - T235173
07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db2068 from config - T235399', diff saved to https://phabricator.wikimedia.org/P9319 and previous config saved to /var/cache/conftool/dbconfig/20191014-072100-marostegui.json
07:16 marostegui: Stop MySQL on labsdb1009 for on-site maintenance - T233273
07:01 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
2019-10-10
22:17 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T78711 Update cron-updated miser pages to say they are run periodically, not never (duration: 00m 51s)
22:10 jforrester@deploy1001: Synchronized wmf-config/wikitech.php: Remove debug line dating from 2015-12-08! (duration: 00m 51s)
22:04 jforrester@deploy1001: Synchronized wmf-config/mc.php: Drop nutcracker indirection for HHVM servers, just point to localhost (duration: 00m 51s)
21:58 jforrester@deploy1001: Synchronized wmf-config/PhpAutoPrepend.php: Drop special-case for PHP7, now always used (duration: 00m 51s)
21:55 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Drop HHVM special-case for SVG converter, no longer used (duration: 00m 51s)
21:49 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Don't check to shard static config cache for HHVM any more (duration: 00m 50s)
21:48 jforrester@deploy1001: Synchronized wmf-config/Wikibase.php: Don't check to shard wmgWBSharedCacheKey for HHVM any more (duration: 00m 51s)
21:39 jforrester@deploy1001: Synchronized php-1.35.0-wmf.1/extensions/VisualEditor/lib/ve/src/dm/ve.dm.TreeCursor.js: T234881 TreeCursor: cross ignored nodes properly from the end of a text node (duration: 00m 54s)
20:36 otto@deploy1001: Finished deploy [analytics/refinery@9b322e4]: attempting to fix missing git fat jar on stat1004 (duration: 00m 06s)
20:36 otto@deploy1001: Started deploy [analytics/refinery@9b322e4]: attempting to fix missing git fat jar on stat1004
20:13 hoo: Updated the Wikidata property suggester with data from the 2019-09-30 JSON dump and applied the T132839 workarounds
19:33 godog: swift eqiad-prod: add weight to ms-be105[1-6] - T232367
19:29 marxarelli: promoted 1.35.0-wmf.1 to all wikis. no rise in error rates. no new relevant errors cc: T233849
19:25 godog: swift codfw-prod: more weight to ms-be205[1-6] - T233638
19:20 dduvall@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.1
19:11 dduvall@deploy1001: rebuilt and synchronized wikiversions files: labswiki to 1.35.0-wmf.1
19:09 dduvall@deploy1001: Synchronized php-1.35.0-wmf.1/extensions/OpenStackManager: labswiki to 1.35.0-wmf.1 (duration: 01m 00s)
19:04 marxarelli: promoting labswiki to 1.35.0-wmf.1 cc: T233849
17:07 jbond42: puppetmaster1001 has been upgraded and is back serving requests
16:21 urandom: Upgrading sessionstore200[1-3].codfw.wmnet to Cassandra 3.11.4 -- T200803
16:18 urandom: Upgrading sessionstore1003.eqiad.wmnet to Cassandra 3.11.4 -- T200803
16:16 urandom: Upgrading sessionstore1002.eqiad.wmnet to Cassandra 3.11.4 -- T200803
16:11 @: helmfile [EQIAD] Ran 'apply' command on namespace 'termbox' for release 'production' .
16:07 @: helmfile [CODFW] Ran 'apply' command on namespace 'termbox' for release 'production' .
16:04 thcipriani: restarting gerrit due to T224448
16:04 @: helmfile [STAGING] Ran 'apply' command on namespace 'termbox' for release 'staging' .
16:01 urandom: Upgrading sessionstore1001.eqiad.wmnet to Cassandra 3.11.4 -- T200803
15:42 @: helmfile [STAGING] Ran 'apply' command on namespace 'termbox' for release 'test' .
15:18 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@1adf74e]: Update mobileapps to c89aa55
14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1074 after getting its BBU replaced T231638', diff saved to https://phabricator.wikimedia.org/P9306 and previous config saved to /var/cache/conftool/dbconfig/20191010-145737-marostegui.json
14:54 moritzm: ran systemctl reset-failed on puppetmaster1001 (puppet-master.service after reimage)
14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1074 after BBU replacement T231638', diff saved to https://phabricator.wikimedia.org/P9305 and previous config saved to /var/cache/conftool/dbconfig/20191010-144201-marostegui.json
14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1112 into recentchanges and remove db1078 from it after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9304 and previous config saved to /var/cache/conftool/dbconfig/20191010-143924-marostegui.json
14:36 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool to db1084 db1083 db1076 db1112 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9303 and previous config saved to /var/cache/conftool/dbconfig/20191010-143633-marostegui.json
14:23 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1084 db1083 db1076 db1112 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9302 and previous config saved to /var/cache/conftool/dbconfig/20191010-142323-marostegui.json
14:13 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084 db1083 db1076 db1112 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9301 and previous config saved to /var/cache/conftool/dbconfig/20191010-141303-marostegui.json
11:21 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/: SWAT: Set dataBridgeEnabled repo setting on beta (T235033) (affects InitialiseSettings-labs.php and Wikibase.php, but Wikibase.php part is guarded by isset(), so should be safe to sync both at once, I think) (duration: 01m 00s)
11:21 jbond@cumin2001: END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99)
11:14 Lucas_WMDE: ^ (and by CS, I actually mean Wikibase.php, not CommonSettings.php, sorry)
11:13 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/: SWAT: Rename data bridge config variable names (T235033) (affects IS-labs and CS, but the CS part is all guarded by isset(), so should be safe to sync both at once, I think) (duration: 01m 00s)
10:38 moritzm: rebalancing Ganeti eqiad/row C after rolling reboots of Ganeti nodes
10:34 volans: uploaded spicerack_0.0.28-1_amd64.deb to apt.wikimedia.org stretch-wikimedia
08:23 @: helmfile [EQIAD] Ran 'apply' command on namespace 'restrouter' for release 'production' .
08:20 @: helmfile [CODFW] Ran 'apply' command on namespace 'restrouter' for release 'production' .
08:17 @: helmfile [STAGING] Ran 'apply' command on namespace 'restrouter' for release 'staging' .
08:12 mobrovac@deploy1001: Synchronized wmf-config/CommonSettings.php: Add wtp1025/wtp2001 to the list of servers using Parsoid/PHP - T233654 (duration: 01m 01s)
07:55 marostegui: Stop MySQL on es1014 es1013 db1084 db1083 db1077 db1076 db1112 db1124 db1118 for on-site PDU maintenance (this will generate lag on labsdb hosts) - T227536
06:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
06:45 marostegui: Drop designate_pool_manager database from m5 - T233978
06:33 marostegui: Revoke privileges from designate user on the designate_pool_manager database - T233978
05:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1112 for PDU maintenance T227536', diff saved to https://phabricator.wikimedia.org/P9294 and previous config saved to /var/cache/conftool/dbconfig/20191010-055153-marostegui.json
05:51 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1078 into rc service for s3 for PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9293 and previous config saved to /var/cache/conftool/dbconfig/20191010-055102-marostegui.json
05:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1084 db1083 db1076 db1118 for PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9292 and previous config saved to /var/cache/conftool/dbconfig/20191010-054853-marostegui.json
2019-10-09
16:50 @: helmfile [CODFW] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
16:48 @: helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
16:46 @: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
16:05 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1075 after unexpected reboot', diff saved to https://phabricator.wikimedia.org/P9289 and previous config saved to /var/cache/conftool/dbconfig/20191009-160506-marostegui.json
15:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1101:3318 after schema change', diff saved to https://phabricator.wikimedia.org/P9288 and previous config saved to /var/cache/conftool/dbconfig/20191009-153705-marostegui.json
15:04 akosiaris@: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
15:02 akosiaris@: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
14:51 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1085 vslow and dump group', diff saved to https://phabricator.wikimedia.org/P9287 and previous config saved to /var/cache/conftool/dbconfig/20191009-145102-marostegui.json
14:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1113:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P9286 and previous config saved to /var/cache/conftool/dbconfig/20191009-144928-marostegui.json
14:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101:3318 for schema change T233625', diff saved to https://phabricator.wikimedia.org/P9285 and previous config saved to /var/cache/conftool/dbconfig/20191009-144607-marostegui.json
14:44 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1075 after unexpected reboot', diff saved to https://phabricator.wikimedia.org/P9284 and previous config saved to /var/cache/conftool/dbconfig/20191009-144400-marostegui.json
14:38 elukey: cr1-eqsin: change IPv6 address for BGP peer AS4761
14:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1099:3318 after schema change T233625', diff saved to https://phabricator.wikimedia.org/P9283 and previous config saved to /var/cache/conftool/dbconfig/20191009-141137-marostegui.json
14:07 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1075 after unexpected reboot', diff saved to https://phabricator.wikimedia.org/P9282 and previous config saved to /var/cache/conftool/dbconfig/20191009-140749-marostegui.json
14:05 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
13:37 marostegui@cumin1001: dbctl commit (dc=all): 'depool db1075', diff saved to https://phabricator.wikimedia.org/P9280 and previous config saved to /var/cache/conftool/dbconfig/20191009-133709-marostegui.json
13:13 mobrovac@deploy1001: Finished deploy [restbase/deploy@aaadd73]: Parsoid: Retry fetching stashes with undefined as the revid - T234928 (duration: 14m 26s)
12:59 mobrovac@deploy1001: Started deploy [restbase/deploy@aaadd73]: Parsoid: Retry fetching stashes with undefined as the revid - T234928
12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3318 for schema change T233625', diff saved to https://phabricator.wikimedia.org/P9279 and previous config saved to /var/cache/conftool/dbconfig/20191009-125641-marostegui.json
12:42 marostegui: Stop MySQL and power off db1074 for BBU replacement T231638
12:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1074 for BBU replacement T231638', diff saved to https://phabricator.wikimedia.org/P9278 and previous config saved to /var/cache/conftool/dbconfig/20191009-124218-marostegui.json
12:41 mobrovac@deploy1001: Finished deploy [restbase/deploy@068d2ed]: Feed: Use Wikifeeds; Parsoid: Use the ETag revid for stashing and use the same ETag for stashing and response, take #2 (duration: 08m 18s)
12:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3312 after schema change', diff saved to https://phabricator.wikimedia.org/P9277 and previous config saved to /var/cache/conftool/dbconfig/20191009-124035-marostegui.json
12:38 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
12:36 moritzm: disabled puppet on DNS recursors for staged rollout of ferm NTP change
12:35 jbond42: reimage puppetmaster2002
12:32 mobrovac@deploy1001: Started deploy [restbase/deploy@068d2ed]: Feed: Use Wikifeeds; Parsoid: Use the ETag revid for stashing and use the same ETag for stashing and response, take #2
12:30 mobrovac@deploy1001: Finished deploy [restbase/deploy@068d2ed]: Feed: Use Wikifeeds; Parsoid: Use the ETag revid for stashing and use the same ETag for stashing and response - T170455 T234928 (duration: 09m 40s)
12:28 vgutierrez: depooling cp1085 for a power drain - T231525
12:20 mobrovac@deploy1001: Started deploy [restbase/deploy@068d2ed]: Feed: Use Wikifeeds; Parsoid: Use the ETag revid for stashing and use the same ETag for stashing and response - T170455 T234928
08:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1113:3316 for schema change, temporarily pool db1085 as vslow,dump', diff saved to https://phabricator.wikimedia.org/P9276 and previous config saved to /var/cache/conftool/dbconfig/20191009-085016-marostegui.json
08:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1085 after schema change', diff saved to https://phabricator.wikimedia.org/P9275 and previous config saved to /var/cache/conftool/dbconfig/20191009-084732-marostegui.json
08:39 vgutierrez: Switch cp1082 from nginx to ats-tls - T231433
05:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1085 for schema change - lag will be generated on s6 labs', diff saved to https://phabricator.wikimedia.org/P9274 and previous config saved to /var/cache/conftool/dbconfig/20191009-051911-marostegui.json
05:11 marostegui: Restart gerrit as it is down
04:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3312 for schema change', diff saved to https://phabricator.wikimedia.org/P9273 and previous config saved to /var/cache/conftool/dbconfig/20191009-045941-marostegui.json
04:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1103:3312', diff saved to https://phabricator.wikimedia.org/P9272 and previous config saved to /var/cache/conftool/dbconfig/20191009-044752-marostegui.json
04:40 vgutierrez: switching cp5004 from nginx to ats-tls - T231433
2019-10-08
23:28 mutante: phab1001 - replacing tin.eqiad.wmnet with deploy1001.eqiad.wmnet in phabricator/deployment-cache/.config:git_server - wondering if we can ever get rid of tin (T190568)
23:05 ebernhardson@deploy1001: Synchronized wmf-config/: [cirrus] drop support for HHVM connection pooling (duration: 00m 59s)
21:58 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Split out the CSP configuration so it can be more easily over-ridden (duration: 00m 59s)
21:09 chaomodus: restarted nagios-nrpe-server on notebook1003
20:38 mutante: labweb1001 - disabled 2fa for myself on Wikitech using disableOATHAuthForUser.php --wiki=labswiki to debug T234996
20:24 mutante: labweb1001 - edit /srv/mediawiki/wmf-config/wikitech.php and change "false" to "true" on line 52 to enable LDAP debug logging for T234996
19:51 marxarelli: 1.35.0-wmf.1 promoted to group0, cc: T233849. no rise in error rates. no new relevant errors
19:43 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.35.0-wmf.1
12:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1093 after schema change', diff saved to https://phabricator.wikimedia.org/P9261 and previous config saved to /var/cache/conftool/dbconfig/20191008-124417-marostegui.json
09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1093 for schema change', diff saved to https://phabricator.wikimedia.org/P9259 and previous config saved to /var/cache/conftool/dbconfig/20191008-093309-marostegui.json
09:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1088 after schema change', diff saved to https://phabricator.wikimedia.org/P9258 and previous config saved to /var/cache/conftool/dbconfig/20191008-092627-marostegui.json
09:20 marostegui: Compress logging table on db2088:3312 for idwiki,plwiki,ptwiki,zhwiki
07:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1088 for schema change', diff saved to https://phabricator.wikimedia.org/P9256 and previous config saved to /var/cache/conftool/dbconfig/20191008-071859-marostegui.json
07:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1131 after schema change', diff saved to https://phabricator.wikimedia.org/P9255 and previous config saved to /var/cache/conftool/dbconfig/20191008-071551-marostegui.json
06:48 marostegui: Stop MySQL on es1011 db1082 db1081 db1080 db1079 db1075 db1074 (replication lag will appear on labs for s5) for on-site maintenance T227138
06:09 marostegui: Repool labsdb1011 after mysql upgrade
05:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
05:44 elukey: drop PageCreation_7481635 table from the log db on db1107/db1108 - T233892
05:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1082 db1081 db1080 db1079 db1075 db1074 for PDU maintenance T227138', diff saved to https://phabricator.wikimedia.org/P9254 and previous config saved to /var/cache/conftool/dbconfig/20191008-054127-marostegui.json
05:35 elukey: drop CitationUsage tables from the log database on db1107/db1108 (the ones listed in the task) - T233893
05:25 marostegui: Depool labsdb1011 for mysql upgrade
05:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1131 for schema change', diff saved to https://phabricator.wikimedia.org/P9253 and previous config saved to /var/cache/conftool/dbconfig/20191008-051435-marostegui.json
05:10 marostegui: Reload query killer on labsdb1011
05:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3315 T233625', diff saved to https://phabricator.wikimedia.org/P9252 and previous config saved to /var/cache/conftool/dbconfig/20191008-050833-marostegui.json
05:07 marostegui: Deploy schema change on db1097:3315 - T233625
03:04 andrewbogott: restarted nova-conductor on cloudcontrol1003 and cloudcontrol1004 — experimental band-aid for T234876
00:33 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
2019-10-07
23:52 dzahn@cumin1001: Updating IPMI password on 1254 hosts - dzahn@cumin1001
11:16 arturo: added bdsync 0.11.1-1~wmf1 to buster-wikimedia (T234683)
10:59 mobrovac@deploy1001: Finished deploy [restbase/deploy@5321aac]: Skip checking resources on start-up, add banwiki, add metrics/mediarequests end points and log all VE requests, take #5 (duration: 04m 17s)
10:55 mobrovac@deploy1001: Started deploy [restbase/deploy@5321aac]: Skip checking resources on start-up, add banwiki, add metrics/mediarequests end points and log all VE requests, take #5
10:54 mobrovac@deploy1001: Finished deploy [restbase/deploy@5321aac]: Skip checking resources on start-up, add banwiki, add metrics/mediarequests end points and log all VE requests, take #4 (duration: 04m 27s)
10:50 mobrovac@deploy1001: Started deploy [restbase/deploy@5321aac]: Skip checking resources on start-up, add banwiki, add metrics/mediarequests end points and log all VE requests, take #4
10:48 mobrovac@deploy1001: Finished deploy [restbase/deploy@5321aac]: Skip checking resources on start-up, add banwiki, add metrics/mediarequests end points and log all VE requests, take #3 (duration: 03m 53s)
10:44 mobrovac@deploy1001: Started deploy [restbase/deploy@5321aac]: Skip checking resources on start-up, add banwiki, add metrics/mediarequests end points and log all VE requests, take #3
10:21 mobrovac@deploy1001: Finished deploy [restbase/deploy@1798e39]: Skip checking resources on start-up, add banwiki, add metrics/mediarequests end points and log all VE requests, take #2 (duration: 01m 56s)
10:19 mobrovac@deploy1001: Started deploy [restbase/deploy@1798e39]: Skip checking resources on start-up, add banwiki, add metrics/mediarequests end points and log all VE requests, take #2
10:16 mobrovac@deploy1001: Finished deploy [restbase/deploy@1798e39]: Skip checking resources on start-up, add banwiki, add metrics/mediarequests end points and log all VE requests - T233127 T234772 (duration: 05m 58s)
10:10 mobrovac@deploy1001: Started deploy [restbase/deploy@1798e39]: Skip checking resources on start-up, add banwiki, add metrics/mediarequests end points and log all VE requests - T233127 T234772
09:55 marostegui: Deploy schema change on db2129 (s6 codfw master), this will generate lag on s6 codfw - T233135 T234066
08:34 hashar: gerrit: force reindexing all changes ( gerrit index start changes --force )
07:09 marostegui: Remove grants for dbproxy1006 on m1 databases - T231280
06:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3315 for schema change T233625', diff saved to https://phabricator.wikimedia.org/P9246 and previous config saved to /var/cache/conftool/dbconfig/20191007-065645-marostegui.json
20:41 mutante: deploy1001 / deploy2001 - remove python-pygerrit2 (version for python3 is needed instead)
20:32 mutante: gerrit1001 - scp /usr/share/java/mysql-connector-java.jar from cobalt into /usr/share/java/ on gerrit1001 and then symlink into /var/lib/gerrit2/review_site/lib/ (T222391)
19:27 mutante: wtp1025 - mediawiki appserver classes are being applied, install in progress will trigger some new icinga alerts
14:03 marostegui: Deploy schema change on db2117 T233135 T234066
13:50 @: helmfile [EQIAD] Ran 'apply' command on namespace 'restrouter' for release 'production' .
13:47 @: helmfile [CODFW] Ran 'apply' command on namespace 'restrouter' for release 'production' .
13:36 @: helmfile [STAGING] Ran 'apply' command on namespace 'restrouter' for release 'staging' .
12:28 marostegui: Deploy schema change on db2097:3316 T233135 T234066
12:23 elukey: cleaned up old files and apt-cache from an-coord1001
08:41 marostegui: Deploy schema change on db2076 (sanitarium master) with replication T233135 T234066
08:32 _joe_: reuploading the old confd package to stretch-wikimedia, some incompatibility detected
07:26 elukey: execute gnt-instance remove kerberos1001 on ganeti1001 - T234600
07:24 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
05:53 _joe_: upgrading confd on puppetmaster1001 T147204
05:50 _joe_: uploading confd 0.16.0 on stretch T147204
05:49 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: More traffic to es1019 after on-site maintenance T233698 (duration: 00m 51s)
05:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P9240 and previous config saved to /var/cache/conftool/dbconfig/20191004-051112-marostegui.json
13:26 hashar: restarting Gerrit due to a deadlock in SendEmail task and AccountCacheImpl
13:22 hashar: Gerrit might be dead again; taking traces
13:04 _joe_: restarting php7 on mw1275
12:54 onimisionipe: force shard allocation on eqiad chi cluster
10:27 elukey: killed rsync processes in "D" state on stat1007, force umount/mount of /mnt/hdfs
10:25 jbond42: rolling upgrade of openssl packages
10:21 Urbanecm: Manually cleared signup throttle for IP 80.188.128.54 at cswiki, issue with introduced throttle rule
10:20 Urbanecm: Manually cleared signup throttle for IP 88.100.221.84 at cswiki, issue with introduced throttle rule
10:18 Urbanecm: Manually cleared signup throttle for IP 90.176.155.12 at cswiki, issue with introduced throttle rule
09:32 elukey: run apt-get autoremove incrementally on all the hadoop prod workers to remove python2 deps (and verify that they are not used anymore by Hadoop)
08:33 marostegui: Deploy schema change on db2087:3316 T233135 T234066
08:28 marostegui: Deploy schema change on db1096:3316 - T233625
08:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 for schema change T233135 T234066', diff saved to https://phabricator.wikimedia.org/P9236 and previous config saved to /var/cache/conftool/dbconfig/20191003-082651-marostegui.json
23:22 ebernhardson@deploy1001: Synchronized php-1.34.0-wmf.24/extensions/CirrusSearch/: T234445: CirrusSearch: Fix Precondition failed: Must have a resultset set (duration: 01m 00s)
23:21 ebernhardson@deploy1001: Synchronized php-1.34.0-wmf.25/extensions/CirrusSearch/: T234445: CirrusSearch: Fix Precondition failed: Must have a resultset set (duration: 01m 02s)
22:29 godog: remove queued messages from mx1001 for fr-tech-ops@, triggering sender rate limit from gmail
22:12 jforrester@deploy1001: Synchronized php-1.34.0-wmf.24/extensions/VisualEditor/includes/ApiVisualEditorEdit.php: VE unstructured logging, part II (duration: 00m 58s)
22:11 jforrester@deploy1001: Synchronized php-1.34.0-wmf.24/extensions/VisualEditor/includes/ApiVisualEditor.php: VE unstructured logging, part I (duration: 00m 59s)
22:09 jforrester@deploy1001: Synchronized php-1.34.0-wmf.25/extensions/VisualEditor/includes/ApiVisualEditorEdit.php: VE unstructured logging, part II (duration: 00m 58s)
22:06 jforrester@deploy1001: Synchronized php-1.34.0-wmf.25/extensions/VisualEditor/includes/ApiVisualEditor.php: VE unstructured logging, part I (duration: 01m 00s)
21:17 mutante: cobalt (gerrit) rsyncing /srv/gerrit/git and /srv/gerrit/plugins data to gerrit1001 again after reinstall and fixing gerrit2 UID/GID (T222391)
21:13 mutante: gerrit1001 - rebooting
21:08 mutante: gerrit1001 changing GID of gerrit2 user to 119 in /etc/group ; find / -uid 499 -exec chown gerrit2 {} \; find / -gid 1001 -exec chown gerrit2:gerrit2 {} \; (T222391)
21:03 mutante: gerrit1001 changing UID of gerrit2 user to 114 and GID to 119 in /etc/passwd to match cobalt to avoid privilege issues after rsyncing data (T222391)
19:58 mutante: puppetmaster1001 - sudo puppet cert clean parsoid.discovery.wmnet (only created yesterday but does not have all the SANs it needs, updating with more SANs) (T233654)
23:20 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T233211: CirrusSearch: Configuration for glent m0 AB test (duration: 00m 58s)
09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2091:3314 for a schema change - T233625', diff saved to https://phabricator.wikimedia.org/P9217 and previous config saved to /var/cache/conftool/dbconfig/20190930-091043-marostegui.json
08:01 moritzm: installing e2fsprogs security updates on Stretch/Buster
07:56 marostegui: Stop dbstore1003:3311 for troubleshooting
06:47 moritzm: installing exim security updates on buster
2019-09-28
16:28 vgutierrez: restarting acme-chief on acmechief1001
22:36 mutante: phab2001 - upgrade php7.2 packages to 7.2.22 (T230024)
22:03 mutante: webperf1001, webperf2001: restart envoyproxy to pick up new cert with the right subject alt. names
18:22 mutante: mwdebug1001, mwdebug1002 - deleted from /srv/mediawiki/: php-1.34.0-wmf.16, .17, .18, .19 and .20 (current is .24) - usage back to about 57% (T234063)
18:17 mutante: mwdebug1001, mwdebug1002 - apt-get clean saves about 3GB and gets usage down from 94% to 87% on / (T234063)
16:01 XioNoX: delete BGP to AS34305 on cr2-esams
15:34 elukey: update pcc facts to add new hosts
15:02 moritzm: installing usb.ids update from Buster 10.1 point release
14:45 moritzm: installing ncurses bugfix update from Buster 10.1 point release
14:39 moritzm: installing postgresql-common bugfix update from Buster 10.1 point release
14:32 effie: Disable puppet and reload apache on mw* for 539465 and 539488 - T229792
13:33 marostegui: Set candidate masters in dbctl T234039
13:31 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
22:15 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T211620 Enable emails for certain notification types by default on officewiki (duration: 00m 56s)
18:44 ayounsi@deploy1001: Started deploy [homer/deploy@715d842]: Initial Homer release
18:35 jforrester@deploy1001: Synchronized wmf-config/CirrusSearch-production.php: Stop setting various static settings, now set in IS (duration: 01m 04s)
18:35 mforns@deploy1001: Finished deploy [analytics/refinery@cd2f43b]: deploy refinery using scap (together with refinery-source v0.0.101) (duration: 06m 04s)
18:34 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set last static Cirrus settings directly in IS (duration: 01m 07s)
18:29 mforns@deploy1001: Started deploy [analytics/refinery@cd2f43b]: deploy refinery using scap (together with refinery-source v0.0.101)
18:02 volans@deploy1001: Started deploy [homer/deploy@68ac5cc]: Initial Homer release
17:58 jforrester@deploy1001: Synchronized wmf-config/CirrusSearch-common.php: Stop setting bits of the CirrusSearch timeouts arrays, already set in IS (duration: 01m 04s)
17:57 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set the whole of the CirrusSearch timeouts arrays directly (duration: 01m 00s)
17:49 jforrester@deploy1001: Synchronized wmf-config/CirrusSearch-common.php: Stop setting static values now set in InitialiseSettings (duration: 01m 04s)
17:49 Amir1: end of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https (T233835, T233246)
14:43 cdanis@cumin1001: dbctl commit (dc=all): 'dbctl 1.2.0 adds hostByName to the output, but it is not used by Mediawiki; this commit is the first made with the new release; no-op change', diff saved to https://phabricator.wikimedia.org/P9208 and previous config saved to /var/cache/conftool/dbconfig/20190926-144328-cdanis.json
08:47 vgutierrez: switching from nginx to ats-tls on cp2008 - T231433
08:43 mobrovac@deploy1001: Started deploy [restbase/deploy@c419651]: Add nqo.wp.org - T233833
08:42 marostegui@cumin1001: dbctl commit (dc=all): 'More weight to db1078', diff saved to https://phabricator.wikimedia.org/P9202 and previous config saved to /var/cache/conftool/dbconfig/20190926-084159-marostegui.json
08:25 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Change special weights from 1 to 100 - T231018', diff saved to https://phabricator.wikimedia.org/P9201 and previous config saved to /var/cache/conftool/dbconfig/20190926-082233-marostegui.json
08:18 marostegui@cumin1001: dbctl commit (dc=all): 'More weight to db1078', diff saved to https://phabricator.wikimedia.org/P9200 and previous config saved to /var/cache/conftool/dbconfig/20190926-081759-marostegui.json
08:13 vgutierrez: switching from nginx to ats-tls on cp3036 - T231433
08:11 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1078', diff saved to https://phabricator.wikimedia.org/P9199 and previous config saved to /var/cache/conftool/dbconfig/20190926-081144-marostegui.json
08:09 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1078', diff saved to https://phabricator.wikimedia.org/P9198 and previous config saved to /var/cache/conftool/dbconfig/20190926-080949-marostegui.json
08:07 elukey: executed 'rmr /yarn-rmstore/analytics-test-hadoop/ZKRMStateRoot' on conf1004's zkCli.sh to clean up znodes - T217057
08:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1078 to change binlog format', diff saved to https://phabricator.wikimedia.org/P9197 and previous config saved to /var/cache/conftool/dbconfig/20190926-080442-marostegui.json
08:02 marostegui: Depool db1078 to restart mysql to change its binlog format to ROW
07:57 vgutierrez: switching from nginx to ats-tls on cp4023 - T231433
07:49 godog: swift eqiad-prod: continue ms-be1027 decom - T233289
07:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
06:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2088:3311 db2091:3312 db2084:3314 db2089:3315 db2089:3316 db2087:3317 T233625', diff saved to https://phabricator.wikimedia.org/P9196 and previous config saved to /var/cache/conftool/dbconfig/20190926-063555-marostegui.json
06:29 marostegui@cumin1001: dbctl commit (dc=all): ' Repool db2088:3312 db2084:3315 db2087:3316 db2086:3317 T233625', diff saved to https://phabricator.wikimedia.org/P9195 and previous config saved to /var/cache/conftool/dbconfig/20190926-062922-marostegui.json
05:30 marostegui@cumin1001: dbctl commit (dc=all): 'Fully pool db1081 - T230784', diff saved to https://phabricator.wikimedia.org/P9194 and previous config saved to /var/cache/conftool/dbconfig/20190926-053029-marostegui.json
05:19 marostegui@cumin1001: dbctl commit (dc=all): 'Increase weight for db1081 - T230784', diff saved to https://phabricator.wikimedia.org/P9193 and previous config saved to /var/cache/conftool/dbconfig/20190926-051916-marostegui.json
05:09 marostegui@cumin1001: dbctl commit (dc=all): 'Give some API weight to db1081 - T230784', diff saved to https://phabricator.wikimedia.org/P9192 and previous config saved to /var/cache/conftool/dbconfig/20190926-050937-marostegui.json
05:07 marostegui@cumin1001: dbctl commit (dc=all): 'Give some weight to db1081 - T230784', diff saved to https://phabricator.wikimedia.org/P9191 and previous config saved to /var/cache/conftool/dbconfig/20190926-050722-marostegui.json
05:01 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1138 to s4 master and remove read-only from s4 T230784', diff saved to https://phabricator.wikimedia.org/P9190 and previous config saved to /var/cache/conftool/dbconfig/20190926-050140-marostegui.json
05:00 marostegui@cumin1001: dbctl commit (dc=all): 'Set s4 as read-only for maintenance T230784', diff saved to https://phabricator.wikimedia.org/P9189 and previous config saved to /var/cache/conftool/dbconfig/20190926-050050-marostegui.json
05:00 marostegui: Starting s4 failover from db1081 to db1138 - T230784
04:15 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1138 with weight 0 T230784', diff saved to https://phabricator.wikimedia.org/P9188 and previous config saved to /var/cache/conftool/dbconfig/20190926-041508-marostegui.json
17:11 ladsgroup@deploy1001: Synchronized php-1.34.0-wmf.23/extensions/WikimediaMaintenance/addWiki.php: Redefine RevisionStore service for the wiki being created (T212881) (duration: 01m 05s)
17:08 ladsgroup@deploy1001: Synchronized php-1.34.0-wmf.24/extensions/WikimediaMaintenance/addWiki.php: Redefine RevisionStore service for the wiki being created (T212881) (duration: 01m 04s)
13:31 marostegui@cumin1001: dbctl commit (dc=all): 'Increase weight for db1075 after BBU replacement', diff saved to https://phabricator.wikimedia.org/P9182 and previous config saved to /var/cache/conftool/dbconfig/20190925-133146-marostegui.json
13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Increase weight for db1075 after BBU replacement', diff saved to https://phabricator.wikimedia.org/P9181 and previous config saved to /var/cache/conftool/dbconfig/20190925-132147-marostegui.json
13:11 marostegui@cumin1001: dbctl commit (dc=all): 'Increase weight for db1075 after BBU replacement', diff saved to https://phabricator.wikimedia.org/P9180 and previous config saved to /var/cache/conftool/dbconfig/20190925-131149-marostegui.json
13:06 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1075 after replacing its BBU', diff saved to https://phabricator.wikimedia.org/P9179 and previous config saved to /var/cache/conftool/dbconfig/20190925-130613-marostegui.json
12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2085:3311 T233625', diff saved to https://phabricator.wikimedia.org/P9178 and previous config saved to /var/cache/conftool/dbconfig/20190925-125601-marostegui.json
12:51 marostegui@cumin1001: dbctl commit (dc=all): ' Depool for schema change on the logging table: db2088:3312 db2084:3315 db2087:3316 db2086:3317 T233625', diff saved to https://phabricator.wikimedia.org/P9177 and previous config saved to /var/cache/conftool/dbconfig/20190925-125140-marostegui.json
12:47 akosiaris@: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
12:47 akosiaris@: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
12:46 akosiaris@: helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
12:45 akosiaris@: helmfile [EQIAD] Ran 'apply' command on namespace 'kube-system' for release 'calico-policy-controller' .
12:45 akosiaris@: helmfile [CODFW] Ran 'apply' command on namespace 'kube-system' for release 'calico-policy-controller' .
12:44 akosiaris@: helmfile [STAGING] Ran 'apply' command on namespace 'kube-system' for release 'calico-policy-controller' .
12:18 mobrovac@deploy1001: Finished deploy [restbase/deploy@19d0f44]: REVERT (due to wikifeeds problems): Start using the wikifeeds service for v1/feed - T170455 (duration: 02m 35s)
12:16 mobrovac@deploy1001: Started deploy [restbase/deploy@19d0f44]: REVERT (due to wikifeeds problems): Start using the wikifeeds service for v1/feed - T170455
11:47 mobrovac@deploy1001: Finished deploy [restbase/deploy@87eea26]: Start using the wikifeeds service for v1/feed - T170455 (duration: 02m 35s)
11:45 mobrovac@deploy1001: Started deploy [restbase/deploy@87eea26]: Start using the wikifeeds service for v1/feed - T170455
11:43 Urbanecm: EU SWAT done
11:41 urbanecm@deploy1001: Synchronized wmf-config/VariantSettings.php: SWAT: 11a48f8: Add support for some languages on Commons and stop support for nys on Wikidata (T230480) (duration: 00m 56s)
11:39 Urbanecm: Run mwscript initSiteStats.php --wiki=napwikisource --update (T233673)
11:37 urbanecm@deploy1001: Synchronized wmf-config/VariantSettings.php: SWAT: 9eaa4f8: Set wgArticleCountMethod to any for napwikisource (T233673) (duration: 00m 56s)
11:27 urbanecm@deploy1001: Synchronized php-1.34.0-wmf.23/extensions/MassMessage/: SWAT: ba9b209: Provide deduplication info to MassMessageJob (T232379) (duration: 00m 57s)
11:26 urbanecm@deploy1001: Synchronized static/images/project-logos/mxwikimedia.png: SWAT: 246b352: Update logo for mx.wikimedia (T233670) (duration: 00m 54s)
11:24 urbanecm@deploy1001: Synchronized php-1.34.0-wmf.23/extensions/GrowthExperiments/modules/homepage/ext.growthExperiments.Homepage.less: SWAT: d4c64a7: Fix broken display of mobile overlay headings (T233163) (duration: 00m 57s)
11:16 urbanecm@deploy1001: Synchronized wmf-config/VariantSettings.php: SWAT: 8bf6aae: Enable alternate mobile link for ar,zh,hi wikis (T206497) (duration: 00m 54s)
11:10 _joe_: all wikis (including API) are now served by PHP7 T219150
06:36 marostegui: Remove db1066 from tendril and zarcillo T233071
06:30 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1075', diff saved to https://phabricator.wikimedia.org/P9163 and previous config saved to /var/cache/conftool/dbconfig/20190924-063002-marostegui.json
06:19 marostegui@cumin1001: dbctl commit (dc=all): 'Give more weight to db1075', diff saved to https://phabricator.wikimedia.org/P9162 and previous config saved to /var/cache/conftool/dbconfig/20190924-061943-marostegui.json
05:39 marostegui@cumin1001: dbctl commit (dc=all): 'Give more weight to db1075', diff saved to https://phabricator.wikimedia.org/P9161 and previous config saved to /var/cache/conftool/dbconfig/20190924-053919-marostegui.json
05:25 marostegui@cumin1001: dbctl commit (dc=all): 'Give weight 100 to db1075', diff saved to https://phabricator.wikimedia.org/P9160 and previous config saved to /var/cache/conftool/dbconfig/20190924-052545-marostegui.json
05:13 cdanis@cumin1001: dbctl commit (dc=all): 're-do T230783 master promotion and set read-write', diff saved to https://phabricator.wikimedia.org/P9159 and previous config saved to /var/cache/conftool/dbconfig/20190924-051307-cdanis.json
05:11 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1123 to s3 master and remove read-only from s3 T230783', diff saved to https://phabricator.wikimedia.org/P9158 and previous config saved to /var/cache/conftool/dbconfig/20190924-051147-marostegui.json
05:10 cdanis: T230783 mark DEFAULT not s3 as readonly in etcd dbconfig data
05:00 marostegui@cumin1001: dbctl commit (dc=all): 'Set s3 as read-only for maintenance T230783', diff saved to https://phabricator.wikimedia.org/P9157 and previous config saved to /var/cache/conftool/dbconfig/20190924-050034-marostegui.json
05:00 marostegui: Starting s3 failover from db1075 to db1123 - T230783
04:21 marostegui@cumin1001: dbctl commit (dc=all): 'Set weight 0 to db1123 T230783', diff saved to https://phabricator.wikimedia.org/P9156 and previous config saved to /var/cache/conftool/dbconfig/20190924-042121-marostegui.json
04:13 marostegui: Start pre switchover steps - T230783
03:52 chaomodus: rebooted netboxdb[12]001 for kernel upgrade
03:46 crusnov@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
08:24 elukey@deploy1001: Started deploy [analytics/refinery@a20a647]: Deploy python2 -> python3 fixes
08:21 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1123 after kernel and binlog format change', diff saved to https://phabricator.wikimedia.org/P9148 and previous config saved to /var/cache/conftool/dbconfig/20190923-082119-marostegui.json
07:41 godog: swift run swiftrepl without deletes eqiad -> codfw
07:40 godog: swift eqiad-prod: continue ms-be1027 decom - T233289
07:30 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1123 after kernel and binlog format change', diff saved to https://phabricator.wikimedia.org/P9147 and previous config saved to /var/cache/conftool/dbconfig/20190923-073044-marostegui.json
07:15 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1123 after kernel and binlog format change', diff saved to https://phabricator.wikimedia.org/P9146 and previous config saved to /var/cache/conftool/dbconfig/20190923-071537-marostegui.json
07:08 marostegui: Stop MySQL on db1123 to reboot to change binlog format and kernel - T230783
07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1123 to change binlog format T230783', diff saved to https://phabricator.wikimedia.org/P9145 and previous config saved to /var/cache/conftool/dbconfig/20190923-070628-marostegui.json
06:50 marostegui@cumin1001: dbctl commit (dc=all): 'Change db1123 and db1078 roles, db1078 will serve logpager and recentchanges, db1123 will just serve general traffic', diff saved to https://phabricator.wikimedia.org/P9144 and previous config saved to /var/cache/conftool/dbconfig/20190923-065056-marostegui.json
08:52 jynus: creating new database on m1 "bacula9" T229209
08:28 hashar: Killed zuul-server process on contint2001 which was establishing connections to Gerrit and filling the pool of allowed ssh connections # T233390
08:23 hashar: CI is failing since it is somehow no longer able to fetch from Gerrit T233390
08:20 hashar: contint1001: upgrade zuul to 2.5.1-wmf10 # T203846
08:12 hashar: contint2001: upgrade zuul to 2.5.1-wmf10 # T203846
07:46 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=False)
05:29 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1089 from logpager and contributions after testing, repool back with normal weight on main traffic T223151', diff saved to https://phabricator.wikimedia.org/P9136 and previous config saved to /var/cache/conftool/dbconfig/20190920-052902-marostegui.json
05:27 marostegui: Analyze table enwiki.logging on db2102 - T223151
05:07 marostegui: Remove temporary index on hiwikisource views T219374
01:06 mholloway-shell@deploy1001: Finished deploy [recommendation-api/deploy@a29da76]: Rolling back deployment due to alerts beginning after 0:00 UTC (duration: 02m 51s)
01:05 jforrester@deploy1001: Synchronized php-1.34.0-wmf.23/extensions/TimedMediaHandler/: T233360 Fix Safari 13.0 regression in video playback with audio (duration: 00m 58s)
01:03 mholloway-shell@deploy1001: Started deploy [recommendation-api/deploy@a29da76]: Rolling back deployment due to alerts beginning after 0:00 UTC
2019-09-19
23:23 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
17:43 Krinkle: Move whisper/MediaWiki/wanobjectcache/revision_row_1/29 to whisper/MediaWiki/wanobjectcache/revision_row_1_29 on graphite1004 and graphite2003 (T232907)
17:38 arlolra@deploy1001: Started deploy [parsoid/deploy@77630c5]: Updating Parsoid to 6bf23c2
13:08 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1089 into contributions service T223151', diff saved to https://phabricator.wikimedia.org/P9133 and previous config saved to /var/cache/conftool/dbconfig/20190919-130848-marostegui.json
13:01 mobrovac@deploy1001: Finished deploy [restbase/deploy@7f4b7f7]: Start using RESTBase built on Stretch - T224553 (duration: 21m 38s)
12:39 mobrovac@deploy1001: Started deploy [restbase/deploy@7f4b7f7]: Start using RESTBase built on Stretch - T224553
12:36 @: helmfile [STAGING] Ran 'sync' command on namespace 'restrouter' for release 'staging' .
07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Give more logpager weight to db1089 T223151', diff saved to https://phabricator.wikimedia.org/P9131 and previous config saved to /var/cache/conftool/dbconfig/20190919-072234-marostegui.json
07:01 moritzm: reimaging restbase2012 to stretch T224553
06:18 marostegui: Sanitize hiwikisource on db1124:3313 and db2094:3313 T219374
06:04 marostegui@cumin1001: dbctl commit (dc=all): 'Temporarily pool db1089 into enwiki logpager T223151', diff saved to https://phabricator.wikimedia.org/P9130 and previous config saved to /var/cache/conftool/dbconfig/20190919-060440-marostegui.json
05:11 marostegui: Stop MySQL on db2055 for decommission T233186
05:11 marostegui: Remove db2055 from tendril and zarcillo T233186
21:54 gilles: T233095 Purging all eswiki articles (both desktop and mobile this time)
21:53 gilles@deploy1001: Synchronized php-1.34.0-wmf.22/maintenance/purgeList.php: T233095 Make purgeList.php use getCdnUrls() (duration: 01m 04s)
21:13 XioNoX: enable damping on primary codfw-eqiad link - T196432
21:09 XioNoX: enable damping on codfw-ulsfo link - T196432
20:50 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: No longer load InitialiseSettings at all in CommonSettings (duration: 01m 03s)
08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2089:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P9125 and previous config saved to /var/cache/conftool/dbconfig/20190918-085721-marostegui.json
06:43 moritzm: reimaging restbase2011 to stretch T224553
06:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2089:3316 for schema change', diff saved to https://phabricator.wikimedia.org/P9124 and previous config saved to /var/cache/conftool/dbconfig/20190918-060401-marostegui.json
05:58 marostegui: Deploy schema change on db2097:3316 - T233135
05:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repool host after onsite checks T233184', diff saved to https://phabricator.wikimedia.org/P9123 and previous config saved to /var/cache/conftool/dbconfig/20190918-054755-marostegui.json
17:00 @: helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
16:59 @: helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
16:21 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0)
16:04 jbond42: run octocatalog-diff from elnath with current facts
15:55 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Revert Set MinimumPasswordLengthToLogin to 10 for all privileged groups, not just +staff (duration: 00m 55s)
15:53 reedy@deploy1001: sync-file aborted: (no justification provided) (duration: 00m 01s)
15:53 reedy@deploy1001: sync-file aborted: (no justification provided) (duration: 00m 00s)
15:17 marostegui@cumin1001: dbctl commit (dc=all): 'Host down for on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9120 and previous config saved to /var/cache/conftool/dbconfig/20190917-151714-marostegui.json
15:16 marostegui: Stop MySQL on db2127 and shut the host down for onsite maintenance
14:52 elukey@cumin1001: END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99)
10:44 vgutierrez: replacing nginx with ATS in cp1076 (upload cluster) - T231433
09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depool and stop replication on db1130 db1104 db1085 db1086 (lag will appear on s6 on labsdb) for PDU maintenance - T227539', diff saved to https://phabricator.wikimedia.org/P9116 and previous config saved to /var/cache/conftool/dbconfig/20190917-094827-marostegui.json
09:46 marostegui: Depool and stop replication on db1130 db1104 db1085 db1086 (lag will appear on s6 on labsdb) for PDU maintenance - T227539
09:30 hashar: Restarting CI jenkins
09:29 marostegui: Downtime db1073 db1130 db1104 db1085 db1086 for the PDU maintenance T227539
09:18 jynus@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
09:05 jiji@deploy1001: Synchronized wmf-config/CommonSettings.php: Push PHP7 traffic to 100% of users who accept cookies - T219150 (duration: 00m 57s)
08:37 vgutierrez: upgrading ATS to 8.0.5-1wm8 on cp3034 - T231849 T232724
07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1074 with just 50 to keep its warmness level just in case T231638', diff saved to https://phabricator.wikimedia.org/P9115 and previous config saved to /var/cache/conftool/dbconfig/20190917-075807-marostegui.json
07:48 effie: Enable puppet on mw*
07:42 elukey: reboot analytics-tool1004 (host running superset) for kernel updates
07:41 marostegui: Stop mysql on db1063 for decommissioning T232564
07:40 marostegui: Remove db1063 from puppet and zarcillo T232564
07:29 vgutierrez: repooling cp5007 without wikibase configuration - T99531
07:23 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
07:19 vgutierrez: depooling cp5007 to ensure that wikibase removal goes as expected - T99531
07:10 vgutierrez: getting rid of wikibase TLS certificate & nginx configuration on the text cache cluster - T99531
06:56 vgutierrez: upgrading ATS to 8.0.5-1wm8 on cp2002, cp4021 and cp5001 - T231849
06:55 vgutierrez: uploaded trafficserver 8.0.5-1wm8 to apt.wikimedia.org (stretch) - T231849
06:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1066 T233071', diff saved to https://phabricator.wikimedia.org/P9114 and previous config saved to /var/cache/conftool/dbconfig/20190917-065342-marostegui.json
06:49 moritzm: reimage restbase2010 to Stretch T224553
05:57 vgutierrez: upgrading ATS to 8.0.5-1wm7 on cp2002 and cp4021 - T232724
05:56 vgutierrez: uploaded trafficserver 8.0.5-1wm7 to apt.wikimedia.org (stretch) - T232298 T232724
05:23 effie: disable puppet on mw* servers for 536979
05:01 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1122 to s2 master and remove read-only from s2 T230785', diff saved to https://phabricator.wikimedia.org/P9113 and previous config saved to /var/cache/conftool/dbconfig/20190917-050133-marostegui.json
05:00 marostegui@cumin1001: dbctl commit (dc=all): 'Set s2 as read-only for maintenance T230785', diff saved to https://phabricator.wikimedia.org/P9112 and previous config saved to /var/cache/conftool/dbconfig/20190917-050043-marostegui.json
05:00 marostegui: Starting s2 failover from db1066 to db1122 - T230785
04:57 effie: Downtiming HTTPS-blog on icinga - T232412
04:14 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1122 with weight 0 and depool it from API T230785', diff saved to https://phabricator.wikimedia.org/P9111 and previous config saved to /var/cache/conftool/dbconfig/20190917-041441-marostegui.json