Server Admin Log

From Wikitech

2017-04-30

  • 07:45 elukey: deleted /srv/cassandra-a/commitlog/CommitLog-5-1490738321543.log from restbase1009-a (empty commit log file created before OOM - backup in /home/elukey)

2017-04-29

  • 10:50 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 to kafka[1018,1020,1022].eqiad.wmnet (was 120 - maybe related to T136094 ?)
  • 10:39 elukey: start ferm on kafka1020/18 (nodes were previously down for maintenance, not sure why ferm wasn't started)
  • 09:59 reedy@naos: Synchronized wmf-config/CommonSettings.php: Revert pdf processor firejails T164045 (duration: 02m 41s)
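
The entries in this log share a regular shape: a time, an author (with an `@host` suffix on lines such as `reedy@naos`), a colon, and a free-text message that sometimes ends in `(duration: MMm SSs)`. As a minimal sketch, assuming that shape holds throughout, the Python below splits one entry into fields. The `Entry` type, the `parse_entry` function, and both regular expressions are hypothetical names inferred from the lines above; they are not part of any Wikimedia tool.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    time: str                   # "HH:MM" exactly as written in the log
    author: str                 # e.g. "elukey", or "reedy" for "reedy@naos"
    host: Optional[str]         # e.g. "naos"; None when there is no "@host" part
    message: str                # everything after the first "author:" colon
    duration_s: Optional[int]   # seconds, if a "(duration: MMm SSs)" suffix exists


# Assumed patterns, inferred from the visible entries (not an official grammar).
ENTRY_RE = re.compile(r"^\s*•\s*(\d{2}:\d{2})\s+([^\s:@]+)(?:@([^\s:]+))?:\s*(.*)$")
DURATION_RE = re.compile(r"\(duration:\s*(\d+)m\s*(\d+)s\)\s*$")


def parse_entry(line: str) -> Optional[Entry]:
    """Parse one bullet line; return None for headings and blank lines."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    time, author, host, message = m.groups()
    d = DURATION_RE.search(message)
    duration = int(d.group(1)) * 60 + int(d.group(2)) if d else None
    return Entry(time, author, host, message, duration)


entry = parse_entry(
    "  • 09:59 reedy@naos: Synchronized wmf-config/CommonSettings.php: "
    "Revert pdf processor firejails T164045 (duration: 02m 41s)"
)
print(entry.author, entry.host, entry.duration_s)  # reedy naos 161
```

Date headings such as `2017-04-29` do not match the entry pattern, so a caller can track the current date separately and attach it to each parsed entry.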

2017-04-28

  • 21:24 Dereckson: End of live debug on mwdebug1001, restored previous state with a local scap pull
  • 21:00 ejegg: updated payments-wiki from 1620b82 to 4c56302
  • 20:23 Dereckson: Live debug on mwdebug1001 for T164059
  • 19:30 jynus: shutting down db1063 - I see high temperatures reported, and going up T164107
  • 19:09 urandom: T163936: reenabling puppet on restbase-dev1001
  • 18:14 urandom: T163936: disabling puppet on restbase-dev1001 (t-shooting c-m-c)
  • 17:09 jynus: restarting replication on all nodes on s7-eqiad T164092
  • 16:38 jynus: stopping replication on all nodes on s7-eqiad in case db1062 boots up in a corrupted state
  • 16:36 jynus: restarting db1062 once more T164092
  • 15:56 godog: poweroff prometheus1004 for ram upgrade - T163385
  • 15:40 jynus: deploying new events_coredb_slave.sql on codfw T160984
  • 15:21 godog: poweroff prometheus1003 for ram upgrade - T163385
  • 14:55 gehel: shutting down elastic2020 for mainboard replacement - T149006
  • 14:32 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Change db1063 IP and rack - T163895 (duration: 00m 48s)
  • 14:31 marostegui@naos: Synchronized wmf-config/db-codfw.php: Change db1063 IP and rack - T163895 (duration: 00m 50s)
  • 14:04 marostegui: Stop and shutdown db1063 - T163895
  • 14:04 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Change db1062 rack location - T163895 (duration: 00m 52s)
  • 13:59 moritzm: installing ghostscript security updates
  • 13:56 urandom: T163936: restarting cassandra-metrics-collector, restbase production
  • 13:55 urandom: $ readlink /usr/local/lib/cassandra-metrics-collector/cassandra-metrics-collector.jar
  • 13:50 ema: varnish 4.1.6-1wm1 uploaded to apt.w.o
  • 13:46 urandom: T163936: restarting cassandra-metrics-collector on restbase1007
  • 13:46 marostegui@naos: Synchronized wmf-config/db-codfw.php: Change db1061 IP - T163895 (duration: 01m 00s)
  • 13:44 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Change db1061 IP - T163895 (duration: 01m 19s)
  • 13:44 urandom: T163936: forcing puppet run on restbase1007
  • 13:30 marostegui: Stop MySQL and shutdown db1061 - T163895
  • 13:26 marostegui: Stop MySQL and shutdown db1062 - T163895
  • 10:47 akosiaris: migrate/evacuate ganeti2005, ganeti2006 for T164011
  • 10:42 akosiaris: reboot oresrdb1002 for kernel upgrade
  • 09:56 moritzm: installing libxslt security updates on trusty
  • 09:29 marostegui: upgrade mariadb db1059,db1056 from 10.0.22 to 10.0.28
  • 09:17 marostegui: upgrade mariadb db1071 from 10.0.23 to 10.0.28
  • 09:15 akosiaris: reboot oresrdb1001 for kernel upgrade
  • 09:02 marostegui: Upgrade mariadb on db1081 and db1084 from 10.0.23 to 10.0.28
  • 08:03 Amir1: cleanup done, 4M rows deleted (T159753)
  • 07:58 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1045 - T162539 T163548 (duration: 02m 38s)
  • 06:48 Amir1: cleaning around 5-10M rows in ores_classification in enwiki (half-an-hour script, T159753)
  • 01:18 ejegg: rolled payments-wiki back to 1620b82
  • 01:15 ejegg: updated payments-wiki from 1620b82 to 4c56302

2017-04-27

  • 23:36 catrope@naos: Synchronized php-1.29.0-wmf.21/extensions/SecurePoll/includes/pages/CreatePage.php: Stop gap for fix global election creation (T164043) (duration: 00m 43s)
  • 23:34 catrope@naos: Synchronized wmf-config/InitialiseSettings.php: Enable WikidataPageBanner on viwikivoyage (T163662) (duration: 00m 46s)
  • 23:29 ejegg: rolled back payments-wiki to 1620b82
  • 23:29 catrope@naos: Synchronized wmf-config/InitialiseSettings.php: Enable responsive references on elwiki (T163074) (duration: 00m 49s)
  • 23:27 ejegg: updated payments-wiki from 1620b82 to 4c56302
  • 23:22 catrope@naos: Synchronized wmf-config/InitialiseSettings.php: Set ORES thresholds in new format for all enabled wikis (T162760) (duration: 00m 53s)
  • 23:16 catrope@naos: Synchronized php-1.29.0-wmf.21/includes/deferred/LinksUpdate.php: Release prior row locks beforehand in LinksUpdate::updateCategoryCounts (T163801) (duration: 01m 01s)
  • 23:13 catrope@naos: Synchronized wmf-config/CirrusSearch-common.php: Enable sistersearch title profile for wikivoyage (duration: 01m 19s)
  • 21:57 cwd: updated process-control to 1.0.6
  • 21:56 volans: shutting down gadolinium, it came up 1h25m ago and stole the public IP from meitnerium
  • 21:08 ppchelko@naos: Finished deploy [restbase/deploy@61c1ceb]: Automatically rerender parsoid, only store summaries if they are changed, don't rerender data-parsoid (duration: 07m 16s)
  • 21:01 ppchelko@naos: Started deploy [restbase/deploy@61c1ceb]: Automatically rerender parsoid, only store summaries if they are changed, don't rerender data-parsoid
  • 20:53 ppchelko@naos: Finished deploy [restbase/deploy@fcfc537]: Automatically rerender parsoid, only store summaries if they are changed (duration: 11m 33s)
  • 20:53 twentyafterfour@naos: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.21
  • 20:47 twentyafterfour@naos: Synchronized php-1.29.0-wmf.21/extensions/FlaggedRevs: deploy fix for T163994 (duration: 01m 17s)
  • 20:42 ppchelko@naos: Started deploy [restbase/deploy@fcfc537]: Automatically rerender parsoid, only store summaries if they are changed
  • 20:37 mutante: ocg1001 - has been reinstalled but ocg package deployment fails currently "has the minion key been accepted", should not be repooled just yet
  • 20:32 mutante: ores/cache::misc: switch ores back to codfw-only - everything is like it was before the failed deploy yesterday again
  • 20:21 andrewbogott: stripping a bunch of unneeded extensions from wikitech-static
  • 20:20 mutante: ocg1001 - re-added to puppet, initial run, reinstall ongoing (T161158)
  • 20:18 mutante: ores is active/active now, for a short time
  • 20:16 mutante: ocg1001 - revoke old puppet cert, salt key
  • 20:15 mutante: run puppet on cache::misc to push ores change - cumin -b 5 -s 10 'R:class = role::cache::misc' 'run-puppet-agent -q'
  • 20:03 twentyafterfour: 1.29.0-wmf.21 is blocked by T163994
  • 20:01 mutante: ocg1001 - reboot into PXE, re-install
  • 19:59 twentyafterfour@naos: Synchronized php-1.29.0-wmf.21/extensions/FlaggedRevs/frontend/FlaggedRevsUI.hooks.php: deploy fix for T163994 (duration: 01m 04s)
  • 19:33 twentyafterfour: start mediawiki deployment train group 2 - all wikis to 1.29.0-wmf.21
  • 19:24 reedy@naos: Synchronized wmf-config/CommonSettings.php: Run pdf processors in firejails T164000 (duration: 01m 20s)
  • 19:20 XenoRyet: Updated paymentswiki from ee7d402 to 1620b82
  • 18:47 addshore: Morning SWAT Done!
  • 18:46 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: SWAT WMDE Spring campaign - Remove logging (no longer needed) (duration: 00m 47s)
  • 18:44 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: SWAT wmgUseGettingStarted true for dewiki (duration: 00m 48s)
  • 18:41 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: SWAT Enable Cognate Logging (duration: 00m 48s)
  • 18:40 XenoRyet: Roll back paymentswiki from 030b2f9 to ee7d402
  • 18:34 addshore@naos: Synchronized php-1.29.0-wmf.21/extensions/CirrusSearch: SWAT #1 #2 (duration: 00m 59s)
  • 18:31 addshore@naos: Synchronized wmf-config/CirrusSearch-common.php: SWAT update name of sistersearch profile for wikivoyage (duration: 00m 49s)
  • 18:24 addshore@naos: Synchronized php-1.29.0-wmf.21/extensions/WikimediaEvents/WikimediaEventsHooks.php: SWAT WMDE Spring campaign - Remove hook PT2/2 (duration: 00m 52s)
  • 18:23 urandom: T163936: restarting cassandra-metrics-collector, restbase production
  • 18:22 addshore@naos: Synchronized php-1.29.0-wmf.21/extensions/WikimediaEvents/extension.json: SWAT WMDE Spring campaign - Remove hook PT1/2 (duration: 00m 57s)
  • 18:21 urandom: T163936: restarting cassandra-metrics-collector, restbase staging
  • 18:20 addshore@naos: Synchronized php-1.29.0-wmf.21/includes/api/ApiQueryPagePropNames.php: SWAT Do not add limit to ApiQueryPagePropNames when database type is mysql (duration: 01m 04s)
  • 18:17 twentyafterfour: restarting apache on iridium to hotfix T164005
  • 18:07 addshore@naos: Synchronized wmf-config/Wikibase-production.php: SWAT Fix echoIcon for wikibase in testwikis (duration: 01m 27s)
  • 17:44 XenoRyet: Updated paymentswiki from ee7d402 to 030b2f9
  • 17:36 ladsgroup@naos: Finished deploy [ores/deploy@68cca85]: (no justification provided) (duration: 21m 50s)
  • 17:30 _joe_: started pybal on lvs1006 after network was fixed
  • 17:25 XenoRyet: reverted paymentswiki from 030b2f9 to ee7d402
  • 17:20 XenoRyet: Updated paymentswiki from ee7d402 to 030b2f9
  • 17:15 ladsgroup@naos: Started deploy [ores/deploy@68cca85]: (no justification provided)
  • 17:15 Amir1: ladsgroup@naos:/srv/deployment/ores/deploy$ scap deploy (T163950)
  • 17:12 demon@naos: Pruned MediaWiki: 1.29.0-wmf.18 [keeping static files] (duration: 00m 20s)
  • 17:08 _joe_: stop pybal on lvs1006 to stop announcing via BGP
  • 17:08 demon@naos: Pruned MediaWiki: 1.29.0-wmf.16 (duration: 00m 13s)
  • 17:04 demon@naos: Synchronized scap/plugins/clean.py: One last fix (duration: 01m 04s)
  • 16:53 gehel: unbanning all elasticsearch servers in eqiad row D - T148506
  • 16:48 demon@naos: Synchronized scap/plugins/clean.py: --keep-static is nice now. Also need a co-master sync (duration: 01m 28s)
  • 16:45 andrewbogott: re-enabling labs instance creation/deletion
  • 16:42 demon@naos: Pruned MediaWiki: 1.29.0-wmf.19 [keeping static files] (duration: 00m 15s)
  • 16:32 gehel: unbanning elasticsearch servers in eqiad row D - elastic10(17|18|19|20) - T148506
  • 15:56 elukey: restart of jmxtrans on all the hadoop worker nodes
  • 15:51 andrewbogott: disabling labs instance create/delete to avoid hilarity during network maintenance
  • 15:50 elukey: forced 'service ferm start' on the failed analytics hosts
  • 15:46 marostegui: Upgrade db1091 mariadb from 10.0.23 to 10.0.28
  • 15:39 marostegui: Upgrade db1089 mariadb from 10.0.23 to 10.0.28
  • 15:34 marostegui: Upgrade db1090 mariadb from 10.0.23 to 10.0.28
  • 15:22 jynus: stopping all replication channels on dbstore1001 for topology changes
  • 14:34 ema: upgrade upload-codfw to varnish 4.1.5-1wm4 T145661
  • 14:29 marostegui: Stop MySQL and shutdown es2019 for HW replacement - T149526
  • 14:26 ema: varnish 4.1.5-1wm4 uploaded to apt.w.o T145661
  • 14:08 marostegui: Deploy alter table labswiki.revision on labtestweb2001 - T132416
  • 14:04 marostegui: Deploy alter table labswiki.revision on silver - T132416
  • 13:57 _joe_: restarting HHVM on mw2213, stuck in HPHP::Treadmill::getAgeOldestRequest
  • 13:52 ladsgroup@naos: Synchronized wmf-config/Wikibase-production.php: SWAT: Set echoIcon for notification of wikibase in test wikis (T142102) (duration: 00m 57s)
  • 13:52 Amir1: start of scap sync-file wmf-config/Wikibase-production.php 'SWAT: Set echoIcon for notification of wikibase in test wikis (T142102)'
  • 13:45 ladsgroup@naos: Synchronized portals: (no justification provided) (duration: 01m 05s)
  • 13:44 ladsgroup@naos: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 01m 21s)
  • 13:43 Amir1: ladsgroup@naos:/srv/mediawiki-staging$ portals/sync-portals (T128546)
  • 12:53 volans: disabled puppet on rdb*
  • 12:06 marostegui: Upgrade es1011 and es1014 from mariadb 10.0.22 to mariadb 10.0.28
  • 11:50 marostegui: Upgrade mariadb from 10.0.22 to 10.0.28 on es1015
  • 09:46 moritzm: upgrading mysql on bohrium/piwik
  • 09:25 _joe_: restarting all redis instances for jobqueues on eqiad to force a full resync with masters in codfw T163337
  • 08:55 jynus: deploying alter table to all wikis on s6 T163979
  • 08:54 _joe_: restarting redis rdb1001:6380 after cleaning up the current AOF files for investigation of T163337
  • 08:50 moritzm: installing django security updates
  • 08:29 godog: ms-be1039 issue "controller slot=3 pd 1I:1:5 modify disablepd" to force failed sdc - T163690
  • 08:25 ema: restart varnish-be on cp2024 with expiry thread RT experiment enabled
  • 08:19 ema: upgrade varnish to 4.1.5-1wm3 on cp2024
  • 07:56 elukey: aqs100[69] back serving AQS traffic
  • 07:55 ema: varnish 4.1.5-1wm3 uploaded to apt.w.o T145661
  • 07:16 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool hosts that needed to be moved for the network maintenance - T162681 (duration: 02m 32s)
  • 06:53 marostegui: Reboot es1014 for kernel upgrade - T162029
  • 06:50 elukey: executed kafka preferred-replica-election to rebalance topic leaders in the analytics cluster after maintenance
  • 06:45 marostegui: Reboot es1011 for kernel upgrade - T162029
  • 06:39 marostegui: Logging for the record: drop table hashs from s2, s3 and s7 (only places where it existed) - T54927
  • 06:23 _joe_: moving orphaned objects in ms-be1039's root partition in sdc1/stale_root to save space
  • 06:17 marostegui: Deploy schema change on s7 metawiki.pagelinks to remove partitioning on db1041 - T153300
  • 06:14 marostegui: Deploy alter table on s5 (wikidatawiki) on db1049 - T163548
  • 06:14 marostegui: Deploy alter table on s5 (wikidatawiki) on db1070 (running locally instead of neodymium as this host will be affected by the network maintenance) - T163548
  • 06:11 marostegui: Deploy alter table on s5 (wikidatawiki) on db1070 (running locally instead of neodymium as this host will be affected by the network maintenance) - T130067 T162539
  • 06:09 marostegui: Deploy alter table on s5 (wikidatawiki) on db1049 - T130067 T162539
  • 05:59 marostegui: Deploy alter table labsdb1003 (wikidatawiki) https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 05:24 Amir1: cleaning some rows in ores_classification in enwiki (T159753)
  • 03:44 ottomata: starting kafka broker on kafka1020
  • 03:40 ottomata: running kafka replica election to bring kafka1018 back as preferred leader
  • 02:21 Jamesofur: running populateEditCount.php in screen on wast for T163854, counting edits for board vote eligibility
  • 02:16 RoanKattouw: Reset 2FA for T163931 on labswiki
  • 00:14 twentyafterfour: starting phabricator update
  • 00:05 ebernhardson@naos: Synchronized php-1.29.0-wmf.21/extensions/CirrusSearch/includes/Searcher.php: cirrus: align sister search boost template config variable with documentation (duration: 00m 50s)

2017-04-26

  • 23:51 niharika29@naos: Synchronized php-1.29.0-wmf.21/includes/interwiki/ClassicInterwikiLookup.php: Interwiki: Dont override interwiki map order (T145337) (duration: 01m 00s)
  • 23:38 niharika29@naos: Synchronized php-1.29.0-wmf.21/extensions/CirrusSearch/: Align other index template boosting config names (duration: 00m 57s)
  • 23:34 niharika29@naos: Synchronized wmf-config/InitialiseSettings.php: Increase max field count for wikidata; Enable Flow beta feature on arwiki (T155720) (duration: 00m 58s)
  • 23:31 niharika29@naos: Synchronized wmf-config/InitialiseSettings.php: Increase max field count for wikidata; Enable Flow beta feature on arwiki (T155720) (duration: 01m 04s)
  • 23:29 niharika29@naos: Synchronized wmf-config/CirrusSearch-common.php: [cirrus] Increase max field count for wikidata (duration: 01m 23s)
  • 21:42 mutante: running puppet on all cache::misc nodes via cumin to switch ORES to eqiad
  • 21:30 mutante: restarting uwsgi-ores service on all scb2* with systemctl restart
  • 21:15 twentyafterfour: finished with mediawiki deployment train for group1. Everything appears stable, no increase in logspam.
  • 21:12 twentyafterfour@naos: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.21
  • 21:09 halfak@naos: Started restart [ores/deploy@cc12103]: (no justification provided)
  • 21:08 twentyafterfour@naos: Synchronized php-1.29.0-wmf.21/extensions/Flow/Hooks.php: sync https://gerrit.wikimedia.org/r/#/c/350481/ refs T163896 T161733 (duration: 01m 20s)
  • 21:05 arlolra: Updated Parsoid to 4949857a (T116508, T64270, T133673)
  • 20:55 arlolra@naos: Finished deploy [parsoid/deploy@8d109eb]: Updating Parsoid to 4949857a (duration: 06m 52s)
  • 20:48 arlolra@naos: Started deploy [parsoid/deploy@8d109eb]: Updating Parsoid to 4949857a
  • 20:48 twentyafterfour: deploying https://gerrit.wikimedia.org/r/#/c/350481/1 to get the train back on track refs T161733
  • 20:35 bsitzmann@naos: Finished deploy [mobileapps/deploy@b5afcb8]: Update mobileapps to 14bd4a5 (duration: 15m 17s)
  • 20:34 halfak@naos: Finished deploy [ores/deploy@cc12103]: T162892 (duration: 21m 28s)
  • 20:31 elukey: restart zookeeper on conf1003 after network maintenance
  • 20:20 bsitzmann@naos: Started deploy [mobileapps/deploy@b5afcb8]: Update mobileapps to 14bd4a5
  • 20:12 halfak@naos: Started deploy [ores/deploy@cc12103]: T162892
  • 19:50 elukey: restart kafka nodes (kafka1018 and kafka1020) after network maintenance
  • 19:45 twentyafterfour@naos: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.20
  • 19:42 twentyafterfour: rolling back group1 to wmf.20 due to T163896 refs T161733
  • 19:31 twentyafterfour@naos: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.21
  • 19:24 twentyafterfour: begin deployment train: group1 wikis to 1.29.0-wmf.21 refs T161733
  • 19:22 bblack: initiating cumin-based restart of all varnish backends for cache_upload in codfw to downgrade from experimental package. 30 minute spacing, 10 hosts, ~5h to completion...
  • 19:17 thcipriani@naos: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable collectionsaveascommunitypage right on es.wikipedia T163767 (duration: 00m 49s)
  • 19:05 bblack: restarting varnish frontend and backend on cp3033 to downgrade
  • 19:03 bblack: restarting varnish-frontend on cp2014 to downgrade
  • 18:58 thcipriani@naos: Synchronized wmf-config/CommonSettings.php: SWAT: Workaround issue of overriding whitelist config variable T163114 (duration: 00m 53s)
  • 18:56 bblack: downgrading varnish back to 4.1.5-wm1 on all -wm2 hosts
  • 18:50 thcipriani@naos: Synchronized php-1.29.0-wmf.21/extensions/CirrusSearch: SWAT: Provide a way to blacklist a set of wikis for crosswiki search T163546 (duration: 01m 02s)
  • 18:44 thcipriani@naos: Synchronized wmf-config/CirrusSearch-common.php: SWAT: Adjust sistersearch against wikivoyage to require title matching T163547 (duration: 01m 11s)
  • 18:38 thcipriani@naos: Synchronized wmf-config/CirrusSearch-common.php: SWAT: Configure multimedia search template boosting T163223 (duration: 00m 53s)
  • 18:30 thcipriani@naos: Synchronized php-1.29.0-wmf.20/extensions/SecurePoll: SWAT: Add voter scripts for board/fdc election 2017 T163854 (duration: 00m 57s)
  • 18:26 thcipriani@naos: Synchronized php-1.29.0-wmf.21/extensions/SecurePoll: SWAT: Add voter scripts for board/fdc election 2017 T163854 (duration: 01m 00s)
  • 18:23 thcipriani@naos: Synchronized dblists/commonsuploads.dblist: SWAT: Enable local uploads on knwiki T133137 (duration: 01m 06s)
  • 18:16 ema: start varnish-frontend on cp2014
  • 18:14 jynus: running alter table on all wikis of s3 T163912
  • 17:49 jynus: rebooting es1019 for upgrading and to fix race condition on services
  • 17:46 elukey: restart nutcracker on the eqiad mw hosts to pick up the new shard config (spamming elasticsearch memcached and triggering alarms)
  • 17:44 elukey: unmasking and starting daemons on restbase-dev1003
  • 17:41 reedy@naos: Synchronized wmf-config/InitialiseSettings.php: touch (duration: 01m 23s)
  • 17:02 mobrovac@naos: Started restart [trending-edits/deploy@7112062]: Restart for ICU lib update
  • 17:01 mobrovac@naos: Started restart [mobileapps/deploy@5c2b9a9]: Restart for ICU lib update
  • 17:00 mobrovac@naos: Started restart [mathoid/deploy@7eb4092]: Restart for ICU lib update
  • 16:43 mobrovac@naos: Started restart [electron-render/deploy@9156760]: Restart for ICU lib update
  • 16:39 mobrovac@naos: Started restart [graphoid/deploy@128206b]: Restart for ICU lib update
  • 16:37 mobrovac@naos: Started restart [eventstreams/deploy@05bcc8f]: Restart for ICU lib update
  • 16:37 mobrovac@naos: Started restart [electron-render/deploy@9156760]: Restart for ICU lib update
  • 16:36 mobrovac@naos: Started restart [cxserver/deploy@6899032]: Restart for ICU lib update
  • 16:34 mobrovac@naos: Started restart [citoid/deploy@b8c4cb2]: Restart for ICU lib update
  • 16:14 elukey: stop and mask cassandra and restbase on restbase-dev1003 for row-d maintenance
  • 16:07 _joe_: disabled and masked strongswan, memcached, redis on mc1013-17 for decommissioning
  • 15:43 XioNoX: VRRP priority removed, interfaces cr2/asw2 renamed - T148506
  • 15:40 _joe_: shutting down conf1003 T148506
  • 15:33 XioNoX: "cr2-eqiad# delete interfaces ae4 disable" done, confirmed links and LACP are up - T148506
  • 15:33 XioNoX: "cr2-eqiad# delete interfaces ae4 disable" done, confirmed links and LACP are up
  • 15:24 marostegui: Shutdown es2019 for maintenance with papaul and Dell - T149526
  • 15:12 XioNoX: switch ports for rack D7 and D8 configured - T148506
  • 14:47 marostegui: Stop MySQL db1070 (just in case) to test drac cold restart
  • 14:47 bblack@neodymium: conftool action : set/pooled=no; selector: dc=eqiad,cluster=cache_upload,name=cp107[1234].eqiad.wmnet
  • 14:26 elukey: depooling aqs100[69] from AQS for network maintenance
  • 14:20 elukey: stop zookeeper on conf1003 for row-d maintenance (Hadoop, Kafka related)
  • 14:04 XioNoX: "cr2-eqiad# set interfaces ae4 disable" done, (1 ping loss) - T148506
  • 14:00 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1026, depool db1045 - T162539 T163548 (duration: 00m 53s)
  • 13:59 XioNoX: lowered VRRP priority for T148506
  • 13:58 andrewbogott: put labservices1001 into downtime to minimize (but probably not totally eliminate) alert spam
  • 13:56 andrewbogott: disabled instance creation on Horizon via https://gerrit.wikimedia.org/r/#/c/350414/ and on wikitech via a strategic edit in extensions/OpenStackManager/special/SpecialNovaInstance.php
  • 13:56 godog: downtime and poweroff ms-be 21 26 27 37 38 39 before switch relocation - T148506
  • 13:54 gehel: downtime "ElasticSearch health check for shards" checks for logstash and elasticsearch eqiad - T148506
  • 13:53 elukey: stop kafka on kafka1020 and kafka1018 for row-d extended maintenance (D2)
  • 13:44 _joe_: shutting down mc1013-18 for row D maintenance
  • 13:40 aude@naos: Synchronized wmf-config/CommonSettings-labs.php: (no justification provided) (duration: 00m 57s)
  • 13:32 aude@naos: Synchronized wmf-config/Wikibase-production.php: disable tabular-data for now on wikidata and enable echo notification on test wikis (duration: 01m 06s)
  • 13:29 marostegui: Deploy alter table on db1069 (wikidatawiki) https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 13:27 marostegui: Deploy alter table labsdb1001 https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 13:23 marostegui: Deploy alter table db1045 - https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 13:22 elukey: restart HDFS on analytics100[12] (Hadoop master nodes) to pick up recent topology changes for the cluster
  • 13:10 aude@naos: Synchronized wmf-config/throttle.php: (no justification provided) (duration: 01m 23s)
  • 13:02 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp2014.codfw.wmnet,service=varnish-be
  • 13:00 ema: cp2017: restart varnish-be
  • 12:56 marostegui: Shutdown db1092 for maintenance - https://phabricator.wikimedia.org/T162681
  • 12:55 gehel: restart elasticsearch on relforge1001 to validate new config - T161830
  • 12:46 moritzm: installing mysql security updates (5.5 as packaged in Debian jessie)
  • 12:43 ema@neodymium: conftool action : set/pooled=no; selector: name=cp2014.codfw.wmnet,service=varnish-be
  • 11:32 jynus: applying new events_coredb_slave.sql on db2055 T160984
  • 11:31 moritzm: rebooting mwlog2001 for update to Linux 4.9
  • 10:47 ladsgroup@naos: Synchronized wmf-config/Wikibase-labs.php: T142104, part II (duration: 00m 56s)
  • 10:45 ladsgroup@naos: Synchronized static/images/wikibase/echoIcon.svg: T142104, part I (duration: 01m 04s)
  • 10:44 marostegui: Deploy alter table on s5, on db1063 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166
  • 10:39 jynus@naos: Synchronized wmf-config/db-eqiad.php: switch s5 eqiad master from db1049 to db1063 (duration: 01m 24s)
  • 09:48 jynus: migrating s5 eqiad replicas under db1063
  • 09:42 jynus: restarting mariadb at db1063
  • 09:24 marostegui: Shutdown db1094, db1093, db1091 for maintenance - T162681
  • 09:16 marostegui: Shutdown es1019 for maintenance - T162681
  • 08:32 elukey: Gracefully stopping hadoop daemons on Hadoop nodes affected by Row-D maintenance
  • 08:30 marostegui: Deploy alter table on change_tag and tag_summary on silver and labtestweb2001 - T147166
  • 08:27 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Depool hosts that need to be moved for the network maintenance - T162681 (duration: 02m 25s)
  • 08:22 moritzm: reimaging terbium to jessie
  • 07:59 jynus: shutting down mariadb on db1040 as a backup before decommissioning
  • 07:48 marostegui: Deploy alter table on s1, on db1052 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166
  • 07:30 marostegui: Deploy alter table on s7, on db1062 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166
  • 07:24 marostegui: Deploy alter table on s4, on db1068 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166
  • 07:09 marostegui: Deploy alter table on s6, on db1061 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166
  • 06:56 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1071 - T162539 T163548 (duration: 02m 24s)
  • 06:45 marostegui: Deploy alter table on s2, on db1054 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166
  • 06:10 marostegui: Deploy alter table on s3, on db1075 (eqiad master) for tables: change_tag and tag_summary - T147166
  • 05:57 marostegui: Deploy alter table enwiki.revision on labsdb1011 - T132416
  • 00:20 catrope@naos: Synchronized php-1.29.0-wmf.21/extensions/Flow/modules/flow/ui/widgets/mw.flow.ui.ReplyWidget.js: T163749 (duration: 01m 24s)
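
The 19:22 entry above quotes "30 minute spacing, 10 hosts, ~5h to completion" for the rolling varnish backend restart. The arithmetic behind that estimate can be sketched as follows, assuming hosts are restarted one at a time with a fixed gap between starts. The function name `rolling_restart_window` is hypothetical, and the log does not say how long each individual restart takes.

```python
from datetime import timedelta
from typing import Tuple


def rolling_restart_window(hosts: int, spacing: timedelta) -> Tuple[timedelta, timedelta]:
    """Return (start offset of the last restart, hosts * spacing).

    Both values are measured from the moment the first restart begins.
    Assumes one host per slot and a constant spacing between slots.
    """
    if hosts < 1:
        raise ValueError("hosts must be at least 1")
    last_start = spacing * (hosts - 1)   # the first host starts at offset zero
    full_window = spacing * hosts        # allows one full slot for the last host
    return last_start, full_window


last_start, full_window = rolling_restart_window(10, timedelta(minutes=30))
print(last_start)   # 4:30:00
print(full_window)  # 5:00:00
```

Under these assumptions the last host begins its restart 4.5 hours in, and allowing it a full 30-minute slot gives the 5 hours mentioned in the entry.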

2017-04-25

  • 22:24 mutante: mediawiki maintenance servers: last log entry was _before_ merging https://gerrit.wikimedia.org/r/#/c/342777/ and making a change
  • 22:23 andrewbogott: re-enabling dns on labservices1001
  • 22:22 mutante: mediawiki maintenance servers: making wasat identical to terbium. wasat is currently the active server running crons. no change there at all. on terbium where crons are inactive, some log files were removed
  • 22:13 twentyafterfour@naos: rebuilt wikiversions.php and synchronized wikiversions files: group0 wikis to 1.29.0-wmf.21
  • 22:08 madhuvishy: Reenabled labs instance creation and deletion on horizon
  • 22:05 twentyafterfour@naos: Finished scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733 (attempt #5) (duration: 21m 52s)
  • 22:02 andrewbogott: causing an intentional outage of labs-ns0 and labs-recursor0 to make sure we're properly girded for tomorrow's switch replacement.
  • 21:43 twentyafterfour@naos: Started scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733 (attempt #5)
  • 21:41 twentyafterfour@naos: scap failed: CalledProcessError Command 'cp -r "/tmp/scap_l10n_66989801"/* "/srv/mediawiki-staging/php-1.29.0-wmf.21/cache/l10n"' returned non-zero exit status 1 (duration: 03m 38s)
  • 21:38 twentyafterfour@naos: Started scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733 (attempt #4)
  • 21:33 twentyafterfour@naos: scap failed: CalledProcessError Command 'cp -r "/tmp/scap_l10n_930292683"/* "/srv/mediawiki-staging/php-1.29.0-wmf.21/cache/l10n"' returned non-zero exit status 1 (duration: 03m 46s)
  • 21:30 twentyafterfour@naos: Started scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733 (attempt #3)
  • 21:23 twentyafterfour@naos: scap failed: CalledProcessError Command 'cp -r "/tmp/scap_l10n_2414756836"/* "/srv/mediawiki-staging/php-1.29.0-wmf.21/cache/l10n"' returned non-zero exit status 1 (duration: 00m 54s)
  • 21:23 twentyafterfour@naos: Started scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733 (attempt #2)
  • 21:09 twentyafterfour@naos: scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_3498979833" --threads=30 --lang en --quiet' returned non-zero exit status 1 (duration: 01m 56s)
  • 21:07 twentyafterfour@naos: Started scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733
  • 20:00 madhuvishy: Labs instance creation and deletion on horizon temporarily disabled via https://gerrit.wikimedia.org/r/350266
  • 19:50 demon@naos: Synchronized wmf-config/CommonSettings-labs.php: no-op, beta change (duration: 01m 58s)
  • 18:55 chasemp: restart nova-fullstack on labnet1001
  • 18:50 chasemp: downtime labservices1001 as we fail away from it and puppet staleness on labservices1002
  • 18:38 andrewbogott: disabling nova-api for another try at labservices failover
  • 18:33 twentyafterfour: Deployment Train: Branching mediawiki wmf/1.29.0-wmf.21 from master refs T161733
  • 17:36 jynus: running test schema change on etwiki on eqiad (depooled) T17441
  • 17:35 RainbowSprinkles: gerrit: Quick reboot to pick up new bouncycastle library
  • 17:25 arlolra: Updated Parsoid to 55b90511 (T153885, T163330, T89262, T154709, T162919, T161306)
  • 17:20 moritzm: rebooting ruthenium for update to Linux 4.9
  • 17:19 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 07s)
  • 17:19 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 17:18 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 05s)
  • 17:18 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 17:18 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 08s)
  • 17:18 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 17:18 arlolra@naos: Finished deploy [parsoid/deploy@719d7bd]: Updating Parsoid to 55b90511 (duration: 08m 02s)
  • 17:17 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 07s)
  • 17:17 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 17:11 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 02m 18s)
  • 17:09 arlolra@naos: Started deploy [parsoid/deploy@719d7bd]: Updating Parsoid to 55b90511
  • 17:08 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 16:54 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 25s)
  • 16:53 godog: flush wikiwix cache from planet2001 and rebuild files
  • 16:53 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 16:53 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 07s)
  • 16:53 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 16:50 andrewbogott: labservices failover aborted due to cryptic routing/firewall issue
  • 16:45 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2255.codfw.wmnet,service=apache2
  • 16:44 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: enable wildcard topic config (duration: 00m 20s)
  • 16:44 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: enable wildcard topic config
  • 16:42 godog: flush wikiwix cache from planet1001 and rebuild files
  • 16:41 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2255.codfw.wmnet,service=apache2
  • 16:41 akosiaris@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2256.codfw.wmnet,service=apache2
  • 16:40 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet,service=apache2
  • 16:38 andrewbogott: stopping nova-api for labservices switchover
  • 16:36 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: enable wildcard topic config (duration: 00m 53s)
  • 16:35 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: enable wildcard topic config
  • 16:29 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: enable wildcard topic config (duration: 00m 04s)
  • 16:29 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: enable wildcard topic config
  • 16:18 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 06s)
  • 16:17 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 16:09 thcipriani@naos: Synchronized README: test new scap version (duration: 01m 03s)
  • 15:59 akosiaris: restart pybal on lvs[2001-2002].codfw.wmnet,lvs[3001-3002].esams.wmnet,lvs[4001-4002].ulsfo.wmnet,lvs[1001-1002].wikimedia.org T159687
  • 15:50 moritzm: installing libav security updates
  • 15:48 bawolff@naos: Synchronized wmf-config/CommonSettings-labs.php: Test account creation limits on labs (duration: 01m 14s)
  • 15:47 akosiaris: restart pybal on lvs2003.codfw.wmnet,lvs3003.esams.wmnet,lvs4003.ulsfo.wmnet,lvs1003.wikimedia.org T159687
  • 15:46 marostegui: Stop replication on db1086 and db1094 in sync - https://phabricator.wikimedia.org/T130067
  • 15:36 mobrovac@naos: Finished deploy [changeprop/deploy@7521b2f]: Bring back the concurrency level - T163292 (duration: 01m 13s)
  • 15:35 mobrovac@naos: Started deploy [changeprop/deploy@7521b2f]: Bring back the concurrency level - T163292
  • 15:33 jynus: stopping replication on dbstore1001 to change its replication topology
  • 15:33 akosiaris: restart pybal on lvs[2004-2006].codfw.wmnet,lvs3004.esams.wmnet,lvs4004.ulsfo.wmnet,lvs[1004-1006].wikimedia.org T159687
  • 15:28 filippo@neodymium: conftool action : set/pooled=yes; selector: name=mw2017.codfw.wmnet
  • 15:27 mobrovac@naos: Finished deploy [changeprop/deploy@e0e3684]: Bring back the concurrency level - T163292 (duration: 00m 10s)
  • 15:26 mobrovac@naos: Started deploy [changeprop/deploy@e0e3684]: Bring back the concurrency level - T163292
  • 15:18 ema: start cache_text upgrade to linux 4.9 T162029
  • 15:14 marostegui: Deploy alter table s7 on watchlist table directly on the master (db1062) - https://phabricator.wikimedia.org/T130067
  • 15:14 filippo@neodymium: conftool action : set/pooled=no; selector: name=mw2017.codfw.wmnet
  • 14:59 jynus@naos: Synchronized wmf-config/db-eqiad.php: switch s7 eqiad master from db1041 to db1062 (duration: 00m 54s)
  • 14:54 bblack: upgrading nginx on cp1008
  • 14:30 bawolff@naos: Synchronized private/PrivateSettings.php: rv change to T163477 to see if it fixes logging (duration: 01m 14s)
  • 14:27 bawolff: Logging seems to have stopped after last deploy to private settings :(
  • 14:20 bblack: uploaded WMF nginx-1.11.10-1+wmf1 packages to jessie-wikimedia repo
  • 14:17 marostegui: Stop replication in sync on db1089 and db1083 for maintenance - https://phabricator.wikimedia.org/T130067
  • 14:08 jynus: restarting mariadb on db1062
  • 14:07 jynus: moving s7 eqiad replicas under db1062
  • 14:02 godog: poweroff ms-be1016 for controller swap - T150206
  • 14:02 bawolff@naos: Synchronized wmf-config/PrivateSettings.php: Hopefully cause previous changes to be picked up try2 (duration: 00m 44s)
  • 13:58 bawolff@naos: Synchronized wmf-config/PrivateSettings.php: Hopefully cause previous changes to be picked up (duration: 00m 44s)
  • 13:51 hashar: European SWAT complete
  • 13:49 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Re-enable ContentTranslation - T163344 (duration: 00m 44s)
  • 13:37 hashar@naos: Synchronized php-1.29.0-wmf.20/includes/media/TransformationalImageHandler.php: media: Capture stderr when running convert --version - T158649 (duration: 00m 47s)
  • 13:35 moritzm: rebooting einsteinium for update to Linux 4.9
  • 13:31 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Fix namespace Wikipedia_talk for zh_classicalwiki - T162547 (duration: 00m 48s)
  • 13:24 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Two namespace aliases for zh_classicalwiki - T162547 (duration: 00m 49s)
  • 13:22 marostegui: Deploy alter table on s3 (only etwiki) for tag_summary and change_tag tables - T147166
  • 13:20 hashar@naos: Synchronized php-1.29.0-wmf.20/includes: Fix bogus field reference in Category::getCountMessage() callback - T162941 (duration: 01m 14s)
  • 13:16 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Add NS aliases for zh_classicalwiki - T162547 (duration: 01m 00s)
  • 13:15 marostegui: Deploy alter table on silver.watchlist and labtestweb2001.labtestwiki for the watchlist table - T130067
  • 13:12 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Add Draft namespace to zh_classicalwiki - T163655 (duration: 01m 19s)
  • 13:10 hashar: zh_classicalwiki : renamed broken page via namespaceDupes.php : id=73504 ns=0 dbk=模板:Protected_logo -> 模板:Protected_logobroken
  • 12:35 marostegui: Stop replication in sync on db1092 and db1087 for maintenance - https://phabricator.wikimedia.org/T130067
  • 11:57 gehel: banning elasticsearch row D node in preparation for maintenance
  • 11:46 marostegui: Deploy alter table s5 on watchlist table directly on the master (db1049) - https://phabricator.wikimedia.org/T130067
  • 11:28 jynus@naos: Synchronized wmf-config/db-eqiad.php: Depool db1022, promote db1061 as the s6 eqiad master (duration: 01m 17s)
  • 11:27 marostegui: Deploy alter table s1 on watchlist table directly on the master (db1052) - https://phabricator.wikimedia.org/T130067
  • 11:01 jynus: switching eqiad s6 master to db1061
  • 10:45 jynus: stopping replication on db1050
  • 10:39 marostegui: Stop replication in sync on db1090 and db1076 for maintenance - https://phabricator.wikimedia.org/T130067
  • 10:15 jynus: restarting db1061's mysql process
  • 10:12 jynus: moving all slaves of s6 eqiad under db1061
  • 09:49 marostegui: Stop replication in sync on db1091 and db1084 for maintenance - T130067
  • 09:46 marostegui: Deploy alter table s2 on watchlist table directly on the master (db1054) - T130067
  • 09:10 jynus@naos: Synchronized wmf-config/db-eqiad.php: Promote db1054 as the new s2 master on eqiad (duration: 01m 19s)
  • 08:56 marostegui: Stop replication on db1088 and db1093 in sync - T130067
  • 08:53 jynus: stopping replication on s2-eqiad and restarting db1054
  • 08:52 marostegui: Deploy alter table s4 commonswiki.watchlist directly on db1068 (eqiad master) - T130067
  • 08:24 marostegui: Stop MySQL db1041 (eqiad master) to reclone db1062 from it - T163665
  • 08:03 jynus: moving all slaves of s2 eqiad under db1054
  • 07:14 ema: upgrade cp3033 varnish-be to varnish 4.1.5-1wm2, expiry thread lock/priority workaround T145661
  • 06:34 marostegui: Deploy alter table on s3, all the wikis to the watchlist table on db1075, eqiad master - T130067
  • 06:10 marostegui@naos: Synchronized wmf-config/db-codfw.php: Restore db2061 original weight (duration: 00m 57s)
  • 06:06 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1071, depool db1026 - T162539 T163548 (duration: 01m 17s)
  • 05:41 marostegui: Deploy alter table enwiki.revision on labsdb1009 and labsdb1010 - T132416
  • 02:22 bawolff: deployed patch for T163477
  • 01:42 MaxSem: Deployed security patches for T163166
  • 00:53 bawolff: unconfirming emails associated with T163477
  • 00:38 mutante: ocg1001 - powercycle into installer, was sitting at partman step with "failure to read from sda"...
  • 00:25 twentyafterfour: restarted apache2 on iridium to tune rate limiting value
  • 00:16 twentyafterfour@naos: Synchronized wmf-config/CommonSettings.php: fix "Notice: Undefined variable: wmgRelatedArticlesFooterWhitelistedSkins" (duration: 01m 11s)

2017-04-24

  • 23:41 twentyafterfour@naos: Synchronized wmf-config/: deploy https://gerrit.wikimedia.org/r/#/c/348472/ refs T163114 (duration: 01m 05s)
  • 23:22 ejegg: updated civicrm from 40d88c0 to 061cd61
  • 23:08 ejegg: updated civicrm from a11c108 to 40d88c0
  • 22:46 bawolff: deploy patch for T155277
  • 21:53 hoo: Updated the sites and site_identifiers tables on all Wikidata clients for dtywiki T161529.
  • 21:41 ejegg: updated civicrm from 51dbbad to a11c108
  • 19:52 mattflaschen@naos: Finished scap: Full scap (due to ORES i18n change earlier), plus additional $wgHiddenPrefs change (duration: 17m 06s)
  • 19:35 mattflaschen@naos: Started scap: Full scap (due to ORES i18n change earlier), plus additional $wgHiddenPrefs change
  • 19:10 bblack: cp2026: restart to wm2 varnish package
  • 18:42 thcipriani@naos: Synchronized wmf-config/throttle.php: SWAT: New throttle rule T163726 (duration: 01m 03s)
  • 18:19 thcipriani@naos: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove defunct $wgForeignUploadTestEnabled for cross-wiki upload A/B test (duration: 00m 53s)
  • 18:18 jynus: disabling mysql replication eqiad -> codfw on s[1-7] and x1 shards T155099
  • 18:10 thcipriani@naos: Synchronized wmf-config/CommonSettings-labs.php: SWAT: Full path to xvfb-run (beta only change) (duration: 01m 07s)
  • 17:53 marostegui@naos: Synchronized wmf-config/db-codfw.php: Increase db2061 weight (duration: 00m 47s)
  • 17:46 marostegui: Alter table labtestwiki.user_groups on labtestweb2001 - T155605
  • 17:43 bblack: installing varnish 4.1.5-1wm2 on all cache_upload hosts @ codfw (no restarts)
  • 17:41 marostegui@naos: Synchronized wmf-config/db-codfw.php: Increase db2043 and db2061 weight (duration: 00m 49s)
  • 17:36 demon@naos: Synchronized dblists/group0.dblist: moving labtestwiki to group0 (duration: 00m 54s)
  • 17:35 bblack: upgrade cp2024 varnish-be to varnish 4.1.5-1wm2, expiry thread lock/priority workaround T145661
  • 17:28 marostegui@naos: Synchronized wmf-config/db-codfw.php: Increase db2043 and db2061 weight - T163339 (duration: 00m 58s)
  • 17:19 gehel: restarting wdqs-updater for new configuration
  • 17:10 gehel@naos: Finished deploy [wdqs/wdqs@481346a]: (no justification provided) (duration: 01m 47s)
  • 17:08 gehel@naos: Started deploy [wdqs/wdqs@481346a]: (no justification provided)
  • 16:58 marostegui@naos: Synchronized wmf-config/db-codfw.php: Repool db2043 and db2061 with less weight - T163339 (duration: 01m 16s)
  • 16:56 godog: poweroff prometheus2004 for memory upgrade - T163386
  • 16:11 ema: upgrade cp2017 varnish-be to varnish 4.1.5-1wm2, expiry thread lock/priority workaround T145661
  • 15:44 jynus: stopping all slaves on dbstore1001 for maintenance
  • 15:44 godog: poweroff prometheus2003 for memory upgrade - T163386
  • 15:28 mattflaschen@naos: Synchronized wmf-config/CommonSettings.php: T163696: Only copy filter thresholds if they are set (duration: 01m 10s)
  • 15:10 matt_flaschen: GuidedTour/RCFilters/ORES deployment complete and tested
  • 15:09 XioNoX: disabling the bgp session between pfw-codfw and cr2 for T163447
  • 15:07 ema: varnish 4.1.5-1wm2 uploaded to apt.w.o T145661
  • 15:06 matt_flaschen: Preference updates (for ORES on enwiki) done, using naos instead of terbium
  • 14:54 mattflaschen@naos: Synchronized php-1.29.0-wmf.20/extensions/ORES: Make the preference for the "r" flag on the RC page also control highlighting (duration: 00m 48s)
  • 14:50 mattflaschen@naos: Synchronized wmf-config/: Release RC Filters on more wikis and prep changes for that (duration: 00m 53s)
  • 14:39 matt_flaschen: Deployment of T152827 ("Enable GuidedTour on all wikis") complete and tested
  • 14:38 Dereckson: Created linter table on ptwikimedia and dtywiki
  • 14:34 mattflaschen@naos: Synchronized wmf-config/InitialiseSettings.php: Enable GuidedTour on all wikis (duration: 00m 59s)
  • 14:27 marostegui: Deploy alter table on s3 etwiki on watchlist table directly on the master (db1075) - T130067
  • 14:17 marostegui: Stop MySQL db2043 and db2061 for maintenance - https://phabricator.wikimedia.org/T163339
  • 14:14 marostegui@naos: Synchronized wmf-config/db-codfw.php: Depool db2043 and db2061 - T163339 (duration: 01m 08s)
  • 14:14 moritzm: rebooting ms1001 for kernel update to Linux 4.9
  • 14:10 hashar@naos: Finished scap: Full scap for namespaces related changes (T161529 and https://gerrit.wikimedia.org/r/#/c/349864/1) (duration: 16m 06s)
  • 14:09 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp2002.codfw.wmnet,service=varnish-be
  • 14:08 ema: re-pooling cp2002's varnish-be with increased priority for expiry thread T145661
  • 13:57 ema@neodymium: conftool action : set/pooled=no; selector: name=cp2002.codfw.wmnet,service=varnish-be
  • 13:54 hashar@naos: Started scap: Full scap for namespaces related changes (T161529 and https://gerrit.wikimedia.org/r/#/c/349864/1)
  • 13:50 addshore: Initial run of populateCognatePages.php complete. 27,595,121 rows in cognate_pages & 17,263,411 in cognate_titles
  • 13:49 godog: swift eqiad-prod: more weight on ms-be1028 -> ms-be1039 - T160640
  • 13:47 elukey: reimage analytics1003 to Jessie (Oozie/Hive/Camus not available during this timeframe in the Analytics Hadoop cluster)
  • 13:47 marostegui: Deploy unscheduled alter table on silver (labswiki.user_groups) - T159416
  • 13:26 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Enable user group expiry in production - T159416 (duration: 00m 49s)
  • 13:16 marostegui: Remove replication codfw - eqiad on s3 (db2018 codfw master will not be a slave of eqiad master) - https://phabricator.wikimedia.org/T130067 https://phabricator.wikimedia.org/T147166 T162133
  • 13:14 hashar@naos: Synchronized php-1.29.0-wmf.20/extensions/ProofreadPage/ProofreadPage.namespaces.php: Fix language code for Norwegian (duration: 00m 54s)
  • 13:12 marostegui: Deploy alter table on wikidatawiki.wb_terms on db1082 - T162539 - T163548
  • 13:11 marostegui: Deploy alter table on wikidatawiki.wb_terms on db1063 - T162539 https://phabricator.wikimedia.org/T163548
  • 13:10 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Make sysops able to grant/remove confirmed user group at cswiki - T163206 (duration: 00m 55s)
  • 13:09 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Raise autoconfirmed status requirements to 4 days, 10 edits at cswiki - T163207 (duration: 01m 09s)
  • 13:06 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Set timezone to Asia/Kolkata on wb.wikimedia - T163322 (duration: 00m 44s)
  • 13:05 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Remove all feeds added in T127176 from RSS whitelist for mw.org - T163217 (duration: 00m 45s)
  • 13:03 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Enable NewUserMessage on zh_classicalwiki - T163043 (duration: 00m 46s)
  • 12:52 aude@naos: Synchronized wmf-config/Wikibase-production.php: Disable use of new column in wb_terms table for now (duration: 00m 48s)
  • 12:46 aude@naos: Synchronized wmf-config/Wikibase-production.php: (no justification provided) (duration: 00m 47s)
  • 12:41 Dereckson: pt.wikimedia.org and dty.wikipedia.org wikis creation done
  • 12:38 dereckson@naos: Synchronized wmf-config/interwiki.php: +dty +wmpt and other fixes (duration: 00m 48s)
  • 12:28 Dereckson: mwscript extensions/WikimediaMaintenance/filebackend/setZoneAccess.php dtywiki --backend=local-multiwrite (T162874)
  • 12:14 dereckson@naos: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for dty.wikipedia (T161529) (duration: 00m 49s)
  • 12:13 dereckson@naos: Synchronized langlist: +dty (T161529) (duration: 00m 50s)
  • 12:09 dereckson@naos: rebuilt wikiversions.php and synchronized wikiversions files: +dtywiki
  • 12:08 Dereckson: Create dtywiki database (T161529)
  • 12:08 dereckson@naos: Synchronized dblists: +dtywiki (duration: 00m 56s)
  • 12:07 dereckson@naos: Synchronized static/images/project-logos/: Logo for dty.wikipedia (T161529) (duration: 01m 13s)
  • 11:59 Dereckson: Purged https://pt.wikimedia.org/ URL (T126832)
  • 11:55 dereckson@naos: Synchronized multiversion/MWMultiVersion.php: Entry point for pt.wikimedia.org (T126832) (duration: 00m 44s)
  • 11:50 Dereckson: mwscript extensions/WikimediaMaintenance/filebackend/setZoneAccess.php ptwikimedia --backend=local-multiwrite (T126832)
  • 11:48 dereckson@naos: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for pt.wikimedia (T126832)
  • 11:42 dereckson@naos: rebuilt wikiversions.php and synchronized wikiversions files: +pt.wikimedia (T126832)
  • 11:42 dereckson@naos: Synchronized dblists/: Respawn pt.wikimedia configuration (duration: 00m 44s)
  • 11:41 Dereckson: Recreate database for ptwikimedia (T126832)
  • 11:28 dereckson@naos: Synchronized php-1.29.0-wmf.20/languages/messages/MessagesDty.php: Localize namespaces in Doteli (T162872) (duration: 00m 50s)
  • 11:27 dereckson@naos: Synchronized php-1.29.0-wmf.20/extensions/Gadgets/Gadgets.namespaces.php: Localize namespaces in Doteli (T162873) (duration: 00m 44s)
  • 11:26 dereckson@naos: Synchronized php-1.29.0-wmf.20/extensions/Scribunto/Scribunto.namespaces.php: Localize namespaces in Doteli (T162874) (duration: 00m 46s)
  • 11:16 addshore: addshore@wasat:~$ mwscriptwikiset extensions/Cognate/maintenance/populateCognatePages.php wiktionary.dblist --batch-size=1000
  • 11:14 addshore@naos: Synchronized wmf-config/InitialiseSettings-labs.php: Deploy Cognate to production wiktionaries T150182 PT 4/4 (duration: 00m 47s)
  • 11:12 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: Deploy Cognate to production wiktionaries T150182 PT 3/4 (touched) (duration: 00m 52s)
  • 11:02 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: Deploy Cognate to production wiktionaries T150182 PT 3/4 (duration: 00m 57s)
  • 11:01 addshore@naos: Synchronized wmf-config/CommonSettings-labs.php: Deploy Cognate to production wiktionaries T150182 PT 2/4 (duration: 01m 01s)
  • 10:57 addshore@naos: Synchronized wmf-config/CommonSettings.php: Deploy Cognate to production wiktionaries T150182 PT 1/4 (duration: 01m 18s)
  • 10:28 addshore: addshore@wasat:~$ mwscriptwikiset extensions/Cognate/maintenance/populateCognatePages.php wiktionary.dblist
  • 10:27 addshore: 180 rows added to cognate_titles & cognate_pages
  • 10:25 addshore: addshore@wasat:~$ mwscript extensions/Cognate/maintenance/populateCognatePages.php zawiktionary
  • 10:25 addshore: 172 sites added to cognate_sites
  • 10:24 addshore: addshore@wasat:~$ mwscript extensions/Cognate/maintenance/populateCognateSites.php enwiktionary --site-group=wiktionary
  • 10:16 addshore@naos: Finished scap: Add Cognate to extension-list T150182 (duration: 15m 26s)
  • 10:01 addshore@naos: Started scap: Add Cognate to extension-list T150182
  • 10:00 jynus: disabling puppet on app servers for apache config deploy T126832
  • 09:56 addshore@naos: Synchronized wmf-config/InitialiseSettings-labs.php: wmgUseInterwikiSorting true for wiktionaries PT 2/2 (duration: 00m 46s)
  • 09:54 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: wmgUseInterwikiSorting true for wiktionaries PT 1/2 (duration: 00m 47s)
  • 09:51 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: Configure InterwikiSorting orders for Wiktionaries PT 2/2 (duration: 00m 48s)
  • 09:50 addshore@naos: Synchronized wmf-config/InterwikiSortOrders.php: Configure InterwikiSorting orders for Wiktionaries PT 1/2 (duration: 00m 53s)
  • 09:49 jynus: testing mediawiki changes on mwdebug1001
  • 09:44 addshore@naos: Synchronized docroot/noc/conf/InterwikiSortOrders.php.txt: NOOP Add InterwikiSortOrders to noc docroot (docs only) (duration: 01m 00s)
  • 09:42 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: Use group0 to reduce lines for WMDE related config settings (duration: 01m 18s)
  • 09:15 marostegui: Stop MySQL on db1062 to back up its mysql - T163665
  • 09:14 jynus: dropping ptwikimedia from es1012,es1016,es1018,es2011,es2012,es2013, T126832
  • 09:11 jynus: dropping ptwikimedia from es3 T126832
  • 09:08 jynus: dropping ptwikimedia from es2 T126832
  • 09:04 jynus: dropping ptwikimedia from x1 T126832
  • 08:55 jynus: dropping ptwikimedia from s3 T126832
  • 08:03 marostegui: Deploy alter table enwiki.revision on db1095 (sanitarium2) - T132416
  • 07:34 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1080 and db1067 (duration: 01m 18s)
  • 06:23 marostegui: Deploy alter table enwiki.revision db1052 (eqiad master) - T132416
  • 06:12 marostegui: Deploy alter table on wikidatawiki.wb_terms on db1087 - https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 06:12 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1092, depool db1087 - T162539 T163548 (duration: 02m 19s)

2017-04-23

  • 19:13 ema: cp2020: restart varnish-be
  • 17:49 jynus: disabling puppet on db2062 and upgrading MariaDB package to 10.1 T116557
  • 03:12 andrewbogott: removing files in /srv/deployment/ocg/postmortem on ocg1003, another case of T162780

2017-04-22

  • 13:41 ema@neodymium: conftool action : set/pooled=no; selector: name=cp2024.codfw.wmnet,service=varnish-be
  • 07:53 jynus: restarting es2019.codfw.wmnet after upgrade
  • 07:43 jynus: powercycling es2019.codfw.wmnet, unresponsive
  • 07:21 jynus@naos: Synchronized wmf-config/db-codfw.php: Depool es2019 (duration: 02m 16s)
  • 03:21 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2024.codfw.wmnet,service=varnish-be
  • 02:56 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2024.codfw.wmnet,service=varnish-be
  • 02:18 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2002.codfw.wmnet,service=varnish-be
  • 00:34 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2002.codfw.wmnet,service=varnish-be

2017-04-21

  • 23:52 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2026.codfw.wmnet,service=varnish-be
  • 22:49 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2026.codfw.wmnet,service=varnish-be
  • 15:06 marostegui@naos: Synchronized wmf-config/db-codfw.php: Increase weight db2071 (duration: 01m 17s)
  • 14:32 marostegui: Analyze revision, logging and page table on s1 db1067 - https://phabricator.wikimedia.org/T116557
  • 14:26 ema: ban objects with CT < 1024 on codfw cache_upload T145661
  • 14:00 moritzm: installing postgresql bugfix update from jessie point release on labsdb1004
  • 13:35 marostegui: Deploy alter table on wikidatawiki.wb_terms on db1092 - T162539 T163548
  • 13:20 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Depool db1092 - T162539 T163548 (duration: 01m 18s)
  • 12:51 akosiaris: reboot puppetmaster1002 for kernel upgrade
  • 12:07 marostegui: Analyze revision, logging and page table on s1 db1080 - T116557
  • 12:07 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Update db1080 depool reason (duration: 01m 18s)
  • 10:35 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1071 - T163109 (duration: 01m 20s)
  • 09:20 moritzm: rebooting etherpad1001 (running etherpad.wikimedia.org) for update to Linux 4.9
  • 09:10 jynus: stopping and upgrading/reconfiguring db2062 (depooled) T116557
  • 08:49 jynus@naos: Synchronized wmf-config/db-codfw.php: Depool db2062 (duration: 01m 20s)
  • 08:32 akosiaris: looking at tcpircbot (logmsgbot) problems at tegmen
  • 08:20 elukey: rolling restart of aqs (nodejs) on aqs* to pick up upgrades
  • 08:01 moritzm: rolling restart of hhvm on application servers in eqiad to pick up ICU security update
  • 07:47 marostegui: Stop MySQL on db1071 and db1063 to reclone db1063 - T163109
  • 07:43 moritzm: installing further icu security updates
  • 06:21 marostegui: Restart MySQL on db1065 for maintenance - T163351
  • 06:09 marostegui: Deploy alter table enwiki.revision db1067 - T132416

2017-04-20

  • 22:28 twentyafterfour: enable rate limiting in phabricator
  • 22:17 paravoid: setting tw_reuse to 1 on dbproxy1003
  • 21:47 twentyafterfour: started phd on iridium
  • 21:31 twentyafterfour: stopped phd on iridium to reduce load on the database
  • 19:26 Amir1: deploy finished
  • 19:24 Amir1: start of ladsgroup@naos:/srv/mediawiki-staging/php-1.29.0-wmf.20$ scap sync-file php-1.29.0-wmf.20/extensions/ORES/includes/Hooks.php 'Disable ORES in Recentchangeslinked (T163063)'
  • 19:15 mutante: test logging in fundraising channel
  • 19:06 mutante: fixing duplicate ircecho situation - since today it should run from tegmen, the active icinga server
  • 17:51 mutante: restarted icinga-wm (ircecho) to pick up config change
  • 17:13 jynus: stopping replication on db1040
  • 17:09 andrewbogott: disabling puppet on serpens, seaborgium, pollux, dubnium, labservices1001, labservices1002 for tentative rollout of https://gerrit.wikimedia.org/r/#/c/348920/
  • 16:58 jynus: moving GTID s4 eqiad replicas under db1068
  • 16:46 ema: repool varnish-be on cp2017
  • 16:18 ema: depool varnish-be on cp2017
  • 16:08 elukey: uploaded piwik 2.17.1-1 to jessie-wikimedia main
  • 15:17 Amir1: deleting duplicate rows in ores_classification dated after revision 775502802 (dated April 15th) (T163337)
  • 15:16 XioNoX: disabling pybal on lvs2002 for T163323
  • 14:32 moritzm: upgrading tor on radium to 0.2.9.10
  • 14:23 moritzm: rebooting radium (tor relay) for kernel update to Linux 4.9
  • 14:09 moritzm: rebooting osmium for kernel update to Linux 4.9
  • 14:06 gehel: rolling restart of kartotherian / tilerator on maps codfw cluster
  • 13:58 gehel: rolling restart of kartotherian / tilerator on maps eqiad cluster
  • 13:58 marostegui: Stop MySQL on db1068 and db1081 for maintenance - T163110
  • 13:57 jynus: running reset slave all on db2019
  • 13:53 gehel: rolling restart of kartotherian / tilerator on maps-test cluster
  • 13:18 moritzm: restarting hhvm on mw2097/2098 to pick up icu security update
  • 13:11 elukey: upgrading Piwik to 2.17.1 (brief downtime during the maintenance announced)
  • 12:12 elukey: restart Yarn Resource manager on analytics1001 (hadoop master) to pick up new JVM settings
  • 12:11 moritzm: installing icu security updates
  • 11:32 _joe_: removing hack for jobqueue's refreshlinks T163418 from the jobrunners
  • 11:23 jynus: changing db2071 to replicate from db2016
  • 10:32 moritzm: installing remaining dbus updates from jessie point update
  • 10:07 elukey: restart Yarn Resource manager on analytics1002 (hadoop master standby) to pick up new JVM settings
  • 09:47 Amir1: running the cleanup script for ores_classification in enwiki
  • 09:38 _joe_: live-hack redeployed, running scap pull on codfw jobrunners T163418
  • 09:38 _joe_: live-hack redeployed, running scap pull on codfw jobrunners
  • 09:34 hashar@naos: Synchronized rpc/RunJobs.php: Revert "rpc: raise exception instead of die" - causes monitoring spam (duration: 01m 20s)
  • 09:17 _joe_: removed the live hack, running scap pull again on mw2154
  • 09:14 _joe_: scap pull of live hack for T163418 on mw2154
  • 08:47 _joe_: live-patching ./includes/jobqueue/jobs/RefreshLinksJob.php to drop all recursive jobs, T163418
  • 07:59 jynus: shutting down db1080 for cloning and upgrade T163413
  • 07:54 jynus@naos: Synchronized wmf-config/db-codfw.php: Add db2071, depooled (duration: 00m 53s)
  • 07:53 jynus@naos: Synchronized wmf-config/db-eqiad.php: Depool db1080 (duration: 01m 02s)
  • 07:53 marostegui: Deploy alter table enwiki.revision db1065 - https://phabricator.wikimedia.org/T132416
  • 07:31 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T132416 (duration: 02m 18s)
  • 07:12 marostegui: Deploy alter table on s4.image on eqiad master db1040 (this will create lag on eqiad - all hosts have been silenced) - https://phabricator.wikimedia.org/T73563
  • 06:39 marostegui: Deploy alter table on s4.oldimage on eqiad master db1040 (this will create lag on eqiad - all hosts have been silenced) - T73563
  • 01:37 mutante: mw2150 - restarted hhvm (had 'thread leakage' alert)
  • 01:28 mutante: ran puppet on all (16) Dell R320 via cumin to add CPU frequency check
  • 00:37 ejegg: updated CiviCRM from 90d679b to 51dbbad

2017-04-19

  • 23:58 ejegg: updated payments-wiki from ccfbf98 to ee7d402
  • 22:37 papaul: OS installation on db2071
  • 21:44 ejegg: updated SmashPig from 17c56b0 to 200f63e
  • 21:37 krinkle@naos: Synchronized php-1.29.0-wmf.20/resources/src/startup.js: I34bbe8edf - Fix js fatal (duration: 01m 20s)
  • 20:08 ejegg: updated payments-wiki from 5398b23 to ccfbf98
  • 19:22 krinkle@naos: Synchronized php-1.29.0-wmf.20/resources/src/mediawiki/mediawiki.js: Ie50bdd (duration: 00m 58s)
  • 19:20 krinkle@naos: Synchronized php-1.29.0-wmf.20/extensions/WikimediaEvents/extension.json: T162604 (duration: 01m 20s)
  • 19:17 XenoRyet: Updated SmashPig from 3db064d to 17c56b0
  • 18:58 ejegg: rolled back payments-wiki to 5398b23
  • 18:56 ejegg: updated payments-wiki from 5398b23 to 68e3ac6
  • 18:27 ariel@naos: Finished deploy [dumps/dumps@ad621e6]: doc fixes thanks to awight (duration: 00m 04s)
  • 18:27 ariel@naos: Started deploy [dumps/dumps@ad621e6]: doc fixes thanks to awight
  • 18:25 ejegg: updated payments-wiki from 36f38f6 to 5398b23
  • 18:19 mobrovac: restbase stopping RB and disabling puppet on restbase1018 due to T163292
  • 18:18 ariel@naos: Finished deploy [dumps/dumps@101f8a4]: page range fixes and standalone scripts (duration: 00m 18s)
  • 18:18 ariel@naos: Started deploy [dumps/dumps@101f8a4]: page range fixes and standalone scripts
  • 17:27 Amir1: mwscript extensions/ORES/maintenance/CleanDuplicateScores.php on all wikis with ORES review tool enabled (T163337)
  • 17:26 thcipriani@naos: Synchronized docroot/noc/index.html: test scap on naos.codfw.wmnetdocroot/noc/index.html: trailing whitespace (duration: 02m 02s)
  • 17:25 mobrovac@naos: Started restart [restbase/deploy@1bfada4]: Restart to stop trying to connect to dead restbase1018 Cassandra instances - T163292
  • 17:08 thcipriani@naos.codfw.wmnet: test
  • 17:03 filippo@naos: Finished deploy [prometheus/jmx_exporter@7327459]: test deploy from naos (duration: 00m 03s)
  • 17:03 filippo@naos: Started deploy [prometheus/jmx_exporter@7327459]: test deploy from naos
  • 17:02 godog: bounce tcpircbot on einsteinium to pick up changes
  • 17:02 _joe_: manually running enwiki refreshLinks jobs to catch up a bit
  • 16:59 papaul: power balancing on mw2215
  • 16:58 Amir1: ladsgroup@naos:~$ mwscript extensions/ORES/maintenance/CleanDuplicateScores.php --wiki=enwiki froze
  • 16:49 Amir1: ladsgroup@naos:~$ mwscript extensions/ORES/maintenance/CleanDuplicateScores.php --wiki=enwiki (T163337)
  • 16:33 godog: deploy.fixurl on G@deployment_target:* after deployment server switchover
  • 16:20 gehel: disabling deprecation warning logs on elasticsearch eqiad - T163345
  • 16:19 jynus: setting db2033 as read write
  • 16:13 godog: run puppet on naos.codfw.wmnet - new deployment server
  • 16:03 gehel: disabling deprecation warning logs on elasticsearch codfw - T163345
  • 15:51 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=codfw,cluster=elasticsearch,name=elastic2020.*
  • 15:49 jynus: shutting down db2033 (x1-master)
  • 15:48 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=codfw,cluster=appserver,name=mw2256.*
  • 15:48 jynus@tin: Synchronized wmf-config/db-codfw.php: Failing over x1-master (duration: 00m 41s)
  • 15:46 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2020.codfw.wmnet
  • 15:42 jynus@tin: Synchronized wmf-config/InitialiseSettings.php: Disable cx_translation- it is causing an outage on x1 (duration: 02m 44s)
  • 15:40 dzahn@puppetmaster2001: conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet
  • 15:32 mutante: mw2256 went down and showed " PANIC: double fault, error_code: 0x0"
  • 15:16 jynus@tin: Synchronized wmf-config/db-codfw.php: Pool db2055 as an additional API server (duration: 01m 02s)
  • 15:11 _joe_: ran cumin 'R:class = role::mediawiki::jobrunner and *.eqiad.wmnet' 'systemctl reset-failed' manually
  • 15:07 godog: start swiftrepl on ms-fe1005 for codfw switchover
  • 15:04 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_restart_parsoid(eqiad, codfw) Successfully completed
  • 14:53 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2256.codfw.wmnet,service=apache2
  • 14:53 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2256.codfw.wmnet,service=nginx
  • 14:48 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet,service=nginx
  • 14:48 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet,service=apache2
  • 14:46 gehel: banning elastic2020 from codfw cluster - T149006
  • 14:46 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_restart_parsoid(eqiad, codfw) Rolling restart parsoid in eqiad and codfw
  • 14:44 oblivian@tin: Synchronized wmf-config/ProductionServices.php: Fix redis locks (duration: 02m 24s)
  • 14:41 akosiaris: powercycle mw2256
  • 14:33 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_tendril(eqiad, codfw) Successfully completed
  • 14:33 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_tendril(eqiad, codfw) Update Tendril configuration for the new masters
  • 14:33 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_start_maintenance(eqiad, codfw) Successfully completed
  • 14:31 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_start_maintenance(eqiad, codfw) Start MediaWiki maintenance in the new master DC
  • 14:31 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_restore_ttl(eqiad, codfw) Successfully completed
  • 14:31 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_restore_ttl(eqiad, codfw) Restore the TTL of all the MediaWiki discovery records
  • 14:30 switchdc: (volans@sarin) END TASK - switchdc.stages.t08_stop_mediawiki_readonly(eqiad, codfw) Successfully completed
  • 14:30 switchdc: (volans@sarin) MediaWiki read-only period ends at: 2017-04-19 14:30:05.678665
  • 14:30 root@tin: Synchronized wmf-config/db-codfw.php: Set MediaWiki in read-write mode in datacenter codfw (duration: 00m 18s)
  • 14:29 switchdc: (volans@sarin) START TASK - switchdc.stages.t08_stop_mediawiki_readonly(eqiad, codfw) Set MediaWiki in read-write mode (db_to config already merged and git pulled)
  • 14:28 switchdc: (volans@sarin) END TASK - switchdc.stages.t07_coredb_masters_readwrite(eqiad, codfw) Successfully completed
  • 14:28 switchdc: (volans@sarin) START TASK - switchdc.stages.t07_coredb_masters_readwrite(eqiad, codfw) set core DB masters in read-write mode
  • 14:25 switchdc: (volans@sarin) END TASK - switchdc.stages.t06_redis(eqiad, codfw) Successfully completed
  • 14:25 switchdc: (volans@sarin) START TASK - switchdc.stages.t06_redis(eqiad, codfw) Switch the Redis replication
  • 14:25 switchdc: (volans@sarin) END TASK - switchdc.stages.t05_switch_traffic(eqiad, codfw) Successfully completed
  • 14:22 switchdc: (volans@sarin) START TASK - switchdc.stages.t05_switch_traffic(eqiad, codfw) Switch traffic flow to the appservers in the new datacenter
  • 14:22 switchdc: (volans@sarin) END TASK - switchdc.stages.t05_switch_datacenter(eqiad, codfw) Successfully completed
  • 14:22 root@tin: Synchronized wmf-config/CommonSettings.php: Switch MediaWiki active datacenter to codfw (duration: 00m 19s)
  • 14:21 switchdc: (volans@sarin) START TASK - switchdc.stages.t05_switch_datacenter(eqiad, codfw) Switch MediaWiki configuration to the new datacenter
  • 14:21 switchdc: (volans@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Successfully completed
  • 14:15 switchdc: (volans@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches
  • 14:15 switchdc: (volans@sarin) END TASK - switchdc.stages.t03_coredb_masters_readonly(eqiad, codfw) Successfully completed
  • 14:15 switchdc: (volans@sarin) START TASK - switchdc.stages.t03_coredb_masters_readonly(eqiad, codfw) set core DB masters in read-only mode
  • 14:14 switchdc: (volans@sarin) END TASK - switchdc.stages.t02_start_mediawiki_readonly(eqiad, codfw) Successfully completed
  • 14:14 root@tin: Synchronized wmf-config/db-eqiad.php: Set MediaWiki in read-only mode in datacenter eqiad (duration: 01m 29s)
  • 14:13 switchdc: (volans@sarin) MediaWiki read-only period starts at: 2017-04-19 14:12:54.007017
  • 14:12 switchdc: (volans@sarin) START TASK - switchdc.stages.t02_start_mediawiki_readonly(eqiad, codfw) Set MediaWiki in read-only mode (db_from config already merged and git pulled)
  • 14:09 switchdc: (volans@sarin) END TASK - switchdc.stages.t01_stop_maintenance(eqiad, codfw) Successfully completed
  • 14:07 switchdc: (volans@sarin) START TASK - switchdc.stages.t01_stop_maintenance(eqiad, codfw) Stop MediaWiki maintenance in the old master DC
  • 14:06 godog: stop swiftrepl on ms-fe1005 for codfw switchover
  • 14:06 switchdc: (volans@sarin) END TASK - switchdc.stages.t00_reduce_ttl(eqiad, codfw) Successfully completed
  • 14:06 switchdc: (volans@sarin) START TASK - switchdc.stages.t00_reduce_ttl(eqiad, codfw) Reduce the TTL of all the MediaWiki discovery records
  • 14:06 switchdc: (volans@sarin) END TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Successfully completed
  • 14:05 switchdc: (volans@sarin) START TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Disabling puppet on selected hosts
  • 14:00 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2014.codfw.wmnet,service=varnish-be
  • 13:42 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2014.codfw.wmnet,service=varnish-be
  • 13:28 urandom: cqlsh -f /etc/cassandra/adduser.cql, recreating user/perms (as-needed)
  • 12:38 urandom: T163292: Starting removal of Cassandra instance restbase1018-c.eqiad.wmnet
  • 11:36 oblivian:: Setting swift-rw in eqiad DOWN
  • 11:36 oblivian:: Setting swift-rw in codfw UP
  • 11:36 ema: repool varnish-be on cp3044
  • 11:23 godog: add naos to git-deploy term on common-infrastructure4 - T162900
  • 11:03 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Successfully completed
  • 10:57 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches
  • 10:56 _joe_: running the warmup stage in codfw for final testing
  • 10:41 ema: depool varnish-be on cp3044 because of mailbox lag issues
  • 09:34 moritzm: installing dbus security updates
  • 09:11 elukey: cleaning up ocg1003's /srv/deployment/ocg/postmortem dir (root partition filled up)
  • 07:26 hoo: Updated the sites and site_identifiers tables on all Wikidata clients for T149522.
  • 06:57 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t06_redis(codfw, eqiad) Successfully completed
  • 06:56 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t06_redis(codfw, eqiad) Switch the Redis replication
  • 06:52 _joe_: artificially stopping slave replication on rdb2001 for a final test of the switchover redis stage
  • 03:53 urandom: T163292: Starting removal of Cassandra instance restbase1018-b.eqiad.wmnet
  • 03:49 mobrovac@tin: Started restart [restbase/deploy@1bfada4]: (no justification provided)
  • 03:40 mobrovac@tin: Started restart [restbase/deploy@1bfada4]: Kick RB to pick up restbase1018 instances are gone
  • 03:32 mobrovac@tin: Finished deploy [changeprop/deploy@a19ebf8]: Temp: Decrease the transclusion update from 400 to 200 for T163292 (duration: 00m 53s)
  • 03:31 mobrovac@tin: Started deploy [changeprop/deploy@a19ebf8]: Temp: Decrease the transclusion update from 400 to 200 for T163292
  • 01:58 mutante: naos: rsyncd is of course legitimately running on a deployment server separate from this (unlike in other cases where we used it for syncing during migration), so this was just the one config fragment for /home and not removing the service or anything
  • 01:56 mutante: naos: manually deleting rsyncd config remnants (puppet wouldn't know to clean up after itself)
  • 01:47 mutante: rsyncing /home from mira to naos (T162900)
  • 01:21 urandom: T163292: Starting removal of Cassandra instance restbase1018-a.eqiad.wmnet
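
For reference, the length of the MediaWiki read-only window during this switchover can be derived from the two switchdc timestamps logged above (read-only period starts at 14:12:54.007017, ends at 14:30:05.678665). A minimal Python sketch of that arithmetic; the variable names are illustrative and not part of switchdc:

```python
from datetime import datetime

# Timestamp format used in the switchdc "read-only period" log entries.
FMT = "%Y-%m-%d %H:%M:%S.%f"

read_only_start = datetime.strptime("2017-04-19 14:12:54.007017", FMT)
read_only_end = datetime.strptime("2017-04-19 14:30:05.678665", FMT)

# Subtracting two datetimes gives a timedelta covering the read-only window.
read_only_window = read_only_end - read_only_start
minutes, seconds = divmod(read_only_window.total_seconds(), 60)

print(f"read-only for {int(minutes)}m {seconds:.1f}s")  # read-only for 17m 11.7s
```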

2017-04-18

  • 23:04 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1018.eqiad.wmnet
  • 23:02 mutante: ms1001 - deleting old GlobalCert SSL cert for dumps.wm that was about to expire and is replaced by Letsencrypt
  • 22:30 mutante: ocg1003 gzipping ocg.log for disk space
  • 21:12 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2002.codfw.wmnet,service=varnish-be
  • 20:36 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2002.codfw.wmnet,service=varnish-be
  • 17:26 mobrovac@tin: Finished deploy [restbase/deploy@1bfada4]: Blacklist all user pages on commons (duration: 07m 12s)
  • 17:26 ssastry@tin: Finished deploy [parsoid/deploy@b067328]: Deploying Parsoid to bump heap limits to 900m (from 600m) (duration: 06m 25s)
  • 17:19 ssastry@tin: Started deploy [parsoid/deploy@b067328]: Deploying Parsoid to bump heap limits to 900m (from 600m)
  • 17:19 mobrovac@tin: Started deploy [restbase/deploy@1bfada4]: Blacklist all user pages on commons
  • 17:12 XenoRyet: updated tools from a8b8d72 to a1e9342
  • 17:09 elukey: restart nutcracker in codfw (profile::mediawiki::nutcracker) to make sure that all the daemons are running with the latest config
  • 16:26 bblack: completed Traffic-layer portions of codfw switchover ( https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Switchover_2 )
  • 16:21 bblack: starting Traffic-layer portions of codfw switchover ( https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Switchover_2 )
  • 16:15 jynus: reimporting some rows to dbstore1002 on jawiki and ruwiki T160509
  • 16:12 godog: reboot tin to fix cpu mhz issue and check bios settings - T163158
  • 16:09 mobrovac@tin: Finished deploy [restbase/deploy@960b468]: Blacklist an enwiki and a commons page (duration: 08m 16s)
  • 16:01 mobrovac@tin: Started deploy [restbase/deploy@960b468]: Blacklist an enwiki and a commons page
  • 16:00 mobrovac@tin: Finished deploy [restbase/deploy@960b468]: Dev Cluster: Blacklist an enwiki and a commons page (duration: 01m 42s)
  • 15:58 mobrovac@tin: Started deploy [restbase/deploy@960b468]: Dev Cluster: Blacklist an enwiki and a commons page
  • 15:20 elukey: restored default output-buffer config for rdb2005:6479
  • 15:08 godog: puppet-run on cache_upload in codfw/eqiad to pick up swift a/p changes
  • 15:02 godog: puppet-run on cache_upload in codfw/eqiad to pick up switch a/a changes
  • 15:02 gehel: upgrading elastic2020 to elasticsearch 5.1.2
  • 14:55 _joe_: switchover of services, misc things done
  • 14:54 oblivian:: Setting restbase-async in codfw DOWN
  • 14:54 oblivian:: Setting restbase-async in eqiad UP
  • 14:43 _joe_: switching traffic for all a/a services plus maps and restbase to codfw-only
  • 14:38 _joe_: forcing puppet run on caches for catching up with the a/a setting of maps and restbase
  • 14:33 oblivian:: Setting restbase in eqiad DOWN
  • 14:33 _joe_: starting switchover of services eqiad => codfw; external traffic will be switched over, as well as internal traffic to restbase
  • 14:25 gehel: un-ban elastic2020 to get ready for real-life test during switchover - T149006
  • 14:22 elukey: executed config set client-output-buffer-limit "normal 0 0 0 slave 2147483648 2147483648 300 pubsub 33554432 8388608 60" on rdb2005:6479 as an attempt to solve slave lagging - T159850
  • 14:21 oblivian:: Setting mobileapps in eqiad UP
  • 14:14 oblivian:: Setting mobileapps in eqiad DOWN
  • 14:11 elukey: executed CONFIG SET appendfsync everysec (default) to restore defaults on rdb2005:6479- T159850
  • 14:08 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t09_restart_parsoid(codfw, eqiad) Successfully completed
  • 14:04 elukey: executed CONFIG SET appendfsync no on rdb2005:6479 to test if fsync stalls affect replication - T159850
  • 13:50 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_restart_parsoid(codfw, eqiad) Rolling restart parsoid in eqiad and codfw
  • 13:35 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Failed to execute
  • 13:35 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Stop MediaWiki maintenance in the old master DC
  • 12:32 moritzm: upgrading labnodepool1001 to Linux 4.9
  • 12:13 moritzm: upgraded mw1261 to HHVM 3.18.2+wmf2
  • 11:39 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Successfully completed
  • 11:38 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Start MediaWiki maintenance in the new master DC
  • 11:37 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_tendril(codfw, eqiad) Successfully completed
  • 11:37 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_tendril(codfw, eqiad) Update Tendril configuration for the new masters
  • 11:35 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_tendril(eqiad, codfw) Successfully completed
  • 11:35 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_tendril(eqiad, codfw) Update Tendril configuration for the new masters
  • 11:34 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_tendril(codfw, eqiad) Successfully completed
  • 11:34 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_tendril(codfw, eqiad) Update Tendril configuration for the new masters
  • 11:33 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_restore_ttl(codfw, eqiad) Successfully completed
  • 11:33 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_restore_ttl(codfw, eqiad) Restore the TTL of all the MediaWiki discovery records
  • 11:31 switchdc: (volans@sarin) END TASK - switchdc.stages.t08_stop_mediawiki_readonly(codfw, eqiad) Successfully completed
  • 11:31 switchdc: (volans@sarin) START TASK - switchdc.stages.t08_stop_mediawiki_readonly(codfw, eqiad) Set MediaWiki in read-write mode (db_to config already merged and git pulled)
  • 11:30 switchdc: (volans@sarin) END TASK - switchdc.stages.t07_coredb_masters_readwrite(codfw, eqiad) Successfully completed
  • 11:30 switchdc: (volans@sarin) START TASK - switchdc.stages.t07_coredb_masters_readwrite(codfw, eqiad) set core DB masters in read-write mode
  • 11:18 switchdc: (volans@sarin) END TASK - switchdc.stages.t06_redis(codfw, eqiad) Successfully completed
  • 11:18 switchdc: (volans@sarin) START TASK - switchdc.stages.t06_redis(codfw, eqiad) Switch the Redis replication
  • 11:14 moritzm: upgrading logstash* to Linux 4.9
  • 10:58 switchdc: (volans@sarin) END TASK - switchdc.stages.t05_switch_traffic(codfw, eqiad) Successfully completed
  • 10:56 switchdc: (volans@sarin) START TASK - switchdc.stages.t05_switch_traffic(codfw, eqiad) Switch traffic flow to the appservers in the new datacenter
  • 10:56 switchdc: (volans@sarin) END TASK - switchdc.stages.t05_switch_datacenter(codfw, eqiad) Successfully completed
  • 10:55 switchdc: (volans@sarin) START TASK - switchdc.stages.t05_switch_datacenter(codfw, eqiad) Switch MediaWiki configuration to the new datacenter
  • 10:48 switchdc: (volans@sarin) END TASK - switchdc.stages.t03_coredb_masters_readonly(codfw, eqiad) Failed to execute
  • 10:48 switchdc: (volans@sarin) START TASK - switchdc.stages.t03_coredb_masters_readonly(codfw, eqiad) set core DB masters in read-only mode
  • 10:43 switchdc: (volans@sarin) END TASK - switchdc.stages.t02_start_mediawiki_readonly(codfw, eqiad) Successfully completed
  • 10:43 switchdc: (volans@sarin) START TASK - switchdc.stages.t02_start_mediawiki_readonly(codfw, eqiad) Set MediaWiki in read-only mode (db_from config already merged and git pulled)
  • 10:33 switchdc: (volans@sarin) END TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Failed to execute
  • 10:33 switchdc: (volans@sarin) START TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Stop MediaWiki maintenance in the old master DC
  • 10:31 switchdc: (volans@sarin) END TASK - switchdc.stages.t00_reduce_ttl(codfw, eqiad) Successfully completed
  • 10:31 switchdc: (volans@sarin) START TASK - switchdc.stages.t00_reduce_ttl(codfw, eqiad) Reduce the TTL of all the MediaWiki discovery records
  • 10:31 switchdc: (volans@sarin) END TASK - switchdc.stages.t00_disable_puppet(codfw, eqiad) Successfully completed
  • 10:31 switchdc: (volans@sarin) START TASK - switchdc.stages.t00_disable_puppet(codfw, eqiad) Disabling puppet on selected hosts
  • 10:28 switchdc: (volans@sarin) END TASK - switchdc.stages.t00_reduce_ttl(codfw, eqiad) Failed to execute
  • 10:28 switchdc: (volans@sarin) START TASK - switchdc.stages.t00_reduce_ttl(codfw, eqiad) Reduce the TTL of all the MediaWiki discovery records
  • 10:26 switchdc: (volans@sarin) END TASK - switchdc.stages.t00_disable_puppet(codfw, eqiad) Successfully completed
  • 10:26 switchdc: (volans@sarin) START TASK - switchdc.stages.t00_disable_puppet(codfw, eqiad) Disabling puppet on selected hosts
  • 10:25 volans: Final test of switchdc steps in the codfw->eqiad configuration, only idempotent changes, T160178
  • 10:25 moritzm: installing wireshark security updates
  • 10:20 moritzm: uploaded HHVM 3.18.2+wmf2 for jessie-wikimedia/experimental (includes fix for T162354)
  • 09:52 oblivian:: Setting zotero in codfw UP
  • 09:50 _joe_: testing switchover script for services, will act on zotero in codfw
  • 09:45 _joe_: adding 60G to the ocg output partition on ocg1003
  • 09:17 oblivian@neodymium: conftool action : set/pooled=true; selector: dnsdisc=zotero,name=codfw
  • 09:03 volans: upgrading conftool to v0.4.1 on neodymium/sarin
  • 07:48 _joe_: uploaded python-conftool 0.4.1 to jessie-wikimedia
  • 07:42 _joe_: cleaning up orphaned COW images in /var/cache/pbuilder/build/ on copper
  • 06:16 marostegui: For the record: restarted s7 instance on db1069 - T163183
  • 00:36 catrope@tin: Synchronized php-1.29.0-wmf.20/extensions/MobileFrontend/resources/mobile.mainMenu/mainmenu.less: T163059 (duration: 03m 07s)

2017-04-17

  • 23:37 mutante: running rmmod acpi_pad on the 16 R320 via cumin, since blacklisting in puppet does not actively remove, confirmed unloaded. (16/16) success ratio (>= 100.0% threshold) for command: 'lsmod|grep -c acpi_pad ||:' (T162850)
  • 23:33 mutante: running puppet via cumin on all 16 Dell PowerEdge R320, adding blacklist file for acpi_pad kernel module. 15/16 success, all but tin (T162850)
  • 22:46 catrope@tin: Synchronized php-1.29.0-wmf.20/extensions/WikimediaEvents/modules/ext.wikimediaEvents.recentChangesClicks.js: T158458 T163152 (duration: 03m 01s)
  • 22:42 mutante: tin - load average going down, acpi_pad processes gone, cpu usage low again (T163158)
  • 22:40 mutante: tin - rmmod acpi_pad (T163158)
  • 22:08 catrope@tin: Synchronized php-1.29.0-wmf.20/extensions/WikimediaEvents/modules/ext.wikimediaEvents.recentChangesClicks.js: T158458 T163152 (duration: 16m 23s)
  • 19:16 mutante: tegmen test ircecho stop/start service to confirm it's fine on jessie/prod icinga role (that's the passive server)
  • 19:02 demon@tin: Synchronized wmf-config/: Pruning some old extension message files, co-master sync (duration: 01m 52s)
  • 18:58 demon@tin: Pruned MediaWiki: 1.29.0-wmf.15 (duration: 00m 14s)
  • 18:46 maxsem@tin: Finished deploy [tilerator/deploy@001811e]: https://gerrit.wikimedia.org/r/#/c/348224/ to test hosts only (duration: 00m 19s)
  • 18:46 maxsem@tin: Started deploy [tilerator/deploy@001811e]: https://gerrit.wikimedia.org/r/#/c/348224/ to test hosts only
  • 18:45 maxsem@tin: scap aborted: https://gerrit.wikimedia.org/r/#/c/348224/ to test hosts only (duration: 00m 19s)
  • 18:45 maxsem@tin: Started scap: https://gerrit.wikimedia.org/r/#/c/348224/ to test hosts only
  • 15:48 mobrovac@tin: Finished deploy [restbase/deploy@6595298]: Update client caching headers for T161284 (duration: 08m 15s)
  • 15:40 mobrovac@tin: Started deploy [restbase/deploy@6595298]: Update client caching headers for T161284
  • 15:34 mobrovac@tin: Finished deploy [restbase/deploy@6595298]: (no justification provided) (duration: 01m 29s)
  • 15:33 mobrovac@tin: Started deploy [restbase/deploy@6595298]: (no justification provided)
  • 15:32 mobrovac@tin: Finished deploy [restbase/deploy@6595298]: (no justification provided) (duration: 01m 42s)
  • 15:31 mobrovac@tin: Started deploy [restbase/deploy@6595298]: (no justification provided)
  • 09:33 marostegui: Silence alerts for restbase2004 and restbase2009 T160759

2017-04-16

  • 15:44 elukey: restart ocg on ocg1003 to clean up deleted files in lsof
  • 15:35 elukey: executing sudo find -name *.pdf -mtime +3 -exec rm {} \; on ocg1003's /srv/deployment/ocg/output to clean up some disk space - T162780

2017-04-14

  • 23:14 jynus: skipping CREATE DATABASE wbwikimedia on dbstore2001- duplicate declaration due to multi-source
  • 22:58 jynus: skipping CREATE DATABASE pawikisource on dbstore2001- duplicate declaration due to multi-source
  • 22:49 volans: restarting parsoid to get the disable linter change T148609
  • 22:17 Reedy: created linter tables on wbwikimedia T148609
  • 22:16 Reedy: created linter tables on pawikisource T148609
  • 20:53 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Disable Linter on larger wikis T148609 (duration: 00m 41s)
  • 20:26 reedy@tin: Synchronized wmf-config/abusefilter.php: abusefilter-modify-restricted for trwiki T161960 (duration: 01m 38s)
  • 17:48 mutante: mw1297 - restarted hhvm and apache
  • 17:07 twentyafterfour: deployed phabricator hotfix for T162943
  • 10:29 elukey: rollback sysctl settings on mw1306 after experiment (stop jobchron/runner, stop hhvm, restore sysctl settings, restart hhvm and job* daemons)
  • 09:50 elukey: temporarily set sysctl -w net.netfilter.nf_conntrack_max=524288 on mw1306 (jobrunner) as test - (rollback: sysctl -w net.netfilter.nf_conntrack_max=262144)
  • 09:43 elukey: temporarily set sysctl -w net.ipv4.ip_local_port_range="15000 64000" on mw1306 (jobrunner) as test - (rollback: sysctl -w net.ipv4.ip_local_port_range="32768 60999") - T157968
  • 08:32 elukey: restored appendfsync to 'everysec' on Redis rdb2005:6380 (end of performance experiment)
  • 07:23 elukey: executed CONFIG SET appendfsync no on rdb2005:6380 as performance test
  • 00:39 niharika29@tin: Synchronized wmf-config/abusefilter.php: Fix Abuse Filter configuration for tr.wikipedia (T161960) (duration: 00m 42s)
  • 00:30 niharika29@tin: Finished scap: Reword ORES preferences (T162831), Put ORES r behind a preference (T162831), Deploy Special:Autoblocklist (T146414) (duration: 24m 44s)
  • 00:05 niharika29@tin: Started scap: Reword ORES preferences (T162831), Put ORES r behind a preference (T162831), Deploy Special:Autoblocklist (T146414)
  • 00:03 mutante: mw1297 - restart hhvm/apache
  • 00:03 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Remove use of blacklist for related pages feature (T162201) (duration: 00m 41s)
  • 00:02 niharika29@tin: Synchronized wmf-config/CommonSettings.php: Remove use of blacklist for related pages feature (T162201) (duration: 00m 41s)
  • 00:00 mutante: mw1293 - restart hhvm

2017-04-13

  • 23:56 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Retry sync Revert Remove use of blacklist for related pages feature (T162201) (duration: 00m 40s)
  • 23:51 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Revert Remove use of blacklist for related pages feature (T162201) (duration: 00m 41s)
  • 23:43 niharika29@tin: Synchronized wmf-config/CommonSettings.php: Remove use of blacklist for related pages feature (T162201) (duration: 00m 40s)
  • 23:41 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Remove use of blacklist for related pages feature (T162201) (duration: 00m 40s)
  • 23:39 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Enable related pages on Vector for htwiki (T126826) (duration: 00m 41s)
  • 23:26 niharika29@tin: Synchronized php-1.29.0-wmf.20/extensions/CirrusSearch/: Revert Workaround OOM issue on ngrams field (duration: 00m 54s)
  • 23:19 Dereckson: Create account for Jayantanth on wb.wikimedia (bureaucrat)
  • 23:09 dereckson@tin: Synchronized wmf-config/interwiki.php: DMOZ, pa.wikisource and wb.wikimedia interwiki map update (duration: 00m 41s)
  • 23:01 Dereckson: Create local-multiwrite stores for wb.wikimedia (T162510)
  • 23:01 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for wb.wikimedia.org (T162510) (duration: 00m 40s)
  • 23:00 Dereckson: Create Translate extension tables for wb.wikimedia (T162510)
  • 22:59 dereckson@tin: Synchronized multiversion/MWMultiVersion.php: Add wb.wikimedia.org to wikimedia.org domains to serve as wikis (T162510) (duration: 00m 40s)
  • 22:59 dereckson@tin: rebuilt wikiversions.php and synchronized wikiversions files: Create wb.wikimedia.org (T162510)
  • 22:58 dereckson@tin: Synchronized dblists: Create wb.wikimedia.org (T162510) (duration: 00m 41s)
  • 22:47 dereckson@tin: Synchronized static/images/project-logos/: Logos for wb.wikimedia (T162510) (duration: 00m 41s)
  • 22:32 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: pa.wikisource creation (take two) (duration: 00m 41s)
  • 22:31 dereckson@tin: Synchronized w/static/images/project-logos/: pa.wikisource creation (take two) (duration: 00m 40s)
  • 22:30 dereckson@tin: rebuilt wikiversions.php and synchronized wikiversions files: pa.wikisource creation (take two)
  • 22:30 dereckson@tin: Synchronized dblists: pa.wikisource creation (take two) (duration: 00m 41s)
  • 22:15 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for pa.wikisource (T149522) (duration: 00m 41s)
  • 22:14 dereckson@tin: Synchronized static/images/project-logos/: Logos for pa.wikisource (T149522) (duration: 00m 41s)
  • 22:12 dereckson@tin: rebuilt wikiversions.php and synchronized wikiversions files: (no justification provided)
  • 22:12 dereckson@tin: Synchronized dblists: pa.wikisource creation (T149522) (duration: 00m 41s)
  • 21:56 demon@tin: Finished scap: pruned cdb files from wmf.18 (duration: 07m 55s)
  • 21:48 demon@tin: Started scap: pruned cdb files from wmf.18
  • 20:07 urandom: T161243: Clearing all snapshots
  • 19:45 ejegg: updated civicrm from 908b9c1 to 90d679b
  • 19:43 ejegg: updated SmashPig from ab52dbe to 3db064d
  • 19:16 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.20
  • 18:57 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Clean Wikisource namespaces T46320 (duration: 00m 43s)
  • 18:42 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Education Program on it.wikiversity T162692 (duration: 00m 43s)
  • 18:38 reedy@tin: Synchronized php-1.29.0-wmf.20/extensions/LiquidThreads: Remove extra parameter from hook (duration: 00m 45s)
  • 18:35 reedy@tin: Synchronized wmf-config/abusefilter.php: Enable AbuseFilter blocks on tr.wikipedia T161960 (duration: 00m 43s)
  • 18:30 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable NewUserMessage on tr.wikiquote T161962 (duration: 00m 43s)
  • 18:30 urandom: T161243: Truncating parsoid tables (wikimedia storage group)
  • 18:29 mutante: restarting jenkins service to apply logging change gerrit:347877. it was already tested on jenkinstest.integration.eqiad.wmflabs
  • 18:25 reedy@tin: Synchronized php-1.29.0-wmf.20/extensions/Wikidata: Stop some logspam for deprecated hooks (duration: 02m 06s)
  • 18:23 reedy@tin: Synchronized php-1.29.0-wmf.20/extensions/WikimediaEvents: Stop some logspam for deprecated hooks (duration: 00m 43s)
  • 18:21 reedy@tin: Synchronized php-1.29.0-wmf.20/extensions/LiquidThreads: Stop some logspam for deprecated hooks (duration: 00m 45s)
  • 18:19 reedy@tin: Synchronized php-1.29.0-wmf.19/extensions/Wikidata: Stop some logspam for deprecated hook usage (duration: 02m 14s)
  • 18:16 urandom: T161243: Truncating parsoid tables (default storage group)
  • 18:16 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Document EducationProgram config (duration: 00m 43s)
  • 18:12 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Set wgUsejQueryThree to false everywhere ahead of further testing (duration: 00m 43s)
  • 18:09 reedy@tin: Synchronized wmf-config/CommonSettings-labs.php: Run 3d2png with xvfb-run on beta (duration: 00m 43s)
  • 16:55 elukey: restored default value of client-output-buffer-limit on rdb1007:6379 - T159850
  • 16:23 mobrovac@tin: Finished deploy [citoid/deploy@b8c4cb2]: Test deploy for T162814 (duration: 02m 24s)
  • 16:21 mobrovac@tin: Started deploy [citoid/deploy@b8c4cb2]: Test deploy for T162814
  • 16:15 thcipriani@tin: Synchronized README: scap.cfg change test (duration: 00m 44s)
  • 15:49 mobrovac@tin: Finished deploy [citoid/deploy@212800d]: Enable multiple results for T115248 and remove b/c for T114515 (duration: 03m 10s)
  • 15:46 mobrovac@tin: Started deploy [citoid/deploy@212800d]: Enable multiple results for T115248 and remove b/c for T114515
  • 15:02 andrewbogott: disabling puppet on dubnium and pollux for a cautious merge of https://gerrit.wikimedia.org/r/#/c/348071
  • 15:01 andrewbogott: disabling puppet on seaborgium and serpens for a cautious merge of https://gerrit.wikimedia.org/r/#/c/348071
  • 14:56 ppchelko@tin: Finished deploy [changeprop/deploy@e47afea]: Provide separate rules for ORES precaching in both DCs (duration: 00m 58s)
  • 14:55 ppchelko@tin: Started deploy [changeprop/deploy@e47afea]: Provide separate rules for ORES precaching in both DCs
  • 14:50 moritzm: installing bouncycastle security updates
  • 14:27 bblack: disabling puppet on recnds/ntp boxes to control patch rollout
  • 13:28 moritzm: powercycling thumbor1001, stuck in reboot
  • 13:18 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 43s)
  • 13:16 hashar@tin: Synchronized dblists/closed.dblist: Close wikimania2016 - T161183 (duration: 00m 43s)
  • 13:14 hashar@tin: Synchronized static/images/project-logos: (no justification provided) (duration: 00m 46s)
  • 13:00 moritzm: Upgrading thumbor* to Linux 4.9
  • 12:52 elukey: temporarily ran config set client-output-buffer-limit "slave 5368709120 5368709120 180" on rdb1007:6379
  • 12:34 volans@tin: Synchronized wmf-config/db-eqiad.php: Use a generic retry for the read only message T160178 (duration: 00m 44s)
  • 12:34 elukey: temporarily ran config set client-output-buffer-limit "slave 3221225472 3221225472 180" on rdb1007:6379
  • 12:22 volans@tin: Synchronized wmf-config/db-codfw.php: Use a generic retry for the read only message T160178 (duration: 01m 54s)
  • 12:16 moritzm: restarting ntp on achernar
  • 11:59 elukey: temporarily applied config set client-output-buffer-limit "slave 2536870912 2536870912 60" on rdb1007:6379
  • 11:37 elukey: temporarily applied config set client-output-buffer-limit "slave 2147483648 2147483648 60" on rdb1007:6379 to give rdb2005's replication time to catch up - T159850
  • 10:58 moritzm: rebooting alsafi to Linux 4.9
  • 10:58 moritzm: rebooting alfafi to Linux 4.9
  • 10:47 elukey: reverted previous config for Redis rdb2005
  • 10:47 XioNoX: Confirmed we can still reach cr2-knams:lo0 via v6 (from esams), disabling IPv4 transit sessions for T162601
  • 10:42 XioNoX: disable V6 transit BGP session on cr2-knams for T162601
  • 10:22 elukey: executed CONFIG SET appendfsync no (prev value: "everysec") to Redis instance 6380 on rdb2005 - T125735
  • 10:13 godog: upgrade thumbor to 0.1.38
  • 10:08 moritzm: rebooting restbase1016 to Linux 4.9
  • 09:39 moritzm: rebooting restbase1011 to Linux 4.9
  • 09:12 moritzm: rebooting restbase1010 to Linux 4.9
  • 06:29 elukey: re-arm keyholder on mira after reboot
  • 06:14 elukey: powercycle mira - eth0 errors in the dmesg, CPU system utilization skyrocketed
  • 04:14 mutante: ms-be2023 is rebooting
  • 04:12 mutante: ms-be2023 icinga alerts, no more swift processes. can't ssh to it. attempting to power cycle. mgmt console shows enormous spam of "rejecting I/O to offline device"
  • 01:58 bblack@neodymium: conftool action : set/pooled=yes; selector: name=achernar.wikimedia.org,dc=codfw,cluster=dns,service=pdns_recursor
  • 00:36 catrope@tin: Finished scap: Split RCFilters GuidedTour messages for ORES vs non-ORES (T162693) (duration: 53m 47s)
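
The client-output-buffer-limit values in the 11:37 to 12:52 entries above are raw byte counts, which makes the progression hard to read at a glance. A minimal sketch for converting them, assuming the three fields after the class name are the hard limit, the soft limit, and the soft-limit window in seconds (the usual Redis ordering); the helper name `parse_limit` is hypothetical and not part of any tooling mentioned in this log:

```python
GIB = 1024 ** 3  # bytes per GiB


def parse_limit(value):
    """Split a client-output-buffer-limit value into its four fields.

    Assumed field order: <class> <hard bytes> <soft bytes> <soft seconds>.
    """
    client_class, hard, soft, seconds = value.split()
    return {
        "class": client_class,
        "hard_gib": int(hard) / GIB,
        "soft_gib": int(soft) / GIB,
        "soft_seconds": int(seconds),
    }


# Values copied from the log entries above, oldest first.
logged = [
    "slave 2147483648 2147483648 60",
    "slave 2536870912 2536870912 60",
    "slave 3221225472 3221225472 180",
    "slave 5368709120 5368709120 180",
]

for value in logged:
    limit = parse_limit(value)
    print(
        f"{limit['class']}: hard {limit['hard_gib']:.2f} GiB, "
        f"soft {limit['soft_gib']:.2f} GiB over {limit['soft_seconds']}s"
    )
```

Read this way, the limit on rdb1007:6379 was raised in steps from 2 GiB to roughly 2.36 GiB, then 3 GiB, then 5 GiB, with the window extended from 60 to 180 seconds.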

2017-04-12

  • 23:42 catrope@tin: Started scap: Split RCFilters GuidedTour messages for ORES vs non-ORES (T162693)
  • 23:37 catrope@tin: Synchronized php-1.29.0-wmf.20/extensions/MobileFrontend/: Log only infoboxes which are not direct children of lead section (T149884) (duration: 01m 05s)
  • 23:35 catrope@tin: Synchronized php-1.29.0-wmf.20/resources/src/mediawiki.widgets: Fix setDisabled in mw.widgets.Complex* (T162667) (duration: 00m 42s)
  • 23:32 catrope@tin: Synchronized php-1.29.0-wmf.19/resources/src/mediawiki.widgets: Fix setDisabled in mw.widgets.Complex* (T162667) (duration: 00m 44s)
  • 23:25 awight: rebuilt and reenabled process-control jobs
  • 23:20 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Disable cross-wiki uploads to Commons (T162374) (duration: 00m 43s)
  • 23:19 cwd: removed p-c crontab to stop all jobs
  • 23:15 bblack@neodymium: conftool action : set/pooled=no; selector: name=achernar.wikimedia.org,dc=codfw,cluster=dns,service=pdns_recursor
  • 23:13 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable wgCiteResponsiveReferences on cawiki (T161307) and bgwiki (T162145) (duration: 00m 44s)
  • 23:02 bblack@neodymium: conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org,dc=codfw,cluster=dns,service=pdns_recursor
  • 22:50 bblack: acamar fixed up BIOS: HT disabled and power mgmt was set to PPW (DAPC) instead of PPW (OS)
  • 22:45 bblack: downtiming acamar again to fixup bios stuff (HT at least)
  • 21:31 Dereckson: Create Education Program tables on it.wikiversity (T162692)
  • 20:44 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy Linter to all wikis - T148609 (duration: 00m 44s)
  • 20:42 bblack@neodymium: conftool action : set/pooled=no; selector: name=acamar.wikimedia.org,dc=codfw,cluster=dns,service=pdns_recursor
  • 20:25 mutante: planet2001 - manually updating all feeds to make it active (or would have to wait for crons)
  • 20:12 ssastry@tin: Finished deploy [parsoid/deploy@323cebb]: Updating Parsoid to 75debae3 (duration: 09m 16s)
  • 20:07 mutante: planet2001 - activating all the crons, making planet active/active eqiad/codfw
  • 20:03 ssastry@tin: Started deploy [parsoid/deploy@323cebb]: Updating Parsoid to 75debae3
  • 19:42 bd808@tin: Synchronized wmf-config/mc.php: Revert "wikitech: Enable binary memcached protocol" (duration: 00m 43s)
  • 19:05 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.20
  • 19:05 XenoRyet: reverted SmashPig from aede277 to ab52dbe
  • 19:05 demon@tin: Synchronized php: symlink bump (duration: 00m 42s)
  • 19:04 ejegg: updated payments-wiki from 0b396a3 to 36f38f6
  • 18:52 XenoRyet: updated SmashPig from ab52dbe to aede277
  • 18:45 thcipriani@tin: Synchronized php-1.29.0-wmf.20/extensions/MobileFrontend: SWAT: formatter: Increase log level of infobox message T149884 (duration: 00m 46s)
  • 18:44 ppchelko@tin: Finished deploy [changeprop/deploy@e403f56]: Config: Send ORES precache requests to both DCs. Attempt #2. T159615 (duration: 01m 15s)
  • 18:43 ppchelko@tin: Started deploy [changeprop/deploy@e403f56]: Config: Send ORES precache requests to both DCs. Attempt #2. T159615
  • 18:38 thcipriani@tin: Synchronized php-1.29.0-wmf.20/extensions/MobileFrontend: SWAT: formatter: Change log channel of infobox message T149884 (duration: 00m 46s)
  • 18:37 ppchelko@tin: Finished deploy [changeprop/deploy@0a9a008]: Config: Send ORES precache requests to both DCs. T159615 (duration: 06m 53s)
  • 18:30 ppchelko@tin: Started deploy [changeprop/deploy@0a9a008]: Config: Send ORES precache requests to both DCs. T159615
  • 18:26 thcipriani@tin: Synchronized php-1.29.0-wmf.20/extensions/MobileFrontend: SWAT: setMobileOptions at time of skin creation T125588 (duration: 00m 46s)
  • 18:18 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Tweak Russian logo wordmark T162036 PART II (duration: 00m 43s)
  • 18:16 thcipriani@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ru.svg: SWAT: Tweak Russian logo wordmark T162036 PART I (duration: 00m 43s)
  • 16:46 awight@tin: rebuilt wikiversions.php and synchronized wikiversions files: (no justification provided)
  • 16:37 awight@tin: Synchronized php-1.29.0-wmf.20/extensions/FundraiserLandingPage: Fix for donatewiki T162716 (duration: 00m 45s)
  • 16:35 awight@tin: Synchronized php-1.29.0-wmf.19/extensions/FundraiserLandingPage: Fix for donatewiki T162716 (duration: 00m 48s)
  • 15:53 chasemp: remove 2fa for Freddy2001 on wikitech per T162772
  • 14:31 andrewbogott: running maintain-meta_p on labsdb1001/1003/1009/1010/1011
  • 14:23 hashar: Restarting Jenkins for git/scm plugins updates
  • 14:06 hashar: European SWAT complete
  • 13:51 switchdc: (volans@neodymium) END TASK - switchdc.stages.t05_switch_traffic(codfw, eqiad) Successfully completed
  • 13:48 switchdc: (volans@neodymium) START TASK - switchdc.stages.t05_switch_traffic(codfw, eqiad) Switch traffic flow to the appservers in the new datacenter
  • 13:42 volans: testing t05_switch_traffic of the switchdc
  • 13:41 elukey: apply SLOWLOG RESET and CONFIG SET slowlog-max-len 100000 (prev value 10000, 10ms) to rdb1005:6380 to track down slow reqs - T125735
  • 13:37 hoo@tin: Synchronized php-1.29.0-wmf.20/extensions/Wikidata: Update Wikibase/ ArticlePlaceholder (duration: 02m 19s)
  • 13:33 hoo@tin: Synchronized php-1.29.0-wmf.19/extensions/Wikidata: Update Wikibase/ ArticlePlaceholder (duration: 02m 16s)
  • 13:33 elukey: restored slowlog-log-slower-than 10000 on rdb2005
  • 13:25 elukey: applied CONFIG SET slowlog-log-slower-than 300000 to Redis 6379 on rdb2005 and reset slowlog history to play with the stats
  • 13:10 addshore@tin: Synchronized php-1.29.0-wmf.20/extensions/WikimediaEvents/extension.json: WMDE Spring campaign PT2/2 (duration: 00m 45s)
  • 13:09 addshore@tin: Synchronized php-1.29.0-wmf.20/extensions/WikimediaEvents/WikimediaEventsHooks.php: WMDE Spring campaign PT1/2 (duration: 00m 45s)
  • 13:08 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Revert "Temporarily enable change dispatch logging on testwikidata" - T159828 (duration: 00m 47s)
  • 12:23 elukey: restart HDFS datanode daemons on all the Hadoop worker nodes to pick up the new JVM settings
  • 12:18 kartik@tin: Finished deploy [cxserver/deploy@2842efa]: Update cxserver to 56a012d (duration: 03m 58s)
  • 12:14 kartik@tin: Started deploy [cxserver/deploy@2842efa]: Update cxserver to 56a012d
  • 11:57 elukey: restart Yarn nodemanager daemons on all the Hadoop worker nodes to pick up the new JVM settings
  • 11:05 _joe_: downgrading python-urllib3 on puppetmaster1001
  • 11:02 akosiaris: upgrade puppet across the trusty fleet to 3.8. T162462
  • 10:34 hashar: Upgrading Jenkins "Email Extension" plugin 2.57.1..2.57.2 and restarting Jenkins
  • 10:07 hashar: Upgrading Jenkins "Git client" plugin 2.3.0..2.4.1 and restarting Jenkins
  • 09:58 switchdc: (volans@neodymium) END TASK - switchdc.stages.t07_coredb_masters_readwrite(codfw, eqiad) Successfully completed
  • 09:58 switchdc: (volans@neodymium) START TASK - switchdc.stages.t07_coredb_masters_readwrite(codfw, eqiad) set core DB masters in read-write mode
  • 09:56 switchdc: (volans@neodymium) END TASK - switchdc.stages.t03_coredb_masters_readonly(codfw, eqiad) Failed to execute
  • 09:56 switchdc: (volans@neodymium) START TASK - switchdc.stages.t03_coredb_masters_readonly(codfw, eqiad) set core DB masters in read-only mode
  • 09:53 _joe_: removing the old directory of data from ocg1003
  • 09:52 volans: testing t03 and t07 DB-RO/RW stages of switchdc (codfw->eqiad); we are already in that situation, so t03 will fail the verification, which is expected
  • 09:52 godog: swift codfw-prod: ms-be2001 - ms-be2012 initial decom - T162785
  • 09:47 _joe_: remounting the new partition under /srv/deployment/ocg/output, cleaning out the old dir. Will cause a service interruption for requests to ocg1003 for a few minutes. T162780
  • 09:42 gehel: starting load on elastic2020 - T149006
  • 09:41 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: wmgUseGettingStarted false for dewiki (duration: 00m 45s)
  • 09:26 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: WMDE Spring campaign - Add logging from WikimediaEvent (duration: 00m 46s)
  • 09:22 hashar: Restarting Jenkins for Matrix related plugins updates (3)
  • 09:12 _joe_: copying data from / to the new partition on ocg1003 T162462
  • 09:10 hashar: Restarting Jenkins for plugins update (2)
  • 09:06 _joe_: creating a LVM volume on ocg1003
  • 09:05 hashar: Restarting Jenkins for plugins update
  • 08:59 addshore@tin: Synchronized php-1.29.0-wmf.19/extensions/WikimediaEvents/extension.json: patch1 & patch2 WMDE Spring campaign PT2/2 (duration: 00m 45s)
  • 08:58 addshore@tin: Synchronized php-1.29.0-wmf.19/extensions/WikimediaEvents/WikimediaEventsHooks.php: patch1 & patch2 WMDE Spring campaign PT1/2 (duration: 00m 47s)
  • 08:52 ema: upgrade cache_upload to linux 4.9 T162029
  • 08:44 gehel: reimaging elastic2020 for testing - T149006
  • 08:24 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Successfully completed
  • 08:22 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Start MediaWiki maintenance in the new master DC
  • 08:14 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t08_stop_mediawiki_readonly(codfw, eqiad) Failed to execute
  • 08:14 root@tin: Synchronized wmf-config/db-eqiad.php: Set MediaWiki in read-write mode in datacenter eqiad (duration: 00m 35s)
  • 08:13 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t08_stop_mediawiki_readonly(codfw, eqiad) Set MediaWiki in read-write mode (db_to config already merged and git pulled)
  • 08:09 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t06_redis(codfw, eqiad) Successfully completed
  • 08:09 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t06_redis(codfw, eqiad) Switch the Redis replication
  • 08:02 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t05_switch_datacenter(codfw, eqiad) Successfully completed
  • 08:02 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t05_switch_datacenter(codfw, eqiad) Switch MediaWiki configuration to the new datacenter
  • 08:00 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t09_restore_ttl(codfw, eqiad) Successfully completed
  • 07:59 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_restore_ttl(codfw, eqiad) Restore the TTL of all the MediaWiki discovery records
  • 07:58 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t05_switch_traffic(codfw, eqiad) Successfully completed
  • 07:55 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t05_switch_traffic(codfw, eqiad) Switch traffic flow to the appservers in the new datacenter
  • 07:55 _joe_: resuming non-dry run tests of switchdc, all logs from switchdc by me are just tests
  • 06:57 _joe_: the last messages are just a test and nothing was really done, as codfw is already in read-only mode right now
  • 06:57 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t02_start_mediawiki_readonly(codfw, eqiad) Failed to execute
  • 06:57 root@tin: Synchronized wmf-config/db-codfw.php: Set MediaWiki in read-only mode in datacenter codfw (duration: 00m 23s)
  • 06:57 switchdc: (oblivian@sarin) MediaWiki read-only period starts at: 2017-04-12 06:56:53.822926
  • 06:56 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t02_start_mediawiki_readonly(codfw, eqiad) Set MediaWiki in read-only mode (db_from config already merged and git pulled)
  • 06:53 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Failed to execute
  • 06:53 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Stop MediaWiki maintenance in the old master DC
  • 06:50 _joe_: testing switchover codfw => eqiad, no destructive actions will be taken
  • 06:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 - T17441 (duration: 00m 46s)
  • 06:37 elukey: reimage mw2246.codfw.wmnet mw2152.codfw.wmnet to remove the /tmp partition (codfw videoscalers, switchover prep)
  • 06:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 - T132416 (duration: 00m 46s)
  • 06:28 _joe_: killing long-running puppet-agent on db2058 too
  • 06:20 _joe_: killing badly-started puppet agents on mc1010, tempdb2001,db1090, db2058, hydrogen, possibly others later
  • 06:13 marostegui: Deploy alter table on db1075 eqiad master (s3, image table) - T160415
  • 06:04 marostegui: Deploy schema change on s6 - db1093 - T17441
  • 06:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 (duration: 02m 00s)
  • 05:56 marostegui: Deploy alter table on db2108 codfw master (s3, image table) - T160415
  • 04:53 legoktm: started `mwscriptwikiset refreshLinks.php small.dblist` on terbium
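
The switchdc entries above follow a regular START TASK / END TASK pattern, so the outcome of each test stage can be pulled out mechanically. A rough sketch, assuming only the line format visible in this log; the regular expression and the function name `task_results` are illustrative and are not part of switchdc itself:

```python
import re

# Matches lines such as:
# "08:09 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t06_redis(codfw, eqiad) Successfully completed"
TASK_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}) switchdc: \((?P<who>[^)]+)\) "
    r"(?P<phase>START|END) TASK - (?P<stage>switchdc\.stages\.\w+)"
    r"\((?P<dc_from>\w+), (?P<dc_to>\w+)\) (?P<message>.*)"
)


def task_results(lines):
    """Return (time, stage, outcome) for every END TASK line, in input order."""
    results = []
    for line in lines:
        match = TASK_RE.search(line)
        if match and match.group("phase") == "END":
            results.append(
                (match.group("time"), match.group("stage"), match.group("message"))
            )
    return results


# Three entries copied from the 2017-04-12 log above.
sample = [
    "08:09 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t06_redis(codfw, eqiad) Successfully completed",
    "08:09 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t06_redis(codfw, eqiad) Switch the Redis replication",
    "06:53 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Failed to execute",
]

for time, stage, outcome in task_results(sample):
    print(time, stage, "->", outcome)
```

A list of tuples is used rather than a dictionary keyed by stage because the same stage (for example t05_switch_traffic) is run more than once during the day.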

2017-04-11

  • 23:58 thcipriani@tin: Synchronized wmf-config/CirrusSearch-production.php: SWAT: Enable deleted archive indexing & searching T109561 PART II (duration: 00m 45s)
  • 23:56 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable deleted archive indexing & searching T109561 PART I (duration: 00m 45s)
  • 23:29 ejegg: updated fundraising-tools from 0a42db3 to a8b8d72
  • 23:27 thcipriani@tin: Synchronized portals: SWAT: Bumping portals to master T128546 (duration: 00m 46s)
  • 23:26 thcipriani@tin: Synchronized portals/prod/wikipedia.org/assets: SWAT: Bumping portals to master T128546 (duration: 00m 46s)
  • 23:23 mutante: ocg: clearing host cache for ocg1001 which is shutdown for hardware repair. (on ocg1003: sudo -u ocg -g ocg nodejs-ocg /srv/deployment/ocg/ocg/mw-ocg-service/scripts/clear-host-cache.js -c /etc/ocg/mw-ocg-service.js ocg1001) T161158
  • 23:15 thcipriani@tin: Synchronized docroot/noc/conf/pageassessments.dblist: SWAT: Adding pageassessments.dblist for maintenance script T159438 PART II (duration: 00m 45s)
  • 23:14 thcipriani@tin: Synchronized dblists/pageassessments.dblist: SWAT: Adding pageassessments.dblist for maintenance script T159438 PART I (duration: 00m 45s)
  • 23:11 mutante: ocg1001 - scheduled downtime in icinga for host and all services, confirmed it's not actively doing things anymore, shutting down for hardware replacement (T161158)
  • 23:10 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Flow beta feature on frwikiversity T162022 (duration: 00m 46s)
  • 23:04 mutante: ocg1001 - apt-get clean for disk space
  • 22:36 mutante: ocg1003 started picking up jobs (mw-ocg-latexer) after it was enabled with gerrit:347781, ocg1001 was disabled in the same change. Also ganglia graphs confirm it. T84723 T161158
  • 22:22 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Enable alternate RevSlider slider on group0 T160410 (duration: 00m 45s)
  • 22:19 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=ocg1001.eqiad.wmnet
  • 22:17 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TwoColConflict BetaFeature on fiwiki (duration: 00m 46s)
  • 21:23 mobrovac@tin: Finished deploy [restbase/deploy@a4042a6]: Update the legal text in the API docs (duration: 06m 49s)
  • 21:17 mobrovac@tin: Started deploy [restbase/deploy@a4042a6]: Update the legal text in the API docs
  • 21:16 mobrovac@tin: Finished deploy [restbase/deploy@a4042a6]: Staging: Update the legal text in the API docs (duration: 03m 55s)
  • 21:12 mobrovac@tin: Started deploy [restbase/deploy@a4042a6]: Staging: Update the legal text in the API docs
  • 21:12 mobrovac@tin: Finished deploy [restbase/deploy@a4042a6]: Dev cluster: Update the legal text in the API docs (duration: 01m 37s)
  • 21:11 mobrovac@tin: Started deploy [restbase/deploy@a4042a6]: Dev cluster: Update the legal text in the API docs
  • 20:51 _joe_: killed running 'puppet agent t-v' on ruthenium
  • 19:20 ppchelko@tin: Finished deploy [electron-render/deploy@5492cdb]: Update to latest upstream, full deploy, attempt#2 T160764 (duration: 01m 25s)
  • 19:18 ppchelko@tin: Started deploy [electron-render/deploy@5492cdb]: Update to latest upstream, full deploy, attempt#2 T160764
  • 19:11 ppchelko@tin: Finished deploy [electron-render/deploy@5492cdb]: Update to latest upstream, full deploy, T160764 (duration: 03m 38s)
  • 19:08 ppchelko@tin: Started deploy [electron-render/deploy@5492cdb]: Update to latest upstream, full deploy, T160764
  • 19:08 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.20
  • 19:01 ppchelko@tin: Finished deploy [electron-render/deploy@5492cdb]: Update to latest upstream, canary on scb2001, attempt#3 T160764 (duration: 00m 52s)
  • 19:00 ppchelko@tin: Started deploy [electron-render/deploy@5492cdb]: Update to latest upstream, canary on scb2001, attempt#3 T160764
  • 18:34 elukey: restart hhvm on mw1165 (debug in /tmp/hhvm.5384.bt.)
  • 18:25 demon@tin: Finished scap: testwiki to wmf.20 to bootstrap (duration: 35m 27s)
  • 17:49 demon@tin: Started scap: testwiki to wmf.20 to bootstrap
  • 17:49 demon@tin: Pruned MediaWiki: 1.29.0-wmf.17 [keeping static files] (duration: 00m 16s)
  • 17:41 mobrovac@tin: Finished deploy [restbase/deploy@e470b9f]: Initial Scap3 config deploy - T116335 (duration: 10m 39s)
  • 17:30 mobrovac@tin: Started deploy [restbase/deploy@e470b9f]: Initial Scap3 config deploy - T116335
  • 17:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 (duration: 00m 57s)
  • 17:14 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Successfully completed
  • 17:08 mobrovac: restbase enabling back puppet for T116335
  • 17:07 mobrovac@tin: Finished deploy [restbase/deploy@e470b9f]: Staging: Initial Scap3 config deploy, take 2 - T116335 (duration: 02m 12s)
  • 17:06 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches
  • 17:06 marostegui: Deploy unscheduled alter table on db1044 (s3, image table) - T160415
  • 17:05 mobrovac@tin: Started deploy [restbase/deploy@e470b9f]: Staging: Initial Scap3 config deploy, take 2 - T116335
  • 17:05 ppchelko@tin: Finished deploy [electron-render/deploy@5492cdb]: Update to latest upstream, canary on scb2001, attempt#2 T160764 (duration: 03m 22s)
  • 17:04 marostegui: Deploy unscheduled alter table on db1015 (s3, image table) - T160415
  • 17:02 mobrovac@tin: Finished deploy [restbase/deploy@e470b9f]: Dev Cluster: Initial Scap3 config deploy, take 2 - T116335 (duration: 00m 58s)
  • 17:02 marostegui: Deploy unscheduled alter table on db1038 (s3, image table) - T160415
  • 17:02 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: nope, no wmf.19 for donatewiki. life is hard
  • 17:02 mobrovac@tin: Started deploy [restbase/deploy@e470b9f]: Dev Cluster: Initial Scap3 config deploy, take 2 - T116335
  • 17:01 ppchelko@tin: Started deploy [electron-render/deploy@5492cdb]: Update to latest upstream, canary on scb2001, attempt#2 T160764
  • 17:00 marostegui: Deploy unscheduled alter table on db1035 (s3, image table) - T160415
  • 16:58 marostegui: Deploy unscheduled alter table on db1077 (s3, image table) - T160415
  • 16:56 marostegui: Deploy unscheduled alter table on db1078 (s3, image table) - T160415
  • 16:54 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: donatewiki back to wmf.19. you put your left foot in, you put your left foot out...
  • 16:48 marostegui: Deploy unscheduled alter table on db1093 (adding pl_from index)
  • 16:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 (duration: 00m 42s)
  • 16:43 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t05_switch_traffic(eqiad, codfw) Successfully completed
  • 16:43 mobrovac@tin: Finished deploy [restbase/deploy@e470b9f]: Staging: Initial Scap3 config deploy - T116335 (duration: 01m 33s)
  • 16:42 ppchelko@tin: Finished deploy [electron-render/deploy@5492cdb]: Update to latest upstream, canary on scb2001 T160764 (duration: 04m 28s)
  • 16:41 mobrovac@tin: Started deploy [restbase/deploy@e470b9f]: Staging: Initial Scap3 config deploy - T116335
  • 16:40 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t05_switch_traffic(eqiad, codfw) Switch traffic flow to the appservers in the new datacenter
  • 16:37 ppchelko@tin: Started deploy [electron-render/deploy@5492cdb]: Update to latest upstream, canary on scb2001 T160764
  • 16:37 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: donatewiki still busted
  • 16:35 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: donatewiki back to wmf.19
  • 16:33 mobrovac@tin: Finished deploy [restbase/deploy@e470b9f]: Dev Cluster: Initial Scap3 config deploy - T116335 (duration: 01m 04s)
  • 16:32 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Successfully completed
  • 16:32 mobrovac@tin: Started deploy [restbase/deploy@e470b9f]: Dev Cluster: Initial Scap3 config deploy - T116335
  • 16:28 mobrovac: restbase disabling puppet for T116335
  • 16:27 demon@tin: Synchronized README: no-op, co-master sync (duration: 00m 43s)
  • 16:24 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches
  • 16:11 switchdc: (volans@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Successfully completed
  • 16:08 switchdc: (volans@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches
  • 16:08 volans: testing the codfw caches wipe+warm, take 2
  • 16:04 demon@tin: Synchronized scap/plugins/clean.py: syncing to both masters (duration: 00m 44s)
  • 15:56 switchdc: (volans@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Failed to execute
  • 15:54 switchdc: (volans@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches
  • 15:53 volans: testing the codfw caches wipe+warm: https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Phase_4.1_-_Wipe_caches T160178
  • 15:25 thcipriani@tin: Synchronized README: test sync for new scap version 3.5.5 (duration: 00m 59s)
  • 15:19 godog: upload scap 3.5.5-1 - T127762
  • 15:05 ema: upgrade cp4005 (cache_upload) to linux 4.9 T162029
  • 14:31 moritzm: powercycled restbase1007, stuck during reboot
  • 14:18 moritzm: upgrading restbase1007 to Linux 4.9
  • 13:55 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RCFilters beta feature on fawiki, ruwiki, trwiki, and frwiki (T144458) (duration: 00m 39s)
  • 13:54 ottomata: reimaging stat1004 as jessie
  • 13:53 akosiaris: upgrade puppet agent to 3.8 across the jessie fleet. Do that in stages, starting with parsoid hosts, then move on to the mw fleet. T162462
  • 13:51 akosiaris: upgrade puppet agent to 3.8 across the jessie fleet. Do that in stages, starting with parsoid hosts
  • 13:49 godog: roll-upgrade swift to 2.2.0 across eqiad machines - T162609
  • 13:45 hashar: Updating all Jenkins jobs using the git plugin due to JJB change cdfeb7b - https://phabricator.wikimedia.org/T162674
  • 13:39 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add autopatrolled group to svwiktionary (T161919) (duration: 00m 39s)
  • 13:34 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Increase default image thumbnail size on Finnish Wikipedia to 250px (T162376) (duration: 00m 39s)
  • 13:10 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Give sysops ability to promote users to eliminator at fawiki (T162396) (duration: 00m 39s)
  • 13:01 godog: roll-upgrade swift to 2.2.0 across codfw machines - T162609
  • 12:55 moritzm: powercycling wtp2013, stuck during reboot
  • 12:47 elukey: reimage mw2246 (Debian codfw videoscaler) to Trusty
  • 12:46 marostegui: Deploy schema change on db1069 (s7 instance) - T160390
  • 11:42 ema: upgrade cache_misc to linux 4.9 T162029
  • 11:33 elukey: resume reboot of analytics1040->1050 for kernel upgrades
  • 11:27 moritzm: wtp2* to Linux 4.9
  • 11:27 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: NOOP (Beta file only) - Remove redundant wmgUseRevisionSlider in InitialiseSettings-labs (duration: 00m 38s)
  • 11:09 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: NOOP - Remove redundant testwiki from wmgUseLinter (already has group0) (duration: 00m 39s)
  • 11:02 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: NOOP (Beta file only) - Fix some tabs (duration: 00m 39s)
  • 10:46 moritzm: upgrading wtp1020-wtp1024 to Linux 4.9
  • 10:13 moritzm: upgrading wtp1010-wtp1019 to Linux 4.9
  • 09:17 moritzm: install remaining pam updates from jessie point update
  • 09:11 godog: upgrade swift to 2.2.0 on ms-be2001 - T162609
  • 06:58 moritzm: restarted cassandra-a on restbase2004, crashed with "out of heap memory"
  • 06:50 marostegui: Deploy alter table enwiki.revision dbstore1002 - T132416
  • 06:48 moritzm: installing jasper security updates
  • 06:30 elukey: restart hhvm on mw1299 - dump debug in /tmp/hhvm.84379.bt
  • 06:28 marostegui: Deploy alter table enwiki.revision db1072 - T132416
  • 06:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 - T132416 (duration: 00m 43s)
  • 06:07 marostegui: Deploy schema change on db1041 (eqiad master) (s7) - T160390
  • 06:02 marostegui: Deploy schema change labsdb1003 (s7) - T160390
  • 06:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 - T132416 (duration: 00m 39s)
  • 02:59 bblack: jessie recdns software upgrades complete
  • 02:52 bblack@neodymium: conftool action : set/pooled=yes; selector: name=maerlant.wikimedia.org,service=pdns_recursor
  • 02:51 bblack: upgrading maerlant to pdns-recursor 4.x
  • 02:50 bblack@neodymium: conftool action : set/pooled=no; selector: name=maerlant.wikimedia.org,service=pdns_recursor
  • 02:48 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Apr 11 02:48:56 UTC 2017 (duration 5m 43s)
  • 02:43 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 07m 16s)
  • 02:37 bblack@neodymium: conftool action : set/pooled=yes; selector: name=chromium.wikimedia.org,service=pdns_recursor
  • 02:32 bblack: upgrading chromium to pdns-recursor 4.x
  • 02:31 bblack@neodymium: conftool action : set/pooled=no; selector: name=chromium.wikimedia.org,service=pdns_recursor
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 07m 47s)
  • 02:16 bblack@puppetmaster1001: conftool action : set/pooled=yes; selector: service=pdns_recursor,name=nescio.wikimedia.org
  • 02:13 bblack: upgrading nescio to pdns-recursor 4.x
  • 02:06 bblack: jessie-recdns: unpausing upgrade process...
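
The "conftool action" entries above all share one shape: an action, then a comma-separated selector. A small sketch that splits such a message into its parts and rebuilds a command line in the `confctl select <selector> <action>` form quoted in the 2017-04-10 23:43 entry; the function names are hypothetical, and the command form is taken from that log entry rather than from conftool's documentation:

```python
def parse_conftool(entry):
    """Split a 'conftool action' log message into (action, selector tags)."""
    action_part, selector_part = entry.split("; selector: ")
    action = action_part.split(" : ", 1)[1]
    selector = dict(pair.split("=", 1) for pair in selector_part.split(","))
    return action, selector


def to_confctl(action, selector):
    """Format the parsed pieces like the confctl command quoted in the log."""
    tags = ",".join(f"{key}={value}" for key, value in selector.items())
    return f"confctl select {tags} {action}"


# Message text copied from the 02:50 entry above.
entry = (
    "conftool action : set/pooled=no; "
    "selector: name=maerlant.wikimedia.org,service=pdns_recursor"
)

action, selector = parse_conftool(entry)
print(action)
print(selector)
print(to_confctl(action, selector))
```

Parsing the selector into a dictionary makes it straightforward to group the depool/repool pairs by host when reviewing a rolling upgrade like the pdns-recursor one logged here.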

2017-04-10

  • 23:43 bblack: jessie-recdns: upgrade to pdns-recursor 4.x paused - hydrogen updated and in-service; chromium/nescio/maerlant still puppet-disabled. Going to leave things in this state for a while. If something seems amiss, hydrogen can be re-depooled via confctl: confctl select name=hydrogen.wikimedia.org,service=pdns_recursor set/pooled=no
  • 23:34 bblack@neodymium: conftool action : set/pooled=yes; selector: name=hydrogen.wikimedia.org,service=pdns_recursor
  • 23:33 bblack: upgrading hydrogen to pdns-recursor 4.x
  • 23:25 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Set ORES thresholds for fawiki, ruwiki, trwiki (duration: 00m 39s)
  • 23:18 bblack@neodymium: conftool action : set/pooled=no; selector: name=hydrogen.wikimedia.org,service=pdns_recursor
  • 23:04 bblack: puppet disabled on jessie recdns (maerlant, nescio, hydrogen, chromium) for complex upgrade process ( https://gerrit.wikimedia.org/r/#/c/346937/ )
  • 22:45 dapatrick: Deployed patch for T162621 to wmf18 and wmf19
  • 22:04 ejegg: updated CiviCRM from b6c8f3e to 908b9c1
  • 21:37 ejegg: updated payments-wiki from b5bcfa1 to 0b396a3
  • 21:33 gehel: logstash upgrade on all logstash1* nodes completed- T161908
  • 21:31 gehel: upgrading logstash on logstash1003 - T161908
  • 21:22 gehel: upgrading logstash on logstash1002 - T161908
  • 21:17 gehel: logstash upgrade on logstash1001 completed - T161908
  • 21:13 gehel: running puppet on logstash1001 to deploy new logstash plugins - T161908
  • 20:45 ejegg: updated payments-wiki from 9622a4b to b5bcfa1
  • 20:29 gehel: upgrading logstash on logstash1001 - T161908
  • 20:27 ebernhardson: deployed new logstash plugins to logstash100[123]
  • 20:16 bsitzmann@tin: Finished deploy [mobileapps/deploy@9bc8c07]: Update mobileapps to 1695900 (duration: 05m 27s)
  • 20:10 bsitzmann@tin: Started deploy [mobileapps/deploy@9bc8c07]: Update mobileapps to 1695900
  • 19:51 andrewbogott: upgrading qemu and oslo packages on labvirt1002
  • 19:38 gehel: disabling puppet on logstash1* - T161908
  • 19:38 gehel: starting logstash upgrade - some log messages will be lost! - T161908
  • 18:22 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable ORES review tool in hewiki T161621 (duration: 00m 39s)
  • 18:12 thcipriani: mwscript extensions/ORES/maintenance/CheckModelVersions.php hewiki && mwscript extensions/ORES/maintenance/PopulateDatabase.php hewiki
  • 18:06 thcipriani: create ores tables on hewiki
  • 17:51 elukey: restore Hadoop masters to analytics1001
  • 17:16 papaul: testing lvs2002 after mainboard replacement
  • 17:06 gehel@tin: Finished deploy [wdqs/wdqs@1cfbd8d]: (no justification provided) (duration: 01m 22s)
  • 17:04 gehel@tin: Started deploy [wdqs/wdqs@1cfbd8d]: (no justification provided)
  • 16:48 _joe_: not really restarting parsoid, still testing switchdc
  • 16:45 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_restart_parsoid(codfw, eqiad) Rolling restart parsoid in eqiad and codfw
  • 16:02 mobrovac@tin: Finished deploy [restbase/deploy@2c70843]: Initial deployment with Scap3 (duration: 07m 52s)
  • 15:58 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Successfully completed
  • 15:58 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Disabling puppet on MediaWiki jobrunners and videoscalers
  • 15:55 mobrovac@tin: Started deploy [restbase/deploy@2c70843]: Initial deployment with Scap3
  • 15:47 cmjohnson1: troubleshooting link cr2-eqiad:xe-3/0/1 {#2014 to asw-b-eqiad:xe-1/1/2 per T162199
  • 15:35 mobrovac@tin: Finished deploy [restbase/deploy@a8d4d02]: Initial deployment with Scap3 (duration: 00m 10s)
  • 15:35 mobrovac@tin: Started deploy [restbase/deploy@a8d4d02]: Initial deployment with Scap3
  • 15:33 mobrovac: restbase enabling back puppet in prod
  • 15:31 mobrovac@tin: Finished deploy [restbase/deploy@a8d4d02]: Initial deployment with Scap3 on staging (duration: 03m 31s)
  • 15:28 mobrovac@tin: Started deploy [restbase/deploy@a8d4d02]: Initial deployment with Scap3 on staging
  • 15:19 mobrovac@tin: Finished deploy [restbase/deploy@a8d4d02]: (no justification provided) (duration: 01m 22s)
  • 15:18 mobrovac@tin: Started deploy [restbase/deploy@a8d4d02]: (no justification provided)
  • 15:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 after maintenance with full weight (duration: 00m 39s)
  • 15:05 mobrovac: restbase disabling puppet for upgrade to scap3 deploys
  • 15:01 andrewbogott: disabling puppet on labcontrol1001 to raise log levels
  • 14:58 moritzm: upgrading wtp1006-wtp1009 to Linux 4.9
  • 14:52 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Successfully completed
  • 14:52 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Disabling puppet on MediaWiki jobrunners and videoscalers
  • 14:48 marostegui: Deploy alter table enwiki.revision db1073 - T132416
  • 14:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T132416 (duration: 00m 39s)
  • 14:47 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Failed to execute
  • 14:46 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Start MediaWiki maintenance in the new master DC
  • 14:45 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t09_restore_ttl(codfw, eqiad) Successfully completed
  • 14:45 ema: upgrade cache_maps to linux 4.9 T162029
  • 14:45 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_restore_ttl(codfw, eqiad) Restore the TTL of all the MediaWiki discovery records
  • 14:45 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Failed to execute
  • 14:45 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Start MediaWiki maintenance in the new master DC
  • 14:39 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Successfully completed
  • 14:39 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Disabling puppet on MediaWiki jobrunners and videoscalers
  • 14:31 gehel: deploying new postgresql replication check, might generate a few icinga alerts - T162345
  • 14:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1028 - T160390 (duration: 00m 38s)
  • 14:05 elukey: reimage analytics1001 to Debian Jessie
  • 13:49 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 after maintenance with low weight (duration: 00m 38s)
  • 13:41 moritzm: upgrading wtp1002-wtp1005 to Linux 4.9
  • 13:30 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Set wgTranslateNumerals false on bhwiki - T160098 (duration: 00m 40s)
  • 13:26 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Create editprotected right on ptwikinews - T162577 (duration: 00m 40s)
  • 13:19 elukey: reboot analytics1040->1050 to pick up the new kernel
  • 13:17 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Increase default thumb size to 250px at nowiki - T155892 (duration: 00m 45s)
  • 13:16 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: pagePreviews: Enable NavPopups gadget detection - T160081 (duration: 00m 40s)
  • 13:00 twentyafterfour: stopped search indexer on iridium to lighten load on m3 databases.
  • 12:55 marostegui: Run pt-table-checksum on s4 - T162593
  • 12:40 akosiaris: upload apertium-spa-cat_2.0.0~r77288-2+wmf1 on apt.wikimedia.org jessie-wikimedia/main
  • 11:11 akosiaris: upload puppet_3.8.5-2~bpo8+1 on apt.wikimedia.org jessie-wikimedia/main
  • 11:00 akosiaris: upload apertium-cat_2.0.0~r77286-1+wmf1, apertium-spa_1.0.0~r77293-1+wmf1 on apt.wikimedia.org/jessie-wikimedia
  • 10:58 gehel: starting load test on elastic2020 - T149006
  • 10:48 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1004.eqiad.wmnet
  • 10:32 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1004.eqiad.wmnet
  • 10:31 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1003.eqiad.wmnet
  • 10:23 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1003.eqiad.wmnet
  • 10:23 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1002.eqiad.wmnet
  • 10:12 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1002.eqiad.wmnet
  • 10:11 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1001.eqiad.wmnet
  • 10:03 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1001.eqiad.wmnet
  • 10:02 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2004.codfw.wmnet
  • 10:01 gehel: rolling restart of maps1* (eqiad) cluster
  • 09:52 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2004.codfw.wmnet
  • 09:52 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2003.codfw.wmnet
  • 09:44 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2003.codfw.wmnet
  • 09:44 XioNoX: all interfaces back up on cr2-esams, BGP sessions up as well T162239
  • 09:44 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2002.codfw.wmnet
  • 09:33 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2002.codfw.wmnet
  • 09:29 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2001.codfw.wmnet
  • 09:18 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2001.codfw.wmnet
  • 09:17 XioNoX: remote hands work started to replace the FPC on cr2-esams T162239
  • 09:16 gehel: rolling restart of maps2* cluster
  • 08:52 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=wdqs
  • 08:51 godog: swift codfw-prod: bump ms-be2028 ms-be2039 object weight to 4000 - T158337
  • 08:48 gehel: reimage elastic2020 - T149006
  • 08:43 gehel: rolling restart of maps-test cluster
  • 08:39 elukey: manual failover of Hadoop master daemons from analytics1001 to analytics1002 (T160333)
  • 07:48 _joe_: testing a dry-run of the switchdc software on sarin
  • 07:02 moritzm: installing pam updates from jessie point update
  • 06:26 marostegui: Deploy schema change labsdb1001 (s7) - T160390
  • 06:24 marostegui: Deploy schema change db1028 (s7) - T160390
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1028 - T160390 (duration: 00m 39s)
  • 06:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1034 - T160390 (duration: 00m 38s)
  • 06:07 marostegui: Deploy schema change db1034 (s7) - T160390
  • 06:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add tempdb2001 to x1 as a slave - T162290 (duration: 00m 38s)
  • 06:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1034 - T160390 (duration: 00m 39s)
  • 02:49 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Apr 10 02:49:06 UTC 2017 (duration 5m 40s)
  • 02:43 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 07m 32s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 08m 17s)

2017-04-09

  • 02:59 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Apr 9 02:59:29 UTC 2017 (duration 5m 35s)
  • 02:53 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 07m 36s)
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 07m 56s)

2017-04-08

  • 20:56 bblack: removed varnishkafka logs and daemon.log.1 on cp1052 to free disk space and clear alert
  • 17:43 chasemp: service nova-compute restart labvirt1002
  • 17:36 chasemp: nova reset-state on 15 nodepool nodes stuck in deletion, and force-delete
  • 17:29 chasemp: manually delete on labcontrol all nodepool instances in delete state
  • 17:25 chasemp: openstack server delete 970a86ce-2549-4cf3-be91-1f8558ab1b32 (admin-monitoring stuck in build)
  • 17:21 chasemp: restart rabbitmq on labcontrol1001
  • 17:20 chasemp: restart nova-api on labnet
  • 16:00 bblack: banning obj.http.Content-Type ~ text/html on cache_upload
  • 15:46 bblack: banning obj.http.X-Orig-Content-Type !~ . on cache_upload in ulsfo
  • 14:56 bblack: banning obj.http.X-Orig-Content-Type !~ . on cache_upload in esams
  • 13:54 bblack: banning obj.http.X-Orig-Content-Type !~ . on cache_upload in codfw
  • 13:27 bblack: banning obj.http.X-Orig-Content-Type !~ . on cache_upload in eqiad
  • 11:55 bblack: banning obj.http.Content-Type ~ text/html on cache_upload
  • 10:55 jynus: setting labsdb1001 and labsdb1003 in read only mode
  • 09:55 reedy@tin: Finished scap: Rebuild EP l10n cache for namespace aliases T162481 (duration: 79m 11s)
  • 08:36 reedy@tin: Started scap: Rebuild EP l10n cache for namespace aliases T162481
  • 08:34 reedy@tin: Synchronized wmf-config/CommonSettings.php: T162481 (duration: 00m 39s)
  • 08:33 reedy@tin: Synchronized wmf-config/extension-list: T162481 (duration: 00m 40s)
  • 02:56 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Apr 8 02:56:37 UTC 2017 (duration 5m 33s)
  • 02:51 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 08m 03s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 08m 13s)

2017-04-07

  • 23:16 mutante: gerrit2001 - deleting netmon1001 backup (/srv/netmon1001), stop rsyncd, remove rsyncd config (T125020)
  • 23:06 ejegg: updated DjangoBannerStats from 220f80e to 9e6b117
  • 22:18 reedy@tin: Synchronized php-1.29.0-wmf.19/extensions/EducationProgram/EducationProgram.php: Load wgExtensionMessagesFiles in PHP entry point for mergeMessageLists T162481 (duration: 00m 49s)
  • 20:07 demon@tin: Synchronized README: no-op, testing master sync speed now (duration: 00m 38s)
  • 20:05 demon@tin: Synchronized README: no-op, co-master sync (duration: 00m 39s)
  • 19:41 demon@tin: Finished scap: no-op, final history sync (duration: 23m 05s)
  • 19:18 demon@tin: Started scap: no-op, final history sync
  • 18:40 demon@tin: Synchronized php-1.29.0-wmf.19/includes/specials/: no-op, cleaning up history (duration: 01m 00s)
  • 18:16 demon@tin: Synchronized php-1.29.0-wmf.19/includes/api/: No-op, cleaning up git history (duration: 00m 54s)
  • 17:17 demon@tin: Finished scap: no-op, cleaning up wmf.19 history (duration: 25m 07s)
  • 16:51 demon@tin: Started scap: no-op, cleaning up wmf.19 history
  • 16:29 demon@tin: Synchronized php-1.29.0-wmf.19/extensions/SyntaxHighlight_GeSHi/: no-op, cleaning up history (duration: 00m 44s)
  • 15:32 gehel: reimaging elastic2020 - T149006
  • 14:58 marostegui: Deploy schema change dbstore1001 (s7 wikis) - T160390
  • 14:40 marostegui: Deploy schema change db1033 (already depooled) (s7) - T160390
  • 14:13 elukey: restart hadoop-hdfs-namenode on an1002 (Hadoop Master standby) to pick up new jvm settings
  • 14:07 elukey: restart hadoop-mapreduce-historyserver on an1001 to pick up the new jvm settings
  • 14:02 switchdc: (oblivian@sarin) Executing task switchdc.stages.t00_reduce_ttl(eqiad, codfw): Reduce the TTL of all the MediaWiki discovery records
  • 14:01 _joe_: running tests of the switchdc automation in dry-run mode
  • 14:01 switchdc: (oblivian@sarin) Executing task switchdc.stages.t00_disable_puppet(eqiad, codfw): Stop puppet execution on maintenance, jobqueues
  • 12:52 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Enable alternate RevisionSlider slider on beta BETA ONLY (duration: 00m 51s)
  • 12:48 bblack: banning cache_upload obj.http.Content-type ~ text/html
  • 12:46 bblack: banning cache_upload obj.http.Content-type == text/html
  • 12:45 bblack: banning cache_upload obj.http.Content-type ~ text
  • 10:53 elukey: increase Redis connection timeout manually (.3s -> .5s) on mw1306 as performance test - T125735
  • 09:22 marostegui: Deploy schema change db1062 (already depooled) (s7) - T160390
  • 08:15 moritzm: upgrade mw1262-mw1265 to HHVM 3.18.2
  • 07:58 elukey: added "notifempty" to /etc/logrotate.d/nginx on cp1008, it should remove cronspam for access_pipe.log.1.gz
  • 07:51 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=wdqs
  • 07:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 - T160390 (duration: 00m 50s)
  • 07:50 marostegui: Deploy schema change db1039 (already depooled) (s7) - T160390
  • 07:21 jynus: reimporting several damaged db tables on s2 T154485
  • 07:17 ariel@tin: Finished deploy [dumps/dumps@af61d8d]: I mean: handle page range generation for wikis with PAGES with hundreds of thousands of revisions (duration: 00m 02s)
  • 07:17 ariel@tin: Started deploy [dumps/dumps@af61d8d]: I mean: handle page range generation for wikis with PAGES with hundreds of thousands of revisions
  • 07:16 ariel@tin: Finished deploy [dumps/dumps@af61d8d]: handle page range generation for wikis with hundreds of thousands of revisions (duration: 00m 03s)
  • 07:16 ariel@tin: Started deploy [dumps/dumps@af61d8d]: handle page range generation for wikis with hundreds of thousands of revisions
  • 06:06 marostegui: Deploy schema change db1094 (s7) - T160390
  • 06:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 - T160390 (duration: 00m 49s)
  • 03:04 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Apr 7 03:04:52 UTC 2017 (duration 5m 13s)
  • 02:59 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 14m 11s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 09m 54s)

2017-04-06

  • 23:14 dereckson@tin: Synchronized php-1.29.0-wmf.19/extensions/Popups: actions: Correctly delay FETCH_COMPLETE (Gerrit:346832) (duration: 00m 41s)
  • 22:23 maxsem@tin: Finished deploy [tilerator/deploy@9cf2338]: https://gerrit.wikimedia.org/r/#/c/346913/ to test hosts only (duration: 00m 18s)
  • 22:22 maxsem@tin: Started deploy [tilerator/deploy@9cf2338]: https://gerrit.wikimedia.org/r/#/c/346913/ to test hosts only
  • 22:15 ejegg: re-enabled adyen and paypal SmashPig job runners
  • 22:07 ejegg: re-enabled two main dedupe jobs and orphan rectifier
  • afk: set thank-you batch size back to 400
  • 20:52 awight: change thank_you_batch from 400->1
  • 19:42 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Fix ORES threshold settings again (duration: 00m 40s)
  • 19:10 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.19
  • 18:48 legoktm@tin: Synchronized php-1.29.0-wmf.19/extensions/Linter/includes/RecordLintJob.php: Split statsd metrics by wiki - https://gerrit.wikimedia.org/r/#/c/346807 (duration: 00m 42s)
  • 18:17 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgLinterStatsdSampleFactor (duration: 00m 45s)
  • 18:10 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Adjust plwiki, ptwiki ORES thresholds for new model deployment (duration: 00m 40s)
  • 18:00 switchdc: (volans@neodymium) Test switchdc IRC/SAL announcement (2)
  • 17:57 switchdc: (volans@neodymium) Test switchdc IRC/SAL announcement
  • 17:46 maxsem@tin: Finished deploy [tilerator/deploy@71aed11]: https://gerrit.wikimedia.org/r/#/c/346782/ to test hosts (duration: 00m 19s)
  • 17:45 maxsem@tin: Started deploy [tilerator/deploy@71aed11]: https://gerrit.wikimedia.org/r/#/c/346782/ to test hosts
  • 17:41 halfak@tin: Finished deploy [ores/deploy@3396b64]: T161748 (duration: 21m 08s)
  • 17:20 halfak@tin: Started deploy [ores/deploy@3396b64]: T161748
  • 17:19 arlolra@tin: Finished deploy [parsoid/deploy@b5c2a2b]: Updating Parsoid to 56ae82bb (duration: 08m 29s)
  • 17:13 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy Linter to medium wikis too - T148609 (duration: 00m 40s)
  • 17:11 arlolra@tin: Started deploy [parsoid/deploy@b5c2a2b]: Updating Parsoid to 56ae82bb
  • 16:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T160390 (duration: 00m 43s)
  • 16:18 elukey: restart hhvm on mw1227 - debug in /tmp/hhvm.30097.bt. - threads stuck in HPHP::Treadmill::getAgeOldestRequest
  • 16:17 hoo@tin: Synchronized wmf-config/Wikibase-production.php: Try using redisLockManager for test.wikidata.org (T159828) (duration: 00m 39s)
  • 16:11 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Temporarily enable change dispatch logging on testwikidata (duration: 00m 45s)
  • 15:48 hoo@tin: Synchronized wmf-config/Wikibase.php: Fix Wikibase site groups for testwiki and test2wiki (duration: 00m 40s)
  • 15:36 hoo@tin: Synchronized wmf-config/Wikibase.php: Don't set removed Wikibase client settings (duration: 00m 40s)
  • 15:27 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add tempdb2001 to config files - T162290 (duration: 00m 40s)
  • 15:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add tempdb2001 to config files - T162290 (duration: 00m 39s)
  • 14:55 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 42s)
  • 14:51 hoo: Restarted apache on mwdebug1001 in order to test a potential CACHE_ACCEL issue
  • 14:46 hoo@tin: Synchronized wmf-config/: Don't use "enwiki" as Wikibase site id on testwiki and test2wiki (T94416) (duration: 01m 08s)
  • 14:12 hoo@tin: Synchronized wmf-config/Wikibase.php: Add testwiki and test2wiki to "specialSiteLinkGroups" on testwikidata (T94416) (duration: 00m 40s)
  • 14:04 elukey: reimage analytics1002 to Debian Jessie (Hadoop Master Node standby)
  • 13:44 gehel: re-generating tiles for tasmania on maps codfw cluster
  • 13:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T160390 (duration: 00m 39s)
  • 13:39 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1079 - T160390 (duration: 00m 43s)
  • 13:39 marostegui: Deploy schema change db1079 (s7 wikis) - T160390
  • 13:34 hashar: European SWAT completed
  • 13:30 marostegui: Deploy schema change dbstore1002 (s7 wikis) - T160390
  • 13:20 hashar@tin: Synchronized php-1.29.0-wmf.19/extensions/Popups: renderer: Pass event to behavior for processing - T162324 (duration: 00m 51s)
  • 12:51 ema: upgrade cp3007 to linux 4.9 T162029
  • 12:50 moritzm: upgraded mw1261 to HHVM 3.18.2 with cherrypicked fix for stat_cache deadlock, now running with stat_cache enabled again
  • 12:39 ema: rebooting cp2006 again to check for potential issues bringing up network ifaces / loading intel_uncore T162029
  • 12:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1081 original weight - T161088 (duration: 00m 40s)
  • 12:28 ema: cp2009 stuck rebooting, powercycled
  • 12:21 ema: upgrade cp2009 to linux 4.9 T162029
  • 12:16 moritzm: uploaded HHVM 3.18.2 to jessie-wikimedia/experimental
  • 11:51 ema: upgrade cp2006 to linux 4.9 T162029
  • 11:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1081 weight - T161088 (duration: 00m 40s)
  • 10:59 _joe_: running some tests for the switchdc automation
  • 09:33 moritzm: installing freetype security updates on trusty
  • 08:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1081 weight - T161088 (duration: 00m 39s)
  • 08:41 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,cluster=wdqs
  • 08:40 moritzm: installing glibc updates on trusty
  • 08:37 gehel: shutting down wdqs codfw for data reimport - T162111
  • 08:34 hashar: starting Jenkins on contint1001
  • 08:27 moritzm: rebooting contint1001 to Linux 4.9
  • 08:02 elukey: restart hhvm on mw1194 - dump debug in /tmp/hhvm.1692.bt. - threads stuck in HPHP::Treadmill::getAgeOldestRequest
  • 07:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1081 weight - T161088 (duration: 00m 39s)
  • 07:32 ema: cache_upload: ban all objects with content-type ~ "^text" T162035
  • 07:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081 with low weight - T161088 (duration: 00m 48s)
  • 06:29 elukey: restart hhvm on mw1165 (jobrunner) - dump debug in /tmp/hhvm.19449.bt. - threads stuck in HPHP::Treadmill::getAgeOldestRequest
  • 06:09 marostegui: Deploy schema change db2029 (s7 codfw master) - T160390
  • 06:02 marostegui: Configure and start replication on db1081 after the defragment - T161088
  • 05:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2047 - T160390 (duration: 00m 40s)
  • 05:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1070 after compression - T153743 (duration: 00m 51s)
  • 04:06 twentyafterfour: restarting apache2 on iridium to apply a minor hotfix
  • 03:06 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Apr 6 03:06:35 UTC 2017 (duration 5m 59s)
  • 03:00 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 15m 46s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 10m 28s)
  • 01:45 mutante: restarting gerrit to pick up config change gerrit:346180 (disable MD5)
  • 00:39 mutante: install1002/2002: deleting /srv/autoinstall/precise.cfg
  • 00:37 mutante: install1002/2002: deleting /srv/tftboot/precise-installer | puppetmaster1002/2001: deleting /var/lib/puppet/volatile/tftpboot/precise-installer (clean up after gerrit:345549)
  • 00:25 twentyafterfour: Phabricator upgrade completed uneventfully, other than the undisputable fact that the new search functionality is awesome.
  • 00:21 mutante: added #wikimedia-traffic channel to stashbot config, test
  • 00:19 mutante: stopping and starting stashbot for config change - added #wikimedia-traffic channel
  • 00:19 twentyafterfour: updating phabricator, the service will be offline for just a few moments.
  • 00:08 twentyafterfour: preparing to update Phabricator to tag release/2017-04-05/1 #phab-2017-04-05

2017-04-05

  • 23:29 thcipriani@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ru.svg: SWAT: Update Russian Wikipedia logo T162036 (duration: 00m 40s)
  • 23:18 demon@tin: Synchronized wmf-config/CommonSettings.php: unbreak dashiki again (duration: 00m 40s)
  • 23:13 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy Page previews to stable on Hungarian and Hebrew Wikipedias T162162 (duration: 00m 40s)
  • 23:12 demon@tin: Synchronized php-1.29.0-wmf.19/extensions/Dashiki/: swattttttt (duration: 00m 41s)
  • 22:37 mobrovac: restbase deploying a8d4d027
  • 22:12 ppchelko@tin: Finished deploy [trending-edits/deploy@46544de]: Correctly calculate since parameter and allow to change decay for debugging, attempt 2 (duration: 07m 06s)
  • 22:05 ppchelko@tin: Started deploy [trending-edits/deploy@46544de]: Correctly calculate since parameter and allow to change decay for debugging, attempt 2
  • 22:04 ppchelko@tin: Finished deploy [trending-edits/deploy@46544de]: Correctly calculate since parameter and allow to change decay for debugging (duration: 02m 29s)
  • 22:02 ppchelko@tin: Started deploy [trending-edits/deploy@46544de]: Correctly calculate since parameter and allow to change decay for debugging
  • 21:58 demon@tin: Synchronized wmf-config/CommonSettings.php: bump video transcode timeouts, brion made me do it (duration: 00m 40s)
  • 20:53 ppchelko@tin: Finished deploy [trending-edits/deploy@475a5c0]: Fix edit scorer (duration: 05m 34s)
  • 20:47 ppchelko@tin: Started deploy [trending-edits/deploy@475a5c0]: Fix edit scorer
  • 20:44 ppchelko@tin: Finished deploy [trending-edits/deploy@475a5c0]: Fix edit scorer (duration: 02m 51s)
  • 20:41 ppchelko@tin: Started deploy [trending-edits/deploy@475a5c0]: Fix edit scorer
  • 20:27 arlolra: Updated Parsoid to 32b7c677 (T112043, T161936)
  • 20:18 arlolra@tin: Finished deploy [parsoid/deploy@f2d4eee]: Updating Parsoid to 32b7c677 (duration: 11m 26s)
  • 20:07 arlolra@tin: Started deploy [parsoid/deploy@f2d4eee]: Updating Parsoid to 32b7c677
  • 19:55 ppchelko@tin: Finished deploy [trending-edits/deploy@d8ca758]: Providing a debug endpoint (duration: 04m 59s)
  • 19:50 ppchelko@tin: Started deploy [trending-edits/deploy@d8ca758]: Providing a debug endpoint
  • 19:50 ppchelko@tin: Finished deploy [trending-edits/deploy@d8ca758]: Providing a debug endpoint (duration: 07m 56s)
  • 19:44 XioNoX: pushing https://www.irccloud.com/pastebin/Kecy61aZ/ to cr1/2.codfw for T162099
  • 19:43 awight: reenabled NL Fundraising campaigns
  • 19:42 ppchelko@tin: Started deploy [trending-edits/deploy@d8ca758]: Providing a debug endpoint
  • 19:38 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: roll back donatewiki to wmf.18
  • 19:37 awight: disabled NL campaigns per T162300
  • 19:12 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.19
  • 18:04 mutante: lvs2002 - power off via mgmt (it was down but still showed power as on)
  • 18:02 awight: rerunning paypal_audit
  • 16:57 moritzm: rearmed keyholder on mira after reboot
  • 15:20 elukey: playing with hhvm settings on mwdebug1002
  • 13:05 hashar@tin: Synchronized wmf-config/throttle.php: Add new throttle rule - T162089 (duration: 00m 40s)
  • 12:57 elukey: reimage analytics1035 (journal node) to Debian Jessie
  • 12:44 marostegui: Deploy schema change db2047 (s7) - T160390
  • 12:44 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2047 - T160390 (duration: 00m 41s)
  • 12:40 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2054 - T160390 (duration: 00m 44s)
  • 12:04 moritzm: upgrade remaining ca-certificates from jessie point update
  • 12:00 volans: re-enabled puppet on nitrogen/nihal/einsteinium, restarted ircecho
  • 11:42 volans: disabling ircecho for the merge of gerrit/346110 ( T159163 ) and postgres upgrade
  • 11:18 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 for maintenance (duration: 00m 40s)
  • 09:48 volans: deleted a third swift thumb that was making swiftrepl stuck in a loop: T162122
  • 09:11 elukey: reimage analytics1057 to Debian Jessie
  • 09:04 volans: deleted the 2 swift thumbs that were making swiftrepl stuck in a loop: T162122
  • 08:43 hoo: Ran scap pull on mwdebug1001 to revert local changes to Wikibase maintenance scripts
  • 08:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1034 after maintenance (duration: 00m 40s)
  • 07:44 marostegui: Migrate dbstore1002 enwiki.page and enwiki.categorylinks from TokuDB to InnoDB+compression - T159430
  • 06:56 marostegui: Stop replication on db1081 for maintenance - T161088
  • 06:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 - T161088 (duration: 00m 39s)
  • 06:55 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1081 - T161088 (duration: 00m 39s)
  • 06:36 elukey: restart hhvm on mw1288 (hhvm-dump-debug in /tmp/hhvm.92520.bt.)
  • 06:33 elukey: restart hhvm on mw1223 (hhvm-dump-debug in /tmp/hhvm.2164.bt.)
  • 06:22 marostegui: Deploy schema change db2054 (s7) - https://phabricator.wikimedia.org/T160390
  • 06:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2054 - T160390 (duration: 00m 43s)
  • 06:07 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2061 - T160390 (duration: 00m 40s)
  • 03:03 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Apr 5 03:03:18 UTC 2017 (duration 5m 53s)
  • 02:57 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 07m 22s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 08m 47s)
  • 01:27 tstarling@tin: Synchronized php-1.29.0-wmf.18/extensions/ParserMigration/includes: scap test only, no code changes (duration: 00m 39s)
  • 01:26 tstarling@tin: Synchronized php-1.29.0-wmf.18/extensions/ParserMigration/includes: scap test only, no code changes (duration: 00m 40s)
  • 01:17 demon@tin: Synchronized scap/plugins/clean.py: fixes (duration: 00m 41s)
  • 00:57 demon@tin: Finished scap: wmf.14 again, testing testing (duration: 26m 48s)
  • 00:30 demon@tin: Started scap: wmf.14 again, testing testing
  • 00:29 tstarling@tin: Synchronized php-1.29.0-wmf.18/extensions/ParserMigration/includes: scap test only, no code changes (duration: 01m 21s)
  • 00:08 tstarling@tin: Synchronized php-1.29.0-wmf.18/extensions/ParserMigration/includes/MigrationEditPage.php: for bug fix gerrit 346478 (duration: 00m 56s)

2017-04-04

  • 23:55 tstarling@tin: Synchronized php-1.29.0-wmf.18/extensions/ParserMigration: (no justification provided) (duration: 00m 39s)
  • 23:50 reedy@tin: Synchronized php-1.29.0-wmf.18/extensions/Quiz: Revert "Start implementing Quiz generation using TemplateParser" (duration: 00m 42s)
  • 23:31 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Prepare for related pages config change (T160076) and set $wgOresFiltersThresholds on plwiki and ptwiki (duration: 00m 41s)
  • 23:29 jynus: unscheduled restart of dbstore1002 T162212
  • 23:19 demon@tin: Finished scap: re-syncing old wmf.14-16 branches...cleaned up a little too much (duration: 44m 32s)
  • 22:34 demon@tin: Started scap: re-syncing old wmf.14-16 branches...cleaned up a little too much
  • 22:01 mobrovac: SCB all services updated to use the new service-runner DNS caching
  • 22:00 mobrovac@tin: Finished deploy [trending-edits/deploy@5cc3969]: Bump service-runner to pick up new DNS caching (duration: 06m 40s)
  • 21:55 mobrovac@tin: Finished deploy [graphoid/deploy@5fc26cb]: Bump service-runner to pick up new DNS caching (duration: 02m 15s)
  • 21:54 mobrovac@tin: Started deploy [trending-edits/deploy@5cc3969]: Bump service-runner to pick up new DNS caching
  • 21:53 mobrovac@tin: Started deploy [graphoid/deploy@5fc26cb]: Bump service-runner to pick up new DNS caching
  • 21:52 mobrovac@tin: Finished deploy [mobileapps/deploy@b93488f]: Bump service-runner to pick up new DNS caching (duration: 02m 43s)
  • 21:49 mobrovac@tin: Started deploy [mobileapps/deploy@b93488f]: Bump service-runner to pick up new DNS caching
  • 21:48 mobrovac@tin: Finished deploy [cxserver/deploy@b4184d3]: Bump service-runner to pick up new DNS caching (duration: 03m 37s)
  • 21:45 mobrovac@tin: Started deploy [cxserver/deploy@b4184d3]: Bump service-runner to pick up new DNS caching
  • 21:44 mobrovac@tin: Finished deploy [mathoid/deploy@4eb6d9d]: Bump service-runner to pick up new DNS caching (duration: 03m 27s)
  • 21:40 mobrovac@tin: Started deploy [mathoid/deploy@4eb6d9d]: Bump service-runner to pick up new DNS caching
  • 21:36 mobrovac@tin: Finished deploy [eventstreams/deploy@cf892f4]: Bump service-runner to pick up new DNS caching (duration: 02m 04s)
  • 21:33 mobrovac@tin: Started deploy [eventstreams/deploy@cf892f4]: Bump service-runner to pick up new DNS caching
  • 21:29 mobrovac@tin: Finished deploy [citoid/deploy@7dbbac8]: Bump service-runner to pick up new DNS caching (duration: 03m 13s)
  • 21:27 awight: Finished migrating Fundraising jobs to process-controlb
  • 21:26 mobrovac@tin: Started deploy [citoid/deploy@7dbbac8]: Bump service-runner to pick up new DNS caching
  • 21:20 jynus: applying mariadb MDEV#7383 patch on db1034 T159319
  • 21:18 mutante: running puppet across labvirt10* to replace cert
  • 21:12 mutante: revoked old labvirt-star.eqiad.wmnet cert - created new csr, signed it (CA: wmf_ca_2014_2017). deploying new labvirt-star.eqiad valid for 720 days (T162085)
  • 20:48 catrope@tin: Synchronized php-1.29.0-wmf.19/extensions/Echo/: T162173 (duration: 00m 43s)
  • 20:00 paravoid: rolling out a border-in4 ACL update across core routers (T160055)
  • 19:17 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.19
  • 18:57 awight: enabled pilot process-control job: banner history queue consumer
  • 18:55 demon@tin: Synchronized php: symlink repoint (duration: 00m 39s)
  • 18:55 awight: disabled banner history queue consumer
  • 18:51 demon@tin: Finished scap: wmf.19 bootstrap (duration: 35m 16s)
  • 18:16 demon@tin: Started scap: wmf.19 bootstrap
  • 17:53 andrewbogott: disabling puppet on labvirts to roll out a nova config change
  • 17:40 volans: stopped ircecho to avoid IRC spam
  • 16:03 hoo: Updated the Wikidata property suggester with data from last Monday's JSON dump and applied the T132839 workarounds
  • 15:59 elukey: reimage analytics1052 (Hadoop Journal node) to Debian Jessie
  • 15:59 jynus: running ANALYZE on revision table on eswiki, cawiki on db1034
  • 15:56 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1034 for maintenance (duration: 00m 44s)
  • 14:39 moritzm: rebooting praseodymium to Linux 4.9
  • 14:34 moritzm: rebooting xenon to Linux 4.9
  • 14:27 moritzm: rebooting cerium to Linux 4.9
  • 14:06 elukey: reimage analytics1039 and 1051 to Debian Jessie
  • 13:11 akosiaris: add LVS IPs to the url-downloader blacklist now that nodejs services no longer require it. See https://gerrit.wikimedia.org/r/207490
  • 13:09 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: BETA ONLY Enable interwikisorting on BETA wiktionaries (duration: 00m 44s)
  • 13:05 moritzm: installing ca-certificates updates from jessie point update
  • 13:00 ema: cache_upload: ban all objects with content-type ~ "^text" T162035
  • 12:19 ema: upgrade cp2003 to linux 4.9 T162029
  • 11:58 moritzm: installing e2fsprogs update from jessie point update
  • 11:53 elukey: reimage analytics10[36,37,38] to Debian Jessie
  • 11:46 marostegui: Deploy schema change db2061 (s7) - T160390
  • 11:46 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2061 - T160390 (duration: 00m 44s)
  • 11:39 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2068 - T160390 (duration: 00m 58s)
  • 09:40 moritzm: rebooting wtp1001 to Linux 4.9
  • 09:10 volans: restarted swiftrepl (repl_all.sh loop) on ms-fe1005
  • 08:47 moritzm: rebooting mw1265 to Linux 4.9
  • 08:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1015 - T159319 (duration: 00m 45s)
  • 07:54 moritzm: rebooting bast2001 to Linux 4.9
  • 07:35 elukey: reimage analytics103[234] to Debian Jessie
  • 06:43 marostegui: Deploy alter table on db2019 (codfw s4 master) - this will generate lag on codfw for s4 - T161683
  • 06:35 marostegui: Deploy schema change db2068 (s7) - T160390
  • 06:34 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2068 - T160390 (duration: 00m 44s)
  • 06:27 marostegui: Deploy schema change db1015 (s3) - https://phabricator.wikimedia.org/T159319
  • 02:39 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Apr 4 02:39:47 UTC 2017 (duration 5m 28s)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 14m 27s)
  • 01:31 reedy@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Disable LoginNotify on wikis that have no Echo T158878 (duration: 00m 44s)
  • 00:45 mutante: install1002/2002: sudo -i reprepro --delete clearvanished to remove precise distro after merging gerrit:345550

2017-04-03

  • 23:54 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Deploy ParserMigration extension T141586 (for real) (duration: 00m 44s)
  • 23:41 thcipriani@tin: Finished scap: SWAT: Deploy ParserMigration extension T141586 (l10nupdate only) (duration: 22m 24s)
  • 23:19 thcipriani@tin: Started scap: SWAT: Deploy ParserMigration extension T141586 (l10nupdate only)
  • 23:10 thcipriani@tin: Synchronized wmf-config: SWAT: Test LoginNotify on Beta cluster T158878 (duration: 00m 46s)
  • 22:39 volans: completed restart of swift-proxies in eqiad, ms-fe1005 was missing due to swiftrepl stuck/running
  • 22:37 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=ms-fe1005.eqiad.wmnet
  • 22:35 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,name=ms-fe1005.eqiad.wmnet
  • 22:06 mutante: power cycling lvs2002, it was down and console showed nothing
  • 20:47 bsitzmann@tin: Finished deploy [mobileapps/deploy@20ab197]: Update mobileapps to fdd4e31 (duration: 03m 05s)
  • 20:44 bsitzmann@tin: Started deploy [mobileapps/deploy@20ab197]: Update mobileapps to fdd4e31
  • 19:21 hashar: Finished deployment of project-logos optimization for T161999 / https://gerrit.wikimedia.org/r/#/c/346057/ . And purged the related logos
  • 19:18 hashar@tin: Synchronized static/images/project-logos: Optimize a few project logos - T161999 (duration: 00m 44s)
  • 19:16 andrewbogott: in testlabs, deleted ou=projects,dc=wikimedia,dc=org and ou=roles,dc=wikimedia,dc=org as per T126758
  • 19:15 mutante: phabricator/ops: adding ayounsi to WMF-NDA (project 61) and acl*operations-team (project 29) (T162073)
  • 18:37 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Configure Babel for elwikisource (T161593) (duration: 00m 44s)
  • 18:29 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Convert reference lists to 'responsive' on hewiki (T161804) (duration: 00m 52s)
  • 17:02 gehel@tin: Finished deploy [wdqs/wdqs@d7c367a]: (no justification provided) (duration: 01m 29s)
  • 17:01 gehel@tin: Started deploy [wdqs/wdqs@d7c367a]: (no justification provided)
  • 15:43 hoo: Updated email for "Lucie Kaffee" on wikitech from work address (wikimedia.de) to known volunteer address (upon request)
  • 14:54 marostegui: Deploy alter table to unify revision table across all the s3 wikis on db1015 - T159319
  • 14:49 ema: cache_upload: ban all objects with content-type: text/html T162035
  • 14:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1015 - T159319 (duration: 00m 44s)
  • 14:26 ariel@tin: Finished deploy [dumps/dumps@905a845]: fix stub recombines, broken by too aggressive 'cleanup' of local vars (duration: 00m 02s)
  • 14:26 ariel@tin: Started deploy [dumps/dumps@905a845]: fix stub recombines, broken by too aggressive 'cleanup' of local vars
  • 14:23 cwd: restarted jenkins to stop ArrayIndexOutOfBoundsException error
  • 14:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 - T160390 (duration: 00m 51s)
  • 13:38 zfilipin@tin: Synchronized php-1.29.0-wmf.18/extensions/cldr/: SWAT: Translate Atikamekw language name in French (duration: 00m 51s)
  • 13:29 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add NS100 (Portal) to ladwiki, Add rollback user group in fawikisource (duration: 00m 47s)
  • 13:27 hashar: terbium: scap pull for ladwiki namespace additions
  • 13:15 moritzm: upgrading restbase-dev* to Linux 4.9
  • 13:09 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: enwiki: Temporarily disable Wikidata descriptions (T161805) (duration: 00m 45s)
  • 12:37 elukey: reimage analytics10[29,30,31] to Debian Jessie
  • 12:28 ema: banning 200px-Status_iucn3.1_LC_cs.svg.png from esams frontends T162035
  • 11:49 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=ms-fe1006.eqiad.wmnet
  • 11:45 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,name=ms-fe1006.eqiad.wmnet
  • 11:35 joal@tin: Finished deploy [analytics/refinery@cc73c40]: (no justification provided) (duration: 07m 23s)
  • 11:31 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=ms-fe1008.eqiad.wmnet
  • 11:28 joal@tin: Started deploy [analytics/refinery@cc73c40]: (no justification provided)
  • 11:28 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,name=ms-fe1008.eqiad.wmnet
  • 11:21 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=ms-fe1007.eqiad.wmnet
  • 11:18 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,name=ms-fe1007.eqiad.wmnet
  • 11:08 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=ms-fe1006.eqiad.wmnet
  • 11:05 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,name=ms-fe1006.eqiad.wmnet
  • 11:04 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=ms-fe1005.eqiad.wmnet
  • 10:43 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,name=ms-fe1005.eqiad.wmnet
  • 10:38 volans: upgrading swift-proxy in eqiad to use discovery URLs
  • 08:46 marostegui: Deploy alter table db1086 (s7) on revision table to unify PK and indexes - T160390
  • 08:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 - T160390 (duration: 00m 44s)
  • 07:39 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1261.eqiad.wmnet
  • 07:25 marostegui: Deploy alter table dbstore2001 (s7) on revision table to unify PK and indexes - T160390
  • 07:25 _joe_: rebooting copper to clean up at least partially the docker mess
  • 07:14 moritzm: switched default kernel for jessie installations to Linux 4.9
  • 07:06 _joe_: removing stale files on copper for docker, all local images will be wiped away
  • 07:03 moritzm: installing gnutls security updates on trusty
  • 06:51 marostegui: Deploy InnoDB compression on dewiki - db1070 - T150438
  • 06:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1070 to compress it - T153743 (duration: 00m 44s)
  • 06:40 _joe_: manually restarted replication for etcd
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1057 entry - T160435 (duration: 00m 44s)
  • 06:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1057 entry - T160435 (duration: 00m 54s)
  • 06:12 marostegui: Remove partitions from metawiki.pagelinks (s7) on codfw master (db2029) this will generate lag on codfw - T153300
  • 05:59 marostegui: Resume pt-table-checksum on wikidata - T161294
  • 05:53 _joe_: powercycling mw2256, unresponsive to ping, blank console
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 09m 39s)

2017-04-02

  • 08:18 elukey: powercycle ms-be1016 (stuck in console, answers pings but not ssh)
  • 07:25 ariel@tin: Finished deploy [dumps/dumps@1ac3fb3]: var/method name cleanups, refactor, pregenerate page ranges for page content jobs, auto retry of failed page ranges (duration: 00m 03s)
  • 07:25 ariel@tin: Started deploy [dumps/dumps@1ac3fb3]: var/method name cleanups, refactor, pregenerate page ranges for page content jobs, auto retry of failed page ranges
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 09m 44s)

2017-04-01

  • 19:01 elukey: restart hhvm on mw1191 (dump debug in /tmp/hhvm.16619.bt.) - threads stuck in HPHP::Treadmill::getAgeOldestRequest
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Apr 1 02:37:30 UTC 2017 (duration 5m 20s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 13m 27s)

2017-03-31

  • 23:11 mutante: ruthenium: logrotate --force /etc/logrotate.d/parsoid (note this is existing file "parsoid" not new file "parsoid_testing") (T161920)
  • 20:18 elukey: stopping jobrunners on mw116[89] and restarting hhvm after https://gerrit.wikimedia.org/r/345881
  • 19:44 Reedy: Stop badge hacks from messing up the entire page on IE 11 on MonoBook T161869
  • 19:42 reedy@tin: Synchronized php-1.29.0-wmf.18/extensions/Echo: Stop badge hacks from messing up the entire page on IE 11 on MonoBook T161689 (duration: 00m 50s)
  • 19:16 mutante: ruthenium also deleting ancient "htmldumper" data, gwicke confirmed it's not needed anymore
  • 18:27 mutante: ruthenium mounting /dev/mapper/ruthenium--vg-tank into /srv/visualdiff/pngs | deleted "mysql" and "dumps" data that was on the previously unmounted partition, subbu checked that it wasn't needed anymore, we still need logrotate (T161920)
  • 18:14 mutante: ruthenium mounting /dev/mapper/ruthenium--vg-tank which wasn't used at all.. bam.. over 477GB of free space
  • 16:02 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,name=ms-fe2008.codfw.wmnet
  • 16:00 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,name=ms-fe2008.codfw.wmnet
  • 15:59 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,name=ms-fe2007.codfw.wmnet
  • 15:57 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,name=ms-fe2007.codfw.wmnet
  • 15:55 mobrovac@tin: Finished deploy [trending-edits/deploy@26b5eb4]: Config change: lower min_edits to 15 T160127 (duration: 06m 37s)
  • 15:55 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,name=ms-fe2006.codfw.wmnet
  • 15:52 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,name=ms-fe2006.codfw.wmnet
  • 15:49 mobrovac@tin: Started deploy [trending-edits/deploy@26b5eb4]: Config change: lower min_edits to 15 T160127
  • 15:44 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,name=ms-fe2005.codfw.wmnet
  • 15:22 volans@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=imagescaler-rw,name=eqiad
  • 15:17 volans@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=imagescaler-rw,name=eqiad
  • 15:02 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,name=ms-fe2005.codfw.wmnet
  • 15:01 oblivian@puppetmaster1001: conftool action : set/ttl=300; selector: dnsdisc=restbase-async
  • 15:01 oblivian@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=restbase-async,name=eqiad
  • 14:58 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=restbase-async,name=eqiad
  • 14:56 oblivian@puppetmaster1001: conftool action : set/ttl=10; selector: dnsdisc=restbase-async
  • 14:55 _joe_: reducing ttl on the restbase-async discovery record, then flipping eqiad to active
  • 14:55 volans: deploying the use of discovery URL to swift-proxy hosts in codfw T160178#3136906
  • 14:09 _joe_: performing a rolling restart of changeprop after puppet runs on scb*
  • 14:00 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=restbase-async,name=codfw
  • 13:23 elukey: restart hhvm on mw116[89] after https://gerrit.wikimedia.org/r/345829
  • 13:19 gehel: rolling restart of maps-test cluster for kernel upgrade
  • 13:09 moritzm: rebooting bromine to Linux 4.9
  • 12:10 moritzm: rebooting mwdebug* to Linux 4.9
  • 12:05 moritzm: rebooting pybal-test* to Linux 4.9
  • 11:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 after maintenance (duration: 00m 49s)
  • 10:47 akosiaris: uploaded jessie-wikimedia kubernetes_1.4.6-4 on apt.wikimedia.org/jessie-wikimedia
  • 09:59 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2244.codfw.wmnet
  • 09:56 elukey: set pooled=yes mw210[56789], mw2260 and mw2213 (and cleaned up old /srv/mediawiki dirs that were causing rsync spam in scap pull)
  • 09:52 marostegui: Adding rev_timestamp index to revision page db1066 (s1) - T132416
  • 09:49 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 for maintenance (duration: 00m 44s)
  • 09:47 elukey: restart hhvm on mw1197 - hhvm dump debug in /tmp/hhvm.14540.bt. - threads stuck in Treadmill::getAgeOldestRequest (HHVM 3.12)
  • 09:37 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1062 (duration: 00m 45s)
  • 09:35 godog: fix long-standing swift-account-server REPLICATE backtrace error on ms-be1022 - https://bugs.launchpad.net/swift/+bug/1424108
  • 09:21 godog: delete stray nginx error log with debug logging on thumbor1002
  • 08:28 moritzm: repooled mw1261 for more HHVM 3.18 debugging
  • 07:29 marostegui: Start pt-table-checksum on s5 wikidatawiki - T161294
  • 02:44 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Mar 31 02:44:11 UTC 2017 (duration 5m 30s)
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 15m 54s)

2017-03-30

  • 23:58 aude@tin: Synchronized php-1.29.0-wmf.18/extensions/Wikidata: Fixes for special pages (duration: 02m 15s)
  • 23:07 catrope@tin: Synchronized php-1.29.0-wmf.18/extensions/ORES/modules/: T161706 (duration: 00m 51s)
  • 21:34 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 -> wmf.18
  • 21:16 awight: reenabled ingenico orphan rectifier (jenkins)
  • 21:08 awight: disable ingenico orphan rectifier (jenkins)
  • 21:07 demon@tin: Synchronized php-1.29.0-wmf.18/extensions/Echo/includes/model/Event.php: fix logging class reference (duration: 00m 47s)
  • 19:25 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.18
  • 18:59 MaxSem: Portals were not deployed: https://phabricator.wikimedia.org/T161832
  • 18:54 maxsem@tin: Synchronized portals/: (no justification provided) (duration: 00m 48s)
  • 18:45 maxsem@tin: Synchronized portals: (no justification provided) (duration: 00m 44s)
  • 18:45 maxsem@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 43s)
  • 18:38 maxsem@tin: Synchronized portals: (no justification provided) (duration: 00m 44s)
  • 18:29 maxsem@tin: Synchronized portals: (no justification provided) (duration: 00m 45s)
  • 18:28 maxsem@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 44s)
  • 18:24 maxsem@tin: Synchronized portals: (no justification provided) (duration: 00m 44s)
  • 18:24 godog: swift eqiad-prod add ms-be1028 -> ms-be1039 - T160640
  • 18:23 maxsem@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 44s)
  • 18:18 maxsem@tin: Synchronized portals: (no justification provided) (duration: 00m 44s)
  • 18:17 maxsem@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 45s)
  • 17:32 elukey: shutdown analytics1039 to apply new thermal paste - T132256
  • 16:17 godog: upgrade thumbor to 0.1.37 on thumbor100[12]
  • 16:03 _joe_: restarting hhvm on mw1191, stuck in HPHP::Treadmill::getAgeOldestRequest
  • 15:59 twentyafterfour@tin: Synchronized php-1.29.0-wmf.17/includes/: sync I7c5c0a refs T159319 (duration: 01m 41s)
  • 14:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1057 entry from s1 shard - T160435 (duration: 00m 44s)
  • 14:38 godog: run stress test (w/ bonnie) on new swift hw - T160640
  • 14:33 andrewbogott: upgrading nova-compute to 12.0.6 on all labvirts
  • 14:33 moritzm: rebooting restbase2001 to Linux 4.9
  • 14:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1090 - T17441 (duration: 00m 45s)
  • 14:22 kaldari@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 48s)
  • 14:21 kaldari: sync InitialiseSettings.php to enable cookie blocking on English Wikipedia
  • 14:06 oblivian@tin: Synchronized wmf-config/ProductionServices.php: switch to discovery for cxserver,eventbus (duration: 00m 43s)
  • 14:01 oblivian@tin: Synchronized wmf-config/ProductionServices.php: switch to discovery for some records (duration: 00m 47s)
  • 13:46 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow eliminators and autoreviewers to move a file on ptwiki (T161532) Assign move-categorypages to sysops&bots only on nlwiki (T161551) Enable Multimedia Viewer at officewiki (T160420) (duration: 00m 44s)
  • 13:19 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: [cleanup] Remove expired rules (T161530) (duration: 00m 45s)
  • 12:49 moritzm: rebooting bast4001 for kernel update to 4.9
  • 12:18 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=eventbus,name=codfw
  • 12:15 hoo: Updated the Constraints table on Wikidata, per T160506.
  • 12:02 moritzm: installing glibc security updates on trusty
  • 11:55 moritzm: installing jbig2dec security updates
  • 10:03 moritzm: repooling mw1261 for additional test
  • 09:48 root@tin: Synchronized wmf-config/db-eqiad.php: Uniform maintenance message and indentation (duration: 00m 47s)
  • 09:34 root@tin: Synchronized wmf-config/db-codfw.php: Uniform maintenance message and indentation (duration: 00m 44s)
  • 09:06 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1261.eqiad.wmnet
  • 09:05 elukey: depooling mw1261 (hhvm-dump-debug in /tmp/hhvm.98736.bt.)
  • 08:38 moritzm: repooling mw1261 to reproduce hhvm deadlock with higher debug level
  • 08:13 marostegui: Convert UNIQUE keys to PK on db1090 (s2) - T17441
  • 08:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 - T17441 (duration: 00m 44s)
  • 07:43 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2021.codfw.wmnet
  • 07:41 gehel: pull elastic2021 back into active duty - T149006
  • 07:05 ema: upgrading twisted to 16.2.0 on lvs100[123] (eqiad primaries) T160433
  • 06:45 moritzm: installing apparmor security updates on trusty
  • 06:25 marostegui: Logging backwards for the record: restart mysql on db1047 for maintenance - T160454
  • 05:56 marostegui: Deploy schema change on db2014 - codfw master (this will generate lag on codfw) - T73563
  • 05:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 - T17441 (duration: 00m 45s)
  • 05:51 ema: upgrading twisted to 16.2.0 on lvs100[456] (eqiad secondaries) T160433
  • 03:03 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Mar 30 03:03:31 UTC 2017 (duration 5m 49s)
  • 02:57 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 13m 43s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 07m 20s)
  • 01:52 twentyafterfour: phd fixed on iridium. libphutil was out of sync with phd source
  • 01:11 twentyafterfour: running `puppet agent --test` on iridium
  • 00:10 twentyafterfour: Phabricator update completed.
  • 00:10 mutante: ruthenium low on disk space, because /srv/visualdiff/pngs (parsoid-vd-tests) is pretty large and /srv isn't a separate mount
  • 00:06 twentyafterfour: updating phabricator on iridium
  • 00:04 mutante: ruthenium - apt-get clean gets a little more disk space

2017-03-29

  • 23:46 reedy@tin: Synchronized php-1.29.0-wmf.18/extensions/Quiz: Fix undefined variable stateObject T161735 (duration: 00m 49s)
  • 23:43 reedy@tin: Synchronized wmf-config/CommonSettings.php: Don't use EP_NS in CommonSettings (duration: 00m 44s)
  • 21:23 krinkle@tin: Synchronized errorpages/: I15295835a1a (duration: 00m 44s)
  • 20:56 thcipriani@tin: Synchronized php-1.29.0-wmf.18/extensions/ProofreadPage/includes/page/ProofreadPagePage.php: findIndexTitle T161734 (duration: 00m 46s)
  • 20:35 halfak@tin: Finished deploy [ores/deploy@554ea12]: T160638 (duration: 18m 40s)
  • 20:31 arlolra: Updated Parsoid to b1b27146 (T161558, T160207, T153798)
  • 20:21 arlolra@tin: Finished deploy [parsoid/deploy@bc798dc]: Updating Parsoid to b1b27146 (duration: 07m 26s)
  • 20:16 halfak@tin: Started deploy [ores/deploy@554ea12]: T160638
  • 20:13 arlolra@tin: Started deploy [parsoid/deploy@bc798dc]: Updating Parsoid to b1b27146
  • 20:09 ppchelko@tin: Finished deploy [changeprop/deploy@ef62908]: Fix metrics for regex topics (duration: 00m 56s)
  • 20:08 ppchelko@tin: Started deploy [changeprop/deploy@ef62908]: Fix metrics for regex topics
  • 19:46 ppchelko@tin: Finished deploy [changeprop/deploy@1150cf5]: Config: Enabling regex-based topic subscription (duration: 01m 45s)
  • 19:44 ppchelko@tin: Started deploy [changeprop/deploy@1150cf5]: Config: Enabling regex-based topic subscription
  • 19:16 awight: re-run today's ingenico audit job
  • 19:15 awight: pick at paypal scab: re-run audit parser
  • 19:10 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 back to 1.29.0-wmf.17
  • 19:05 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.18
  • 18:45 bblack: varnish active/active deploy done ( https://gerrit.wikimedia.org/r/#/c/339667/ ) - all caches running the new code, puppet re-enabled, etc.
  • 18:43 hoo: Started a Wikidata TTL dump run on snapshot1007 using Zend (due to T161695).
  • 18:22 catrope@tin: Synchronized php-1.29.0-wmf.18/includes/page/WikiPage.php: T159319 (duration: 00m 44s)
  • 18:22 catrope@tin: Synchronized php-1.29.0-wmf.18/includes/Title.php: T159319 (duration: 00m 46s)
  • 18:10 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Save->Publish on Wikipedias except dewiki and enwiki (T131132); set wgOOUIEditPage false everywhere (duration: 00m 57s)
  • 17:58 awight: disabling PayPal audit parser
  • 17:56 ppchelko@tin: Finished deploy [changeprop/deploy@e4547cd]: Support regexed topics (duration: 00m 55s)
  • 17:55 ppchelko@tin: Started deploy [changeprop/deploy@e4547cd]: Support regexed topics
  • 17:31 godog: remove ge-3/0/27 from interface-range labs-instance-ports (now for ms-be1031)
  • 17:25 bblack: puppet disabled on all cp* ahead of careful deploy for https://gerrit.wikimedia.org/r/#/c/339667/
  • 17:12 mutante: removing parsoid-tests.wikimedia.org from DNS - replaced by more specific parsoid-rt-tests and parsoid-vd-tests
  • 17:12 nuria@tin: Finished deploy [eventlogging/analytics@2874077]: (no justification provided) (duration: 00m 03s)
  • 17:12 nuria@tin: Started deploy [eventlogging/analytics@2874077]: (no justification provided)
  • 17:11 elukey: restarting nginx on eqiad appservers to pick up the new certs
  • 16:55 marostegui: Stop eventlog syncs to db1047 and dbstore1002 for maintenance - T160454
  • 16:53 marostegui: Disable puppet on db1047 and dbstore1002 for maintenance - T160454
  • 16:51 elukey: upgrading ssl cert appservers.svc.eqiad.wmnet to include the new discovery endpoints
  • 16:51 _joe_: actually performing the parsoid rolling restart in codfw
  • 16:31 _joe_: rolling restart of parsoid in codfw
  • 14:32 moritzm: installing apparmor security updates on trusty
  • 14:31 elukey: upgrading ssl cert api.svc.eqiad.wmnet to include the new discovery endpoints
  • 14:14 andrewbogott: disabling puppet on labs hosts for a staged rollout of https://gerrit.wikimedia.org/r/#/c/345275/
  • 14:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1091 - T17441 (duration: 00m 44s)
  • 13:49 elukey: upgrading ssl cert rendering.svc.eqiad.wmnet to include the new discovery endpoints
  • 13:08 reedy@tin: Synchronized wmf-config/CommonSettings.php: use wfLoadExtension for VisualEditor (duration: 00m 44s)
  • 12:53 elukey: reimage analytics1045 to Debian Jessie
  • 12:52 _joe_: depooling wtp1001 to test puppet/confd transfer of responsibilities
  • 11:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 - T17441 (duration: 00m 44s)
  • 11:30 hoo: Started a Wikidata JSON dump run on snapshot1007 using Zend (due to T161695).
  • 11:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 - T17441 (duration: 00m 44s)
  • 11:03 elukey: upgrading ssl cert appservers.svc.codfw.wmnet to include the new discovery endpoints
  • 11:01 moritzm: Linux 4.9 uploaded for jessie-wikimedia (along with new meta package linux-meta-4.9 and updated firmware)
  • 11:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 - T17441 (duration: 00m 44s)
  • 10:27 godog: reimage netmon1001 with jessie
  • 10:12 ema: emptying /srv/log/parsoid/main.log.1 (3.2G!) on ruthenium to reclaim some disk space
  • 10:11 elukey: upgrading ssl cert api.svc.codfw.wmnet to include the new discovery endpoints
  • 09:39 ema: upgrading twisted to 16.2.0 on lvs200[123] (codfw primaries) T160433
  • 08:54 ema: upgrading twisted to 16.2.0 on lvs200[456] (codfw secondaries) T160433
  • 08:39 ema: apt.w.o: set digest-algo to sha256 in gpg.conf T132325
  • 08:29 elukey: upgrading ssl cert rendering.svc.codfw.wmnet to include the new discovery endpoints
  • 07:57 marostegui: Convert s6 UNIQUE keys into PK on db1093 - T17441
  • 07:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 - T17441 (duration: 00m 54s)
  • 06:01 marostegui: Keep converting UNIQUE keys to PK on s4 - db1091 - T17441
  • 03:15 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Mar 29 03:15:16 UTC 2017 (duration 5m 53s)
  • 03:09 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 14m 55s)
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 13m 41s)
  • 01:41 krinkle@tin: Synchronized errorpages/404.php: Match 404.html and default.html - Id58e25afbe (duration: 00m 44s)
  • 01:16 mutante: rsyncing librenms/torrus/smokeping app data from netmon1001 to gerrit2001. adding alias "syncit" to do it all at once (T125020)
  • 00:57 paravoid: Removing upload.wikimedia.org/index.html ("swift delete root index.html") from both eqiad/codfw

2017-03-28

  • 23:22 thcipriani@tin: Synchronized php-1.29.0-wmf.17/extensions/NavigationTiming/modules/ext.navigationTiming.js: SWAT: ext.NavigationTiming: Restore unsampled Save Timing T161368 (duration: 00m 45s)
  • 23:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable header version 2 on all wikis T160471 (duration: 00m 45s)
  • 22:45 urandom: T111113: Restarting Cassandra instances, eqiad row 'd' Done
  • 22:21 mutante: DNS - creating new language "dty" (T161529) - running "authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones" to trigger re-creation of zone files after change in langs.tmpl. (gerrit:345077) | https://www.ethnologue.com/language/dty
  • 22:19 mutante: DNS - creating new language "dty" (T160865) - running "authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones" to trigger re-creation of zone files after change in langs.tmpl. (gerrit:345077) | https://www.ethnologue.com/language/dty
  • 21:55 urandom: T111113: Restarting Cassandra instances, eqiad row 'd'
  • 21:55 urandom: T111113: Restarting Cassandra instances, eqiad row 'b' Done
  • 21:18 andrewbogott: upgraded nova-compute on labvirt1014 because it contains a long-awaited bugfix
  • 21:08 urandom: T111113: Restarting Cassandra instances, eqiad row 'b'
  • 21:08 urandom: T111113: Restarting Cassandra instances, eqiad row 'a' Done
  • 20:24 mutante: ms-fe1001 through ms-fe1004 - scheduled last downtime for host and services in icinga - shutdown -h now, turn them off, revoke puppet certs, salt-keys... (T160986)
  • 20:22 mutante: mc1019 - puppet failing for 4 days due to failed resource /etc/redis/replica
  • 20:21 urandom: T111113: Restarting Cassandra instances, eqiad row 'a'
  • 20:21 mutante: copper - puppet errors due to Failed resource /var/lib/docker/devicemapper ??
  • 20:19 mutante: mwdebug1002 - same, was low on disk space, 'apt-get clean' freed > 3GB
  • 20:18 mutante: mwdebug1001 - was low on disk space, 'apt-get clean' - freed about 4GB
  • 20:15 mutante: mw1261 - depooled
  • 20:14 mutante: mw1261 runs with HHVM 3.18 - which seems to have a bug leading to a deadlock every 4-5 hours
  • 20:14 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.18
  • 20:13 mutante: mw1261 HHVM crash as predicted by Moritz - ran sudo hhvm-dump-debug. Backtrace saved as /tmp/hhvm.79460.bt.
  • 20:06 mutante: ms-fe100[1-4] - disable/stop puppet, stop salt minion, decom (T160986)
  • 19:57 thcipriani@tin: Finished scap: testwiki to php-1.29.0-wmf.18 and rebuild l10n cache (duration: 40m 19s)
  • 19:37 mobrovac: restbase deploying d477f495
  • 19:33 urandom: T111113: Restarting Cassandra instances, codfw row 'd' Done
  • 19:17 thcipriani@tin: Started scap: testwiki to php-1.29.0-wmf.18 and rebuild l10n cache
  • 18:45 urandom: T111113: Restarting Cassandra instances, codfw row 'd'
  • 18:44 urandom: T111113: Restarting Cassandra instances, codfw row 'c' Done
  • 18:18 ppchelko@tin: Finished deploy [changeprop/deploy@1689d86]: Rename event field in logs (duration: 00m 52s)
  • 18:18 ppchelko@tin: Started deploy [changeprop/deploy@1689d86]: Rename event field in logs
  • 17:53 urandom: T111113: Restarting Cassandra instances, codfw row 'c'
  • 17:22 thcipriani: starting branch cut for 1.29.0-wmf.18
  • 17:07 godog: swift codfw-prod: bump ms-be2028 ms-be2039 object weight to 3000 - T158337
  • 17:06 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2021.codfw.wmnet
  • 16:39 urandom: T111113: Restarting remaining Cassandra instances, rack 'b', codfw (restbase20{02,07,10})
  • 16:19 urandom: T111113: Restarting Cassandra on restbase2001 to apply mandatory client encryption (canary)
  • 15:56 gehel: banning elastic2021 to run same tests as elastic2020 - T149006
  • 14:41 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2256.codfw.wmnet
  • 14:40 marostegui: Convert UNIQUE keys into PK on db1091 (commonswiki) - T17441
  • 14:38 ppchelko@tin: Finished deploy [changeprop/deploy@bfbaa17]: Increase log level for processing failures (duration: 01m 07s)
  • 14:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 - T17441 (duration: 00m 43s)
  • 14:38 elukey: ran restart-hhvm on mw1242, hhvm threads stuck (dump debug in /tmp/hhvm.9008.bt.) - HHVM 3.12
  • 14:37 ppchelko@tin: Started deploy [changeprop/deploy@bfbaa17]: Increase log level for processing failures
  • 13:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 - T17441 (duration: 00m 43s)
  • 13:44 elukey: started hhvm on mw1261 (still depooled) - no hhvm process running
  • 13:29 RoanKattouw: Ran initUserPreference.php -s ores-enabled -t rcenhancedfilters and -s ores-enabled -t oresHighlight on plwiki and ptwiki
  • 13:22 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RCFilters beta feature on plwiki and ptwiki T158336 (duration: 00m 43s)
  • 12:58 moritzm: depooled mw1261
  • 10:39 ema: upgrading twisted to 16.2.0 on lvs3001 and lvs3002 (esams primaries) T160433
  • 10:36 ema: upgrading twisted to 16.2.0 on lvs3003 and lvs3004 (esams secondaries) T160433
  • 10:27 marostegui: Convert dewiki UNIQUE keys into PK on db1092 - https://phabricator.wikimedia.org/T17441
  • 10:15 elukey: Switching hue.w.o's backend (cache misc) from analytics1027 to thorium - T159527
  • 10:10 moritzm: upgraded mw1262 to HHVM 3.18
  • 08:48 marostegui: Convert wikidatawiki UNIQUE keys into PK on db1092 - T17441
  • 08:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 - T17441 (duration: 00m 44s)
  • 08:29 akosiaris: enable IGMP snooping on all VLANs on asw2-d-eqiad. T133387
  • 07:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T17441 (duration: 00m 43s)
  • 07:18 moritzm: installing eject security updates on trusty hosts
  • 06:11 marostegui: Keep converting unique keys into PK on db1089 - T17441
  • 06:01 marostegui: Deploy schema change on s2.enwiktionary.templatelinks - on codfw master, this will generate lag on codfw slaves (which have been silenced) - T154097
  • 05:52 marostegui: Run pt-table-checksum on es2 - T161510
  • 02:39 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Mar 28 02:39:53 UTC 2017 (duration 5m 28s)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 12m 37s)
  • 00:36 reedy@tin: Synchronized private: Remove mwblocker.log (duration: 00m 44s)
  • 00:34 reedy@tin: Synchronized wmf-config/CommonSettings.php: Remove $wgProxyList (duration: 00m 43s)

2017-03-27

  • 23:06 ebernhardson@tin: Synchronized php-1.29.0-wmf.17/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: SWAT T160006, turning off cirrussearch AB test for sistersearch (duration: 00m 44s)
  • 22:00 bawolff: deployed patch T151735
  • 21:27 andrewbogott: disabling puppet on labvirt* and labcontrol* to stagger roll out of https://gerrit.wikimedia.org/r/#/c/344689/
  • 20:29 arlolra: Updated Parsoid to 6eaad376 (T160599, T161178, T133267)
  • 20:21 arlolra@tin: Finished deploy [parsoid/deploy@371ba4f]: Updating Parsoid to 6eaad376 (duration: 07m 06s)
  • 20:14 arlolra@tin: Started deploy [parsoid/deploy@371ba4f]: Updating Parsoid to 6eaad376
  • 19:57 mutante: ruthenium/varnish misc - remove parsoid-tests.wikimedia.org server_name / backend - replaced by parsoid-rt-test and parsoid-vd-tests
  • 19:56 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Linter: whitelist parsoid canaries too - https://gerrit.wikimedia.org/r/#/c/344998/ - T160573 (duration: 00m 44s)
  • 18:08 mobrovac@tin: Finished deploy [mobileapps/deploy@92f693c]: Remove the proxy from the config, deploying to scb2004 (duration: 00m 43s)
  • 18:07 mobrovac@tin: Started deploy [mobileapps/deploy@92f693c]: Remove the proxy from the config, deploying to scb2004
  • 18:05 mobrovac@tin: Finished deploy [mobileapps/deploy@92f693c]: Remove the proxy from the config (duration: 03m 29s)
  • 18:02 mobrovac@tin: Started deploy [mobileapps/deploy@92f693c]: Remove the proxy from the config
  • 17:19 mutante: tin/mira: welcome new mediawiki deployer 'musikanimal' (T161181)
  • 17:03 gehel@tin: Finished deploy [wdqs/wdqs@d07586c]: (no justification provided) (duration: 01m 26s)
  • 17:02 gehel@tin: Started deploy [wdqs/wdqs@d07586c]: (no justification provided)
  • 16:40 mobrovac: restbase deploying f53bec41
  • 16:34 _joe_: cleaned the bc cache on mw1261, restarted hhvm and repooled
  • 15:46 mobrovac@tin: Finished deploy [mobileapps/deploy@aed916b]: Add discovery.wmnet to no_proxy_list (duration: 04m 05s)
  • 15:42 mobrovac@tin: Started deploy [mobileapps/deploy@aed916b]: Add discovery.wmnet to no_proxy_list
  • 15:38 mobrovac@tin: Finished deploy [cxserver/deploy@40e86ad]: Add discovery.wmnet to no_proxy_list (duration: 02m 39s)
  • 15:35 mobrovac@tin: Started deploy [cxserver/deploy@40e86ad]: Add discovery.wmnet to no_proxy_list
  • 14:33 dcausse: rebuilding ttmserver index in elastic@codfw from wasat
  • 14:14 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Add autopatrolled group to svwiki (T161210) (duration: 00m 50s)
  • 13:40 dereckson@tin: Synchronized wmf-config/CommonSettings.php: no-op, to force resync (duration: 00m 43s)
  • 13:29 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Fix wgLogoHD 2.5x key (T161416) (duration: 00m 43s)
  • 13:26 dereckson@tin: Synchronized wmf-config/CirrusSearch-common.php: [cirrus] enable more accurate regex timeout (T161095) (duration: 00m 44s)
  • 13:22 moritzm: repooled mw1261 (now that fix for lcfirst() issue from T161095 is deployed)
  • 13:20 dereckson@tin: Synchronized static/images/project-logos/: Add khw.wikipedia logos to static resources (T160865) (duration: 00m 43s)
  • 13:19 dereckson@tin: Synchronized wmf-config/: [es5 upgrade] step 5: restore normal operations (T157479, 2/2) (duration: 00m 49s)
  • 13:18 dereckson@tin: Synchronized tests/cirrusTest.php: [es5 upgrade] step 5: restore normal operations (T157479, 1/2) (duration: 00m 48s)
  • 13:08 dereckson@tin: Synchronized wmf-config/CirrusSearch-common.php: Updates and typo fixes to CirrusSearch-common.php (gerrit:344933) (duration: 00m 43s)
  • 12:47 marostegui: Run pt-table-checksum for a couple of hundred small wikis in es2 - T161510
  • 12:44 jynus: deploying semi-sync replication to all hosts on codfw T161007
  • 12:21 ladsgroup@tin: Synchronized php-1.29.0-wmf.17/extensions/Wikidata/vendor/composer/installed.json: Third try for Update Wikidata - fix term validation (T161263) Part III (duration: 00m 43s)
  • 12:19 ladsgroup@tin: Synchronized php-1.29.0-wmf.17/extensions/Wikidata/extensions/Wikibase/: Third try for Update Wikidata - fix term validation (T161263) Part II (duration: 01m 32s)
  • 12:19 godog: upgrade grafana to 4.2.0 on krypton T161193
  • 12:17 ladsgroup@tin: Synchronized php-1.29.0-wmf.17/extensions/Wikidata/composer.lock: Third try for Update Wikidata - fix term validation (T161263) Part I (duration: 00m 44s)
  • 12:16 Amir1: start of ladsgroup@tin:/srv/mediawiki-staging$ scap sync-file php-1.29.0-wmf.17/extensions/Wikidata/composer.lock 'Third try for Update Wikidata - fix term validation (T161263) Part I'
  • 12:02 _joe_: experimenting with cxserver config on scb2004
  • 12:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1053 T160415 - T73563 (duration: 01m 07s)
  • 11:51 marostegui: Deploy new index on db1040, s4 primary master table: commonswiki.image - T160415
  • 11:14 akosiaris: upgraded bacula-sd to 7.4.3+dfsg-1+sid1~bpo8+1 on heze as well
  • 11:03 akosiaris: performed bacula schema change on db1016 for database bacula
  • 11:00 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=ores
  • 10:59 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=.*-ro
  • 10:54 akosiaris: upgrade bacula director and storage daemon to 7.4.3
  • 10:47 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=(kartotherian|search)
  • 10:20 hashar: Restarting Jenkins to drop the Throttle Concurrent Builds plugin - T158596
  • 10:16 ladsgroup@tin: Synchronized php-1.29.0-wmf.17/extensions/Wikidata: Second try for Update Wikidata - fix term validation (T161263) (duration: 02m 05s)
  • 10:15 Amir1: start of ladsgroup@tin:/srv/mediawiki-staging/php-1.29.0-wmf.17$ scap sync-dir php-1.29.0-wmf.17/extensions/Wikidata "Second try for Update Wikidata - fix term validation (T161263)"
  • 09:56 _joe_: rolling restart of restbase in codfw to pick up the new parsoid config
  • 09:54 ladsgroup@tin: Synchronized php-1.29.0-wmf.17/extensions/Wikidata: Update Wikidata - fix term validation (T161263) (duration: 02m 22s)
  • 09:53 mforns@tin: Finished deploy [analytics/aqs/deploy@a5e1775]: (no justification provided) (duration: 01m 41s)
  • 09:52 Amir1: start of ladsgroup@tin:/srv/mediawiki-staging/php-1.29.0-wmf.17$ scap sync-dir php-1.29.0-wmf.17/extensions/Wikidata "Update Wikidata - fix term validation (T161263)"
  • 09:52 mforns@tin: Started deploy [analytics/aqs/deploy@a5e1775]: (no justification provided)
  • 09:35 mforns@tin: Finished deploy [analytics/aqs/deploy@80a9de4]: (no justification provided) (duration: 01m 49s)
  • 09:33 mforns@tin: Started deploy [analytics/aqs/deploy@80a9de4]: (no justification provided)
  • 08:42 jynus: deploying semisync replication to all hosts (eqiad and codfw) on s6 T161007
  • 08:38 hashar@tin: Synchronized php-1.29.0-wmf.17/languages/classes/LanguageKk.php: Check for string initialization in lcfirst() for HHVM 3.18 - T161095 (duration: 00m 52s)
  • 08:16 marostegui: Deploy alter tables on db1089 (depooled) for a bunch of tables to convert UNIQUE keys into PK for testing - T17441
  • 08:01 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2014 after maintenance (duration: 00m 43s)
  • 07:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T17441 (duration: 00m 45s)
  • 07:17 elukey@puppetmaster1001: conftool action : set/pooled=active; selector: name=mw2256.codfw.wmnet
  • 06:26 marostegui: Deploy alter table s4 (commonswiki) db1053 - https://phabricator.wikimedia.org/T73563 https://phabricator.wikimedia.org/T160415
  • 06:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1053 T160415 - T73563 (duration: 00m 56s)
  • 06:18 _joe_: disabling puppet on authdns while merging a dns change
  • 06:06 marostegui: Resume pt-table-checksum on dewiki - T161294
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Mar 27 02:26:53 UTC 2017 (duration 5m 25s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 08m 22s)

2017-03-26

  • 10:06 _joe_: restarting apache2 on puppetmaster2002, passenger probably stuck
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Mar 26 02:26:05 UTC 2017 (duration 5m 25s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 07m 20s)

2017-03-25

  • 22:28 legoktm@tin: Synchronized wmf-config/: No-op labs only changes https://gerrit.wikimedia.org/r/#/c/344788/ (duration: 00m 52s)
  • 20:08 Krinkle: Ran mwscript deleteEqualMessages.php on public wikis (T45917) - deleted 5 pages across 5 wikis
  • 13:43 dereckson@tin: Synchronized wmf-config/throttle.php: Add throttle rule for BordeauxJS (T161402) (duration: 00m 50s)
  • 03:15 Krinkle: Re-create optimised indexes for xhgui in mongodb on tungsten per https://github.com/perftools/xhgui/tree/v0.7.0#installation (lost after T161196)
  • 02:36 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Mar 25 02:36:46 UTC 2017 (duration 5m 27s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 12m 22s)

2017-03-24

  • 20:03 krinkle@tin: Synchronized php-1.29.0-wmf.17/StartProfiler.php: touch - T161286 - (symlink) (duration: 00m 42s)
  • 19:55 krinkle@tin: Synchronized wmf-config/StartProfiler.php: T161286 - include hostname (duration: 00m 49s)
  • 19:33 krinkle@tin: Synchronized wmf-config/StartProfiler.php: touch - T161286 - hhvm cache maybe? (duration: 00m 43s)
  • 18:10 ejegg: updated CiviCRM from d3c439f to b6c8f3e
  • 17:50 ebernhardson: restart elasticsearch on relforge100[12] to test reindex api over https
  • 15:27 jynus: running unscheduled ALTER TABLE on arbcom_cswiki.archive T104756
  • 13:47 moritzm: installing freetype security updates on trusty
  • 12:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 T160415 - T73563 (duration: 00m 44s)
  • 12:13 marostegui: Start first run of pt-table-checksum on s5 (dewiki) - T161294
  • 11:18 godog: upgrade grafana to 4.2.0 on labmon1001 - T161193
  • 09:39 godog: pool prometheus100[34] - T148408
  • 08:23 marostegui: Deploy schema change s4 db2019 (codfw master) - T160415
  • 08:01 ema: upgrading twisted to 16.2.0 on lvs4001 and lvs4002 (ulsfo primaries) T160433
  • 07:49 marostegui: Deploy schema change s4 on db1069 and db1056 - T160415 - T73563
  • 07:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1056, repool db1059 T160415 - T73563 (duration: 00m 43s)
  • 07:42 moritzm: installing git updates on trusty
  • 07:35 dcausse: cirrus: refresh comp suggest indices in elastic@codfw
  • 07:26 ema: upgrading twisted to 16.2.0 on lvs4003 and lvs4004 (ulsfo secondaries) T160433
  • 07:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1070, db1071 and db1082 - T137191 (duration: 00m 43s)
  • 06:10 Krinkle: Removing xhgui.results entries before 1-Dec-2016 finished. Running xhgui->command(compact=>results) now. T161196
  • 02:31 Krinkle: Reverted patch - https://gerrit.wikimedia.org/r/#/c/344569/
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 08m 35s)
  • 02:28 Krinkle: Reminder to incident doc writer: It was difficult to figure out what the last "real" patch was; the scap message for SAL is written manually (it does not say which commit in which repo), and git log contains noise from security patches. We need simple revert options from the flat git tree at /srv/mediawiki
  • 02:26 Krinkle: Reminder to incident doc writer: Logstash was (and is) not responsive; Kibana renders errors about logstash "Service unavailable"
  • 02:25 Krinkle: All apaches are back up
  • 02:24 krinkle@tin: Synchronized php-1.29.0-wmf.17/extensions/Wikidata: revert (duration: 02m 34s)
  • 02:24 MaxSem: Killed l10nupdate on tin, was blocking emergency pushes
  • 02:22 Krinkle: Hard-killed all l10nupdate processes and rm'ed scap lock
  • 02:11 Krinkle: Removing xhgui.results entries from before 1 December 2016 in MongoDB on tungsten (T161196)
  • 01:45 mutante: bacula - on helium, attempt to start bacula-director process, attempt to fix permissions on key files as codified in director.pp
  • 01:40 catrope@tin: Finished scap: Wikidata cherry-picks (with i18n) (duration: 25m 03s)
  • 01:15 catrope@tin: Started scap: Wikidata cherry-picks (with i18n)

2017-03-23

  • 23:26 Krinkle: Removing xhgui.results entries from before 1 June 2016 (T161196)
  • 23:12 thcipriani@tin: Synchronized php-1.29.0-wmf.17/extensions/ORES: SWAT: Stats: Invert "false" thresholds so they are correct T161250 (duration: 00m 52s)
  • 23:05 Pchelolo: update RESTBase to 2536b25c7 - eqiad
  • 22:56 Pchelolo: update RESTBase to 2536b25c7 - staging
  • 22:39 Pchelolo: update RESTBase to 2536b25c7 - codfw
  • 21:36 krinkle@tin: Synchronized wmf-config/StartProfiler.php: (no justification provided) (duration: 00m 53s)
  • 21:05 ejegg: rolled back payments-wiki to 9622a4b
  • 21:00 ejegg: updated payments from 9622a4b to bb956bf
  • 20:16 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.17
  • 19:34 thcipriani@tin: Synchronized php: Swap symlink for 1.29.0-wmf.17 (duration: 00m 43s)
  • 19:11 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.17
  • 18:51 bblack: systemctl enable+start of lldpd on cp2009, cp1051, cp1061 (mysteriously dead and disabled)
  • 18:16 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Remove exception on Other Projects sidebar for Dutch Wikipedia (T159634) (duration: 00m 47s)
  • 18:04 Pchelolo: update RESTBase to 9d2b393fb - production
  • 17:52 Pchelolo: update RESTBase to 9d2b393fb - staging
  • 16:46 mobrovac@tin: Started restart [parsoid/deploy@0c22f72]: (no justification provided)
  • 16:45 _joe_: reenabling puppet on all jobqueue redises
  • 16:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1082 weight - T137191 (duration: 00m 43s)
  • 16:22 hashar: Merged operations/puppet.git Jenkins job in a single one that runs tox then rake - T160923
  • 16:10 urandom: T111113: Live-hacking client encryption to be non-optional, to verify cqlsh encryption, restbase1007-a.eqiad.wmnet
  • 16:07 mobrovac: restbase deploy 752ca4b7
  • 15:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1082 weight - T137191 (duration: 00m 43s)
  • 15:32 moritzm: upgrading restbase-test* to Linux 4.9
  • 14:59 akosiaris: enabling and running puppet on rdb200X fleet in a rolling restart scheme
  • 14:59 akosiaris: disabled puppet on rdb* fleet
  • 14:56 andrewbogott: dist-upgrading labvirt1001 and rebooting it a few times
  • 14:22 moritzm: installing exim4 updates from jessie point release
  • 14:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 with low weight - T137191 (duration: 00m 48s)
  • 13:53 dcausse: cirrus: refreshing comp suggest indices in elastic@eqiad to measure times
  • 12:59 marostegui: Deploy schema change s4 on db1064 https://phabricator.wikimedia.org/T160415 - https://phabricator.wikimedia.org/T73563
  • 12:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1059, depool db1064 T160415 - T73563 (duration: 00m 43s)
  • 12:27 moritzm: installing libxml2 security updates
  • 12:21 marostegui: Deploy schema change s4 on labsdb1003 https://phabricator.wikimedia.org/T160415 - https://phabricator.wikimedia.org/T73563
  • 12:05 jynus: converting es2014 tables back to uncompressed InnoDB T129350
  • 11:08 godog: codfw-prod: bump ms-be2028 ms-be2039 object weight to 2000 T158337
  • 11:01 godog: pool prometheus200[34] / depool prometheus200[12] - T148408
  • 11:01 jynus@tin: Synchronized wmf-config/db-codfw.php: Pool all es2XXX servers, depool es2014 for maintenance (duration: 00m 43s)
  • 10:59 hashar: Actually restarting Jenkins for email plugins upgrades
  • 10:20 hashar: Jenkins jobs got slightly blocked because I forgot to cancel the shutdown when jobs had to run.
  • 09:58 hashar: Jenkins: upgrading plugins email-ext and mailer
  • 09:14 hashar: Jenkins upgrading SSH Slaves plugin. Might cause disruption in CI
  • 08:47 moritzm: repooled mw1261 now that T161095 is deployed
  • 08:29 marostegui: Stop MySQL db1070 for maintenance - T137191
  • 08:06 moritzm: installing audiofile security updates
  • 07:37 marostegui: Deploy schema change s4 on db1059 and labsdb1001 T160415 - T73563
  • 07:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1068, depool db1059 T160415 - T73563 (duration: 00m 43s)
  • 07:08 marostegui: Stop MySQL db1082 for maintenance - https://phabricator.wikimedia.org/T137191
  • 06:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 - T137191 (duration: 00m 44s)
  • 03:14 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Mar 23 03:14:05 UTC 2017 (duration 5m 47s)
  • 03:08 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 14m 39s)
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 13m 26s)
  • 01:46 mutante: added ottomata and milimetric to "wmf-deployers" in Gerrit web ui, both have existing (deployment resp. root) shell already (T161157)

2017-03-22

  • 22:59 RainbowSprinkles: gerrit: Quick service restart, picking up new config
  • 21:25 awight: reenabling Jenkins orphan rectifier job
  • 21:18 andrewbogott: rebooting labvirt1001 because it is being terrible. https://phabricator.wikimedia.org/T159835
  • 21:05 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: No-op, beta (duration: 00m 47s)
  • 21:00 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: No-op, beta (duration: 00m 43s)
  • 20:52 awight: disabling Ingenico orphan rectifier
  • 20:43 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: Group0 to php-1.29.0-wmf.17
  • 20:05 thcipriani@tin: Synchronized php-1.29.0-wmf.17/extensions/ZeroPortal/includes/ApiZeroPortal.php: Failure to parse json config should result in a usable error T161036 (duration: 00m 42s)
  • 20:04 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Remove redundant whitelist read list for grantswiki (duration: 00m 44s)
  • 19:55 thcipriani@tin: Synchronized php-1.29.0-wmf.17/extensions/Flow: Make sure topiclist queries always join against workflow table T121644 (duration: 00m 59s)
  • 19:45 thcipriani@tin: Synchronized php-1.29.0-wmf.17/includes/Revision.php: getRevisionText() cache the converted text (duration: 00m 44s)
  • 19:44 mutante: rsyncing /srv of netmon1001 to /srv/netmon1001 on gerrit2001 (T125020)
  • 19:37 jynus: deploying m2 dns additions on codfw
  • 19:12 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1094 with full weight (duration: 00m 43s)
  • 17:10 _joe_: restarted ocg on ocg1001, not serving http queries
  • 16:55 jynus: shutting down es2016's mariadb to clone to es2015
  • 15:41 hashar@tin: Synchronized php-1.29.0-wmf.16/languages/classes/LanguageKk.php: Check for string initialization in ucfirst() to make HHVM 3.18 happy - T161095 (duration: 00m 44s)
  • 15:40 hashar@tin: Synchronized php-1.29.0-wmf.16/languages/classes/LanguageAz.php: Check for string initialization in ucfirst() to make HHVM 3.18 happy - T161095 (duration: 00m 48s)
  • 15:36 hashar: Deploying LanguageAz.php and LanguageKk.php hotfix for HHVM 3.18 on mwdebug* and mw1261 - T161095
  • 15:34 hashar@tin: Synchronized php-1.29.0-wmf.17/languages/classes/LanguageKk.php: Check for string initialization in ucfirst() to make HHVM 3.18 happy - T161095 (duration: 00m 54s)
  • 15:33 hashar@tin: Synchronized php-1.29.0-wmf.17/languages/classes/LanguageAz.php: Check for string initialization in ucfirst() to make HHVM 3.18 happy - T161095 (duration: 00m 59s)
  • 15:25 ema: cp*: removed linux-image-amd64, linux-image-3.16.0-4-amd64 and linux-image-4.4.0-1-amd64 to reduce churn
  • 14:54 moritzm: rebooting elastic2001 to Linux 4.9
  • 14:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1087 original weight - T137191 (duration: 00m 44s)
  • 14:24 marostegui: Deploy schema change s4 to db1068 - https://phabricator.wikimedia.org/T160415 https://phabricator.wikimedia.org/T73563
  • 14:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081, depool db1068 T160415 - T73563 (duration: 00m 43s)
  • 13:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1087 weight - T137191 (duration: 00m 47s)
  • 13:39 volans: stopped ircecho to avoid the message spam
  • 13:15 dcausse: eu swat done
  • 13:09 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [cirrus] Enable the completion suggester (duration: 00m 43s)
  • 13:07 bblack@puppetmaster1001: conftool action : set/ttl=275; selector: dnsdisc=appservers-rw
  • 12:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Enable db1087 for API - T137191 (duration: 00m 42s)
  • 12:29 dcausse: cirrus: reindexing lost writes (2017-03-21T13:30:00Z to 2017-03-21T17:50:00Z) during es5 upgrade in elastic@eqiad (T157479)
  • 12:26 marostegui: Deploy schema change on s4 to db1081 and labsdb1011 - T160415 T73563
  • 12:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1084, depool db1081 T160415 - T73563 (duration: 00m 43s)
  • 12:20 gehel: maps restarting kartotherian - T150354
  • 12:18 gehel: installing latest mapnik version on maps servers
  • 12:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 with low weight - T137191 (duration: 00m 43s)
  • 12:09 gehel: maps upgrade to nodejs 6 completed - T150354
  • 12:09 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1004.eqiad.wmnet
  • 12:05 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1004.eqiad.wmnet
  • 12:04 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1003.eqiad.wmnet
  • 12:02 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1003.eqiad.wmnet
  • 12:01 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1002.eqiad.wmnet
  • 11:58 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1002.eqiad.wmnet
  • 11:57 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1001.eqiad.wmnet
  • 11:54 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1001.eqiad.wmnet
  • 11:53 gehel: maps codfw fully upgraded to nodejs 6, starting upgrade on maps eqiad - T150354
  • 11:51 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2003.codfw.wmnet
  • 11:51 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2004.codfw.wmnet
  • 11:46 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2004.codfw.wmnet
  • 11:41 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2002.codfw.wmnet
  • 11:34 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2002.codfw.wmnet
  • 11:33 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2001.codfw.wmnet
  • 11:27 gehel: maps2001.codfw.wmnet upgraded to nodejs6
  • 11:19 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2001.codfw.wmnet
  • 11:15 akosiaris: Enable IGMP snooping for private1-d-eqiad on asw2-d. T133387
  • 11:15 akosiaris: Enable IGMP snooping for private1-d-eqiad. T133387
  • 11:05 gehel: disabling puppet on all maps servers - T150354
  • 11:04 gehel: upgrade maps to nodejs 6 - T150354
  • 10:53 akosiaris: cr1-eqiad: set ae4 and members to enable again. T133387
  • 10:41 akosiaris: reboot asw2-d T133387
  • 10:31 dcausse: cirrus: rebuilding comp suggest indices in elastic@eqiad
  • 10:15 akosiaris: Upgrading asw2-d-eqiad to JunOS 14.1X53 (T133387)
  • 10:09 akosiaris: cr1-eqiad: set ae4 and members to disable. T133387
  • 09:55 moritzm: upgrading mw1261 to HHVM 3.18.1
  • 09:50 moritzm: upgrading mwdebug* to HHVM 3.18.1
  • 09:40 marostegui: Deploy alter table s4 (commonswiki) db1084 - T73563 T160415
  • 09:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 T160415 - T73563 (duration: 00m 43s)
  • 09:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1091 T160415 - T73563 (duration: 00m 43s)
  • 09:20 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus2002.codfw.wmnet
  • 08:46 marostegui: Stop MySQL db1070 to clone db1087 from it - T137191
  • 07:53 dcausse: rebuilding ttmserver index in elastic@eqiad to catch up on writes lost during es5 upgrade
  • 07:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 T160415 - T73563 (duration: 00m 43s)
  • 07:10 oblivian@puppetmaster1001: conftool action : set/ttl=300; selector: dnsdisc=.*
  • 07:05 marostegui: Stop MySQL db1087 - T137191
  • 06:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 - T137191 (duration: 00m 43s)
  • 06:43 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2037 T160415 - T73563 (duration: 00m 43s)
  • 06:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1092 weight - T137191 (duration: 00m 49s)
  • 05:44 _joe_: finished tests on citoid/dns discovery; restbase successfully detects the change
  • 05:18 _joe_: depooling temporarily citoid in eqiad from dns discovery
  • 02:40 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Mar 22 02:40:42 UTC 2017 (duration 5m 29s)
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 13m 20s)
  • 02:02 krinkle@tin: Synchronized errorpages/: minor tweaks - I60344bd519d (duration: 00m 54s)
  • 00:15 Dereckson: SWAT done.
  • 00:15 dereckson@tin: Synchronized php-1.29.0-wmf.16/extensions/CirrusSearch/: CompSuggest: Increase default limit from 50 to 255 + speed optimization (Gerrit:343962 + Gerrit:343966) (duration: 00m 55s)
  • 00:05 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Allow translationadmin self-add for beta.wikiversity admins (T160120) (duration: 00m 43s)

2017-03-21

  • 23:57 eileen: update civicrm from 92e3b85 to d3c439f
  • 23:44 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Mapframe on sv.wikipedia (T161032) (duration: 00m 43s)
  • 23:29 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Translate on beta.wikiversity (T160120) (duration: 00m 45s)
  • 23:19 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Make rcenhancedfilters available as beta feature, enable on test wikis (Gerrit:343435 + Gerrit:343436) (duration: 00m 51s)
  • 22:45 mutante: lists: deactivate arbcom-ko per T160892 and Google translation of Korean talk pages
  • 22:44 Dereckson: Run namespaceDupes on pnbwiki (T159976)
  • 22:28 Dereckson: Create Translate tables on betawikiversity (T160120)
  • 20:59 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: Revert Group0 to 1.29.0-wmf.17
  • 20:56 demon@tin: Synchronized wmf-config/InitialiseSettings.php: logging for bad header stuff (duration: 00m 52s)
  • 20:33 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: T161001 Turn off completion suggester until length error is fixed (duration: 00m 44s)
  • 20:29 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: Group0 to 1.29.0-wmf.17
  • 19:54 thcipriani@tin: Finished scap: testwiki to php-1.29.0-wmf.17 and rebuild l10n cache (duration: 51m 16s)
  • 19:35 mutante: phab2001 - same as iridium, phab search config change
  • 19:33 mutante: iridium - ran puppet after gerrit:343936 - phabricator config change to use cluster search applied
  • 19:22 chasemp: clean out admin-monitoring for nova-fullstack T160908
  • 19:10 mutante: ruthenium - dev API enabled in parsoid config for parsoid rt tests
  • 19:03 thcipriani@tin: Started scap: testwiki to php-1.29.0-wmf.17 and rebuild l10n cache
  • 18:18 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase weight of db1092 and db1094 (duration: 00m 42s)
  • 18:02 twentyafterfour: refreshing phabricator's elasticsearch index in eqiad
  • 17:56 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: [es5 upgrade] step 4: repool eqiad for writes (3/3) (duration: 00m 42s)
  • 17:54 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [es5 upgrade] step 4: repool eqiad for writes (2/3) (duration: 00m 42s)
  • 17:53 dcausse@tin: Synchronized wmf-config/CommonSettings.php: [es5 upgrade] step 4: repool eqiad for writes (1/3) (duration: 00m 42s)
  • 17:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1092 weight - T137191 (duration: 00m 42s)
  • 17:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1092 weight - T137191 (duration: 00m 45s)
  • 17:13 thcipriani: starting branch cut for 1.29.0-wmf.17
  • 17:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 with low weight - T137191 (duration: 00m 42s)
  • 17:02 urandom: T111113: Rolling restart of RESTBase, eqiad, complete
  • 16:52 urandom: T111113: Rolling restart of RESTBase, eqiad
  • 16:41 urandom: T111113: Rolling restart of RESTBase, codfw, complete
  • 16:17 urandom: T111113: Enabling RESTBase client encryption on (remaining) codfw nodes
  • 16:11 urandom: T111113: Enabling RESTBase client encryption on restbase2001.codfw.wmnet (canary)
  • 15:56 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2020.codfw.wmnet
  • 15:27 moritzm: removed "Directory Managers" group from LDAP (Bug T157131)
  • 15:01 bd808@tin: Synchronized php-1.29.0-wmf.16/extensions/OpenStackManager/special/SpecialNovaInstance.php: SpecialNovaInstance: Remove some totally useless domain code. (T160995) (duration: 00m 43s)
  • 14:58 gehel: elasticsearch upgrade on eqiad is completed - T157479
  • 14:50 moritzm: installing gnutls security updates on trusty (jessie already fixed)
  • 14:44 gehel: elasticsearch eqiad, full cluster restart after cleanup of known old indices - T157479
  • 14:39 gehel: deleting old v2 indices from each elasticsearch server - T157479
  • 14:34 gehel: deleting old v2 indices from elastic1030: azbwiki_general_first, vewikimedia_content_1415331110, vewikimedia_general_1415331150 - T157479
  • 14:07 gehel: upgrading elasticsearch eqiad to v5.x - T157479
  • 14:01 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2044, depool db2037 T160415 - T73563 (duration: 00m 42s)
  • 13:44 dcausse: eu SWAT done
  • 13:39 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [es5 upgrade] Enable completion suggester (duration: 00m 42s)
  • 13:30 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: [es5 upgrade] step 3: depool eqiad for writes (take 2) (3/3) (duration: 00m 41s)
  • 13:29 gehel: rolling restart of wdqs to load new configuration options
  • 13:29 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [es5 upgrade] step 3: depool eqiad for writes (take 2) (2/3) (duration: 00m 43s)
  • 13:27 dcausse@tin: Synchronized wmf-config/CommonSettings.php: [es5 upgrade] step 3: depool eqiad for writes (take 2) (1/3) (duration: 00m 42s)
  • 13:15 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T157111 pagePreviews: Increase perf instrumentation sample (duration: 00m 58s)
  • 13:14 Reedy: Make that clear 2FA for RickinBaltimore per T160671
  • 13:12 Reedy: Clear centralauth for RickinBaltimore per T160671
  • 12:54 moritzm: installing r-base security updates
  • 12:47 gehel: running stress and bonnie on elastic2020 - T149006
  • 12:34 Dereckson: Created OATHAuth tables on projectcomwiki (T143138)
  • 12:27 Dereckson: Create account Superzerocool on projectcomwiki (bureaucrat, T143138)
  • 11:00 ema: upgrading twisted to 16.2.0 on lvs1007-12 T160433
  • 10:33 marostegui: Run pt-table-checksum on s6 (ruwiki) - https://phabricator.wikimedia.org/T160509
  • 09:42 moritzm: installing libevent security updates on remaining hosts in eqiad
  • 09:42 marostegui: Stop MySQL db1070 to clone db1092 from it - T137191
  • 09:14 akosiaris: enable bacula daemons on helium, everything looks ok
  • 09:09 moritzm: installing wireshark security updates
  • 09:06 hashar: CI deploying config hack "High priority test pipeline"  : https://gerrit.wikimedia.org/r/343318 - T160667
  • 08:43 gehel: shutting down elasticsearch on elastic2020, investigating T149006
  • 07:50 gehel: banning elastic2020 from cluster to investigate T149006
  • 07:36 marostegui: Stop mysql db1092 for maintenance - T137191
  • 07:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 - T137191 (duration: 00m 42s)
  • 07:22 marostegui: Run pt-table-checksum on s6 (jawiki) - T160509
  • 07:18 marostegui: Deploy schema change on db2044 and labsdb1009 (s4) - https://phabricator.wikimedia.org/T160415 - https://phabricator.wikimedia.org/T73563
  • 07:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2044 - T160415 - T73563 (duration: 00m 41s)
  • 06:49 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2051 - T160415 - T73563 (duration: 01m 07s)
  • 06:01 joal@tin: Finished deploy [analytics/refinery@c3a9139]: (no justification provided) (duration: 06m 39s)
  • 05:55 joal@tin: Started deploy [analytics/refinery@c3a9139]: (no justification provided)
  • 03:38 eileen: update civicrm from 21afe66 to 92e3b85
  • 03:10 eileen: update civicrm from 0ed1659 to 21afe66
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Mar 21 02:37:33 UTC 2017 (duration 5m 22s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 13m 22s)
  • 01:59 eileen: update civicrm from f454f16 to 0ed1659
  • 00:37 Amir1: ladsgroup@terbium:/srv/mediawiki/php-1.29.0-wmf.16$ mwscript extensions/ORES/maintenance/PopulateDatabase.php --wiki=etwiki is done now (T159609)
  • 00:30 Amir1: ladsgroup@terbium:/srv/mediawiki/php-1.29.0-wmf.16$ mwscript extensions/ORES/maintenance/PopulateDatabase.php --wiki=etwiki (T159609)
  • 00:28 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES review tool in etwiki (T159609) (duration: 00m 42s)
  • 00:13 Amir1: mwscript maintenance/sql.php --wiki=etwiki extensions/ORES/sql/(ores_model|ores_classification).sql (T159609)
  • 00:04 Krinkle: mwscript deleteEqualMessages.php on public wikis (T45917)
  • 00:02 eileen: update civicrm from e058e8c to f454f16

2017-03-20

  • 23:59 mutante: phab2001 / iridium - running puppet after gerrit:343635 - switches phab search to codfw
  • 23:58 dereckson@tin: Synchronized php-1.29.0-wmf.16/extensions/CirrusSearch/includes/CompletionSuggester.php: Don't pass null suggest queries to elasticsearch (T160896) (duration: 00m 42s)
  • 23:54 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Restrict page images to lead section (T152115) (duration: 00m 43s)
  • 23:48 dereckson@tin: Synchronized php-1.29.0-wmf.16/extensions/CirrusSearch/includes/BuildDocument/Completion/SuggestBuilder.php: Gerrit:343754 Allow completion suggester to work with titles that look like integers (duration: 00m 45s)
  • 23:47 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable wgCiteResponsiveReferences on fr. en. it. la. no.wp + en.wikt (duration: 00m 46s)
  • 23:39 dereckson@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Gerrit:343781 Test ORES migration on ruwiki beta too (labs only, no-op in prod) (duration: 00m 42s)
  • 22:57 mutante: ruthenium: running puppet after gerrit:343782 added missing diffserver unit file. puppet run looked good: Visualdiff::Server[diffserver]/Service[diffserver]/ensure: ensure changed 'stopped' to 'running', systemctl status says failed though
  • 22:54 ppchelko@tin: Finished deploy [trending-edits/deploy@e4fa9b8]: Config: Set up 'trends_at' property T160127 (duration: 06m 20s)
  • 22:47 ppchelko@tin: Started deploy [trending-edits/deploy@e4fa9b8]: Config: Set up 'trends_at' property T160127
  • 22:45 ejegg: updated payments-wiki from f991f15 to 9622a4b
  • 22:38 ppchelko@tin: Finished deploy [trending-edits/deploy@5d3eb7f]: Do not purge articles that have trended T160127 (duration: 07m 57s)
  • 22:31 mutante: ruthenium - gerrit:343682 applied - puppet: OK nginx: OK diffserver service refresh: failed @ssastry
  • 22:30 ppchelko@tin: Started deploy [trending-edits/deploy@5d3eb7f]: Do not purge articles that have trended T160127
  • 20:52 mutante: DNS - new Wikipedias "khw" (Khowar) and "kbp" (Kabiye) created (T160868) (T160865) ( on ns0/ns1: authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones to trigger template recreation after edit to langs.tmpl)
  • 20:47 mutante: DNS - ns2 - authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones to create new WP languages 'khw' and 'kbp'
  • 20:19 bsitzmann@tin: Finished deploy [mobileapps/deploy@815ebb5]: Update mobileapps to c0ab01d (duration: 07m 31s)
  • 20:14 reedy@tin: Synchronized php-1.29.0-wmf.16/includes/api/ApiQueryAllPages.php: Limit query=allpages filterredir if MiserMode T160916 (duration: 00m 42s)
  • 20:12 reedy@tin: Synchronized php-1.29.0-wmf.16/includes/specials/SpecialAllPages.php: Re-enable Special:AllPages, disable redirect filter if MiserMode T160916 (duration: 00m 42s)
  • 20:12 bsitzmann@tin: Started deploy [mobileapps/deploy@815ebb5]: Update mobileapps to c0ab01d
  • 19:45 mutante: lists: disabled wikimediaro-l due to inactivity (disabling lists is easy nowadays and also revertable): fermium: sudo /usr/local/sbin/disable_list <list name> | (T146563)
  • 19:42 mobrovac@tin: Finished deploy [changeprop/deploy@decb6a1]: (no justification provided) (duration: 00m 56s)
  • 19:41 mobrovac@tin: Started deploy [changeprop/deploy@decb6a1]: (no justification provided)
  • 18:39 thcipriani@tin: Synchronized wmf-config: SWAT: Revert "Revert "Turn off patrolling for FlaggedRevs in bswiki"" T158662 (duration: 00m 44s)
  • 18:28 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: pagePreviews: Enable by default on "stage 0" wikis T136602 (duration: 00m 42s)
  • 18:18 ariel@tin: Finished deploy [dumps/dumps@91d3215]: more default config fixes, flagged rev table config fix (duration: 00m 02s)
  • 18:18 ariel@tin: Started deploy [dumps/dumps@91d3215]: more default config fixes, flagged rev table config fix
  • 17:35 akosiaris: slow rolling restart of redis databases in codfw T159850
  • 17:22 ariel@tin: Finished deploy [dumps/dumps@80d88cd]: fix buglet due to new default config file (duration: 00m 02s)
  • 17:22 ariel@tin: Started deploy [dumps/dumps@80d88cd]: fix buglet due to new default config file
  • 17:09 gehel@tin: Finished deploy [wdqs/wdqs@e9e7c95]: (no justification provided) (duration: 01m 41s)
  • 17:07 gehel@tin: Started deploy [wdqs/wdqs@e9e7c95]: (no justification provided)
  • 16:48 mobrovac: restbase deploying e4c327b0
  • 15:59 hashar: Special:AllPages being blank has a public task: https://phabricator.wikimedia.org/T160916
  • 15:50 dcausse@tin: Synchronized wmf-config/CommonSettings.php: Revert: T157479 [es5 upgrade] step 3: depool eqiad for writes (1/3) (duration: 00m 42s)
  • 15:49 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: Revert: T157479 [es5 upgrade] step 3: depool eqiad for writes (2/3) (duration: 00m 42s)
  • 15:40 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: Revert: T157479 [es5 upgrade] step 3: depool eqiad for writes (3/3) (duration: 00m 42s)
  • 15:39 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: T157479 [es5 upgrade] step 3: depool eqiad for writes (3/3) (duration: 00m 41s)
  • 15:37 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T157479 [es5 upgrade] step 3: depool eqiad for writes (2/3) (duration: 00m 46s)
  • 15:34 dcausse@tin: Synchronized wmf-config/CommonSettings.php: T157479 [es5 upgrade] step 3: depool eqiad for writes (1/3) (duration: 00m 45s)
  • 15:15 hashar@tin: Synchronized php-1.29.0-wmf.16/extensions/Translate: ElasticTTM: set the index when deleting docs (duration: 00m 53s)
  • 15:08 hashar@tin: Synchronized php-1.29.0-wmf.16/includes/specials/SpecialWatchlist.php: Restoring Watchlist: Fix form and preference overriding https://gerrit.wikimedia.org/r/#/c/343433/ (duration: 00m 51s)
  • 14:36 hashar@tin: Synchronized php-1.29.0-wmf.16/includes/specials/SpecialAllPages.php: Disable SpecialAllPages on all wikis. Temporary workaround (duration: 01m 08s)
  • 14:36 hashar: Disabled Special:AllPages on all wikis, making it serve a blank page instead. ( https://gerrit.wikimedia.org/r/#/c/343647/ )
  • 14:32 akosiaris: disable puppet on all rdb* nodes to shepherd https://gerrit.wikimedia.org/r/343027 into production. T159850
  • 14:28 elukey: (Correct one) Temporary hack for T160888 - moved /srv/mw-log/archive/api.log-20170224.gz to /srv/mw-log/archive/api_log_backup_elukey/ to avoid rsync timeouts to stat1002 (the file is big and close to being deleted for retention)
  • 14:27 elukey: Temporary hack for T160886 - moved /srv/mw-log/archive/api.log-20170224.gz to /srv/mw-log/archive/api_log_backup_elukey/ to avoid rsync timeouts to stat1002 (the file is big and close to being deleted for retention)
  • 14:22 hashar@tin: Synchronized php-1.29.0-wmf.16/includes/specials/SpecialWatchlist.php: reverts commit SpecialWatchlist.php 0d675d2 (duration: 00m 43s)
  • 13:55 jynus: shutting down es2015 for maintenance T160242
  • 13:41 zfilipin@tin: Synchronized wmf-config/: SWAT: Enable CollaborationKit on beta enwiki (T138325) (duration: 00m 44s)
  • 13:35 zfilipin@tin: Synchronized php-1.29.0-wmf.16/tests/phpunit/includes/specials/SpecialWatchlistTest.php: SWAT: Watchlist: Fix form and preference overriding (T160734) (duration: 00m 48s)
  • 13:34 zfilipin@tin: Synchronized php-1.29.0-wmf.16/includes/specials/SpecialWatchlist.php: SWAT: Watchlist: Fix form and preference overriding (T160734) (duration: 01m 01s)
  • 11:39 akosiaris: return rdb1007 client-output-buffer-limit config to initially configured value T159850
  • 10:09 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 after crash (duration: 00m 47s)
  • 09:42 godog: swift bump ms-be2028 -> ms-be2039 weight - T158337
  • 09:37 jynus: restarting db1094 for upgrade
  • 09:02 dcausse: refreshing ttm documents in elastic@codfw
  • 08:47 hashar: Jenkins: depooling / deleting Precise instances. T158652
  • 08:28 dcausse: cirrus: refreshing all comp suggest indices in elastic@codfw
  • 02:23 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Mar 20 02:23:38 UTC 2017 (duration 5m 25s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 06m 44s)

2017-03-19

  • 18:40 ariel@tin: Finished deploy [dumps/dumps@8cff500]: generate json status files for use by downloaders (duration: 00m 02s)
  • 18:39 ariel@tin: Started deploy [dumps/dumps@8cff500]: generate json status files for use by downloaders
  • 10:43 ariel@tin: Finished deploy [dumps/dumps@87d748b]: dump magic words and namespace info (duration: 00m 02s)
  • 10:43 ariel@tin: Started deploy [dumps/dumps@87d748b]: dump magic words and namespace info
  • 02:24 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Mar 19 02:24:53 UTC 2017 (duration 5m 23s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 07m 29s)

2017-03-18

  • 20:16 chasemp: labstore1005 service nfs-exportd restart
  • 19:43 chasemp: test on labstore1004 nfs-exportd candidate /root/nfs-exportd-candidate.py --observer-pass xxxxxx --interval 0 --config-path /etc/nfs-mounts.yaml --exports-d-path /root/fake_export/ --debug
  • 18:38 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 after crash (duration: 01m 02s)
  • 18:20 jynus: powercycling db1094
  • 02:36 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Mar 18 02:36:00 UTC 2017 (duration 5m 23s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 11m 29s)
  • 00:06 mutante: lists: creating new list wikimedia-nys (Noongar language) (T159499)

2017-03-17

  • 23:48 mutante: lists: creating new list wikispecies-admin (T159625)
  • 23:36 catrope@tin: Synchronized php-1.29.0-wmf.16/extensions/VisualEditor/lib/ve: Fixes for T154123 T160479 T160190 T160197 (duration: 00m 42s)
  • 23:31 mutante: lists: making Steinsplitter and Zhuyifei1999 list admins of commons-poty (T160672)
  • 16:16 elukey: reimage restbase-dev1001.eqiad.wmnet
  • 14:01 marostegui: Deploy schema change on dbstore1001 and db2051 (s4) - T160415 - T73563
  • 14:00 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2051 - T160415 - T73563 (duration: 00m 42s)
  • 13:39 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2058 - T160415 - T73563 (duration: 01m 06s)
  • 12:46 chasemp: labsdb10[01|03] maintain-views --table user_groups --all-database --replace-all --debug
  • 12:44 chasemp: labsdb10[09|10|11] maintain-views --table user_groups --all-database --replace-all --debug
  • 11:33 elukey: reimage analytics1044 (Hadoop Worker node) to Debian Jessie
  • 10:58 akosiaris: reimage helium.eqiad.wmnet to jessie
  • 09:04 jynus: killing 11h-running query on db1089 from terbium (orphan process)
  • 08:32 marostegui: Deploy schema change on dbstore2002 and db2058 (s4) - T160415 T73563
  • 08:31 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2058 - T160415 - T73563 (duration: 00m 43s)
  • 08:00 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2065 - T160415 - T73563 (duration: 00m 44s)
  • 07:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1070 - T157931 (duration: 00m 45s)
  • 02:39 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Mar 17 02:39:10 UTC 2017 (duration 5m 22s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 12m 12s)
  • 01:54 urandom: T111113: Rolling restarts of Cassandra complete
  • 01:12 urandom: T111113: Rolling restarts of Cassandra, eqiad, rack 'd'
  • 00:41 ebernhardson@tin: Synchronized php-1.29.0-wmf.16/resources/src/mediawiki.special/: SWAT: Fix search result percentage width when no interwiki sidebar shown (duration: 00m 42s)
  • 00:40 ebernhardson@tin: Synchronized php-1.29.0-wmf.16/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: SWAT: enabled sister search AB test on 8 wikis (duration: 00m 43s)
  • 00:34 urandom: T111113: Rolling restarts of Cassandra, eqiad, rack 'b'
  • 00:23 urandom: T111113: Rolling restarts of Cassandra on restbase1016
  • 00:13 urandom: T111113: Rolling restarts of Cassandra on restbase1011
  • 00:03 urandom: T111113: Rolling restarts of Cassandra on restbase1010

2017-03-16

  • 23:46 reedy@tin: Synchronized php-1.29.0-wmf.16/extensions/CodeReview: Fix preg_ error again (duration: 00m 47s)
  • 23:25 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable PageViewInfo to group2 T125917 (duration: 00m 49s)
  • 23:24 urandom: T111113: Rolling restarts of Cassandra in codfw, rack 'd' *correction*
  • 23:24 urandom: T111113: Rolling restarts of Cassandra in codfw, rack 'b'
  • 22:34 urandom: T111113: Rolling restarts of Cassandra in codfw, rack 'a'
  • 21:50 urandom: T111113: Rolling restarts of Cassandra in codfw, rack 'b'
  • 21:36 urandom: T111113: Restarting Cassandra on restbase1007-{b,c} to enable (optional) client encryption
  • 21:19 urandom: T111113: Restarting Cassandra on restbase1007-a to enable (optional) client encryption
  • 21:17 ebernhardson: reindexing group2 in cirrussearch for codfw downtime during 2.x -> 5.x upgrade
  • 21:06 ejegg: updated new CiviCRM from cca5921 to e058e8c
  • 20:08 mutante: repooled elastic2010, depooled correct host elastic2020 instead (T149006)
  • 20:08 dzahn@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=elastic2020.codfw.wmnet
  • 20:08 dzahn@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2010.codfw.wmnet
  • 20:06 mutante: depooled elastic2010 since it is powered-off/down. (set/pooled=inactive) - (T149006)
  • 20:05 dzahn@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=elastic2010.codfw.wmnet
  • 20:05 twentyafterfour: restarted phd on iridium to fix workers dying
  • 19:26 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.16
  • 19:07 thcipriani@tin: Synchronized php-1.29.0-wmf.16/extensions/VisualEditor/lib/ve: SWAT: Update VE core submodule to wmf/1.29.0-wmf.16 HEAD (50a6323d7) T154123 T160479 (duration: 00m 44s)
  • 19:02 gehel: restart relforge to activate new plugins - T160674
  • 16:57 ebernhardson: started cirrus completion indices rebuild for group2 on wasat.codfw.wmnet
  • 16:48 ebernhardson: manually adjusted wikiversions on wasat.codfw.wmnet to point all wikis at wmf.16 to rebuild cirrus completion search indices before group2 rolls forward
  • 16:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1070 - T157931 (duration: 00m 41s)
  • 16:32 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2065 - T160415 - T73563 (duration: 00m 42s)
  • 16:01 marostegui: Deploy schema change on s4 (commonswiki) https://phabricator.wikimedia.org/T73563 and https://phabricator.wikimedia.org/T160415
  • 16:00 elukey: racadm serveraction powerdown on mw2256 for hw maintenance
  • 15:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 - T160415 (duration: 00m 42s)
  • 15:44 godog: reboot ms-be1008 after disk swap to clear stuck mkfs.xfs
  • 15:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 - T160415 (duration: 00m 42s)
  • 15:27 otto@tin: Finished deploy [eventlogging/eventbus@75ab39c]: /v1/schemas/:schema_uri endpoint, T159179 (duration: 00m 14s)
  • 15:27 otto@tin: Started deploy [eventlogging/eventbus@75ab39c]: /v1/schemas/:schema_uri endpoint, T159179
  • 15:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T160415 (duration: 00m 42s)
  • 15:13 elukey: restart hhvm on mw1200, high load and queued requests - hhvm-dump-debug on /tmp/hhvm.27107.bt.
  • 15:09 elukey: restart hhvm on mw1207, high load and queued requests - hhvm-dump-debug on /tmp/hhvm.27441.bt.
  • 15:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 - T160415 (duration: 00m 42s)
  • 14:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 - T160415 (duration: 00m 41s)
  • 14:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 - T160415 (duration: 00m 42s)
  • 14:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1070 with low weight - T157931 (duration: 00m 45s)
  • 14:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 - T160415 (duration: 00m 43s)
  • 14:12 Dereckson: EU SWAT, round 2, done
  • 14:11 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Create Wikichanzo namespace for swwiki (T158041) (duration: 00m 42s)
  • 14:07 dereckson@tin: Synchronized wmf-config/throttle.php: Add Odia Wikipedia's 100 Women Editathon throttle rule (T160619) (duration: 00m 57s)
  • 13:52 Dereckson: Resume EU SWAT for two new changes
  • 13:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1045 - T160415 (duration: 00m 58s)
  • 13:38 marostegui: Shutdown es2015 for maintenance - T160242
  • 13:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1026 - T160415 (duration: 00m 42s)
  • 13:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1026 - T160415, Repool db1067 - T160435 (duration: 00m 42s)
  • 13:01 addshore: EU SWAT done
  • 12:59 addshore@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule T160427 (lift of IP cap for RIT - March 25, 2017) (duration: 00m 43s)
  • 12:49 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: wmgUseInterwikiSorting true for wikidataclients T160465 T150183 (duration: 00m 42s)
  • 12:39 marostegui: Deploy schema change on s5 - T160415
  • 12:38 addshore@tin: Synchronized php-1.29.0-wmf.16/extensions/InterwikiSorting: Use ExtensionFunctions instead of BeforeInitialize hook T160465 (duration: 00m 44s)
  • 12:31 addshore@tin: Synchronized php-1.29.0-wmf.16/extensions/InterwikiSorting: Use ExtensionFunctions instead of BeforeInitialize hook T160465 (duration: 00m 43s)
  • 12:17 addshore@tin: Synchronized php-1.29.0-wmf.15/extensions/InterwikiSorting: Use ExtensionFunctions instead of BeforeInitialize hook T160465 (duration: 00m 43s)
  • 11:48 godog: repair prometheus' leveldb database archived_fingerprint_to_metric on bast3002, upgrade prometheus to latest version from jessie-backports
  • 11:26 moritzm: enabled BBR as TCP congestion control algorithm on cp1008
  • 11:04 joal@tin: Finished deploy [analytics/aqs/deploy@006bf8c]: (no justification provided) (duration: 03m 30s)
  • 11:01 joal@tin: Started deploy [analytics/aqs/deploy@006bf8c]: (no justification provided)
  • 10:59 joal@tin: Finished deploy [analytics/aqs/deploy@006bf8c]: (no justification provided) (duration: 02m 13s)
  • 10:56 joal@tin: Started deploy [analytics/aqs/deploy@006bf8c]: (no justification provided)
  • 10:12 volans: upgraded cumin to version 0.0.2 in the repository and on neodymium/sarin
  • 10:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 - T160415 (duration: 00m 41s)
  • 09:56 moritzm: installing libevent security updates
  • 09:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T160415 (duration: 00m 42s)
  • 09:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 - T160415 (duration: 00m 42s)
  • 09:46 moritzm: upgrading apache on cobalt/gerrit
  • 09:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 - T160415 (duration: 00m 47s)
  • 09:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T160415 (duration: 00m 42s)
  • 09:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T160415 (duration: 00m 42s)
  • 09:11 moritzm: upgrading apache on fermium/lists.wikimedia.org
  • 09:10 moritzm: upgrading apache on mendelevium/OTRS
  • 09:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 - T160415 (duration: 00m 42s)
  • 09:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 - T160415 (duration: 00m 41s)
  • 08:57 godog: codfw-prod: add ms-be203[1-9] - T158337
  • 08:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 - T160415 (duration: 00m 41s)
  • 08:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T160415 (duration: 00m 41s)
  • 08:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T160415 (duration: 00m 43s)
  • 08:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T160415 (duration: 00m 41s)
  • 08:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1080 - T160415 (duration: 00m 46s)
  • 08:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 - T160415 (duration: 00m 47s)
  • 08:12 moritzm: upgrading apache on einsteinium/icinga.wikimedia.org
  • 07:51 marostegui: Deploy schema change on s1 - T160415
  • 07:36 marostegui: Deploy schema change on s7 - T160415
  • 07:08 marostegui: Starting pt-table-checksum on s6 (frwiki) - T160509
  • 03:01 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Mar 16 03:01:37 UTC 2017 (duration 5m 50s)
  • 02:55 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 13m 39s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 08m 51s)
  • 00:14 eileen: updated civicrm from f1a3d64 to cca5921

2017-03-15

  • 23:36 twentyafterfour: train unblocked and wmf.16 is deployed to group1 wikis.
  • 23:32 twentyafterfour@tin: Synchronized php-1.29.0-wmf.16/extensions/ApiFeatureUsage/ApiFeatureUsageQueryEngineElastica.php: deploy I2d8603 refs T160578 T158997 (duration: 00m 42s)
  • 23:25 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Restrict page images to lead section on cawiki T152115 (duration: 00m 42s)
  • 23:17 thcipriani@tin: Synchronized wmf-config: SWAT: Set $wgOresExtension for I63b11eff3a4 T159763 (duration: 00m 44s)
  • 23:12 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy PageViewInfo to group1 T125917 (duration: 00m 43s)
  • 22:51 twentyafterfour@tin: Synchronized wmf-config/CirrusSearch-common.php: Deploy I4980da refs T160569 and T158997 (duration: 00m 42s)
  • 22:34 mutante: Cassandra test hosts: deploy break-fix gerrit:342912 , run puppet on cerium and praseodymium. on xenon puppet is disabled.
  • 21:54 twentyafterfour@tin: Synchronized wmf-config/CirrusSearch-common.php: Deploy I67d712 refs T160569 and T158997 (duration: 00m 42s)
  • 21:52 eileen: civicrm update from 639eb68 to f1a3d64
  • 21:29 twentyafterfour@tin: Synchronized wmf-config: deploy Iad9849 to fix T160569 and unblock the train refs T158997 (duration: 00m 49s)
  • 21:01 twentyafterfour@tin: Synchronized wmf-config: deploy I489c4a to fix T160569 and unblock the train refs T158997 (duration: 00m 45s)
  • 20:54 ladsgroup@tin: Finished deploy [ores/deploy@bc0bc74]: Mid-March deploy of ORES (T160279) (duration: 26m 46s)
  • 20:44 gehel: restarting postgresql on maps clusters - T160209
  • 20:38 urandom: T111113: Restarting xenon (RESTBase Staging) to enable client encryption (canary)
  • 20:34 jynus@tin: Synchronized wmf-config/db-eqiad.php: Move db1067 from s2 to s1 as a db1057 replacement (duration: 00m 42s)
  • 20:30 twentyafterfour: T160569 blocks the train until I can figure out what is causing it. The frequency is low so I haven't reverted to wmf.15, group 1 remains on wmf.16 refs T158997
  • 20:27 ladsgroup@tin: Started deploy [ores/deploy@bc0bc74]: Mid-March deploy of ORES (T160279)
  • 20:13 bsitzmann@tin: Finished deploy [mobileapps/deploy@fa43048]: Update mobileapps to bb8fcf2 (duration: 03m 51s)
  • 20:09 bsitzmann@tin: Started deploy [mobileapps/deploy@fa43048]: Update mobileapps to bb8fcf2
  • 20:05 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.16
  • 19:56 twentyafterfour@tin: Synchronized php-1.29.0-wmf.16/includes/specialpage/: deploy revert of 5b15728 (duration: 00m 44s)
  • 19:46 jynus: shutting down db1067 for maintenance (as a db1057 replacement) T160435
  • 19:16 mobrovac@tin: Finished deploy [changeprop/deploy@b68bf51]: Deploy producer fix for T159200 (duration: 00m 51s)
  • 19:15 mobrovac@tin: Started deploy [changeprop/deploy@b68bf51]: Deploy producer fix for T159200
  • 18:35 legoktm@tin: Synchronized php-1.29.0-wmf.16/resources/src/mediawiki.widgets/mw.widgets.SearchInputWidget.js: mw.widgets.SearchInputWidget: Do not pass to TextInputWidget - T148471 (2/2) (duration: 00m 42s)
  • 18:34 legoktm@tin: Synchronized php-1.29.0-wmf.16/includes/widget/SearchInputWidget.php: mw.widgets.SearchInputWidget: Do not pass to TextInputWidget - T148471 (1/2) (duration: 00m 41s)
  • 18:32 legoktm@tin: Synchronized php-1.29.0-wmf.16/includes/libs/filebackend/SwiftFileBackend.php: Make sure Swift store operations close the source file handle - T159607 (duration: 00m 44s)
  • 18:25 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy Linter to group0 and small wikis - T148609 (duration: 00m 42s)
  • 18:21 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy PageViewInfo to group0 - T125917 (duration: 00m 42s)
  • 18:20 otto@tin: Finished deploy [eventstreams/deploy@eb8698e]: T159200 (duration: 06m 18s)
  • 18:19 ppchelko@tin: Finished deploy [trending-edits/deploy@85be190]: Update to node-rdkafka 0.8.0. T159200 (duration: 06m 11s)
  • 18:19 legoktm@tin: Synchronized wmf-config/logging.php: Use custom LogstashFormatter - T145133, T151290 (duration: 00m 42s)
  • 18:15 mobrovac@tin: Finished deploy [changeprop/deploy@614cb4b]: Deploy for switching to librdkafka 0.9.4 T159200 (duration: 00m 33s)
  • 18:15 mobrovac@tin: Started deploy [changeprop/deploy@614cb4b]: Deploy for switching to librdkafka 0.9.4 T159200
  • 18:14 mobrovac: restbase deploying f047dabb
  • 18:13 ppchelko@tin: Started deploy [trending-edits/deploy@85be190]: Update to node-rdkafka 0.8.0. T159200
  • 18:13 otto@tin: Started deploy [eventstreams/deploy@eb8698e]: T159200
  • 18:12 ottomata: upgrading librdkafka on scb eqiad nodes T159200
  • 18:12 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Show 'Publish' not 'Save' on most public wikis - T131132 (duration: 00m 42s)
  • 18:08 mobrovac@tin: Finished deploy [changeprop/deploy@614cb4b]: Deploy to EQIAD canary for switching to librdkafka 0.9.4 T159200 (duration: 00m 20s)
  • 18:07 mobrovac@tin: Started deploy [changeprop/deploy@614cb4b]: Deploy to EQIAD canary for switching to librdkafka 0.9.4 T159200
  • 18:07 ppchelko@tin: Finished deploy [trending-edits/deploy@85be190]: Update to node-rdkafka 0.8.0. Canary on scb1001.eqiad.wmnet. T159200 (duration: 01m 07s)
  • 18:06 ppchelko@tin: Started deploy [trending-edits/deploy@85be190]: Update to node-rdkafka 0.8.0. Canary on scb1001.eqiad.wmnet. T159200
  • 18:06 otto@tin: Started deploy [eventstreams/deploy@eb8698e]: T159200
  • 17:55 ppchelko@tin: Finished deploy [trending-edits/deploy@85be190]: Update to node-rdkafka 0.8.0 in codfw. T159200 (duration: 03m 51s)
  • 17:53 mobrovac@tin: Finished deploy [changeprop/deploy@614cb4b]: Deploy to CODFW for switching to librdkafka 0.9.4 T159200 (duration: 01m 44s)
  • 17:52 otto@tin: Finished deploy [eventstreams/deploy@eb8698e]: T159200 (duration: 01m 35s)
  • 17:51 mobrovac@tin: Started deploy [changeprop/deploy@614cb4b]: Deploy to CODFW for switching to librdkafka 0.9.4 T159200
  • 17:51 ppchelko@tin: Started deploy [trending-edits/deploy@85be190]: Update to node-rdkafka 0.8.0 in codfw. T159200
  • 17:50 otto@tin: Started deploy [eventstreams/deploy@eb8698e]: T159200
  • 17:50 ottomata: upgrading librdkafka on scb in codfw T159200
  • 17:46 otto@tin: Finished deploy [eventstreams/deploy@eb8698e]: T159200 (duration: 00m 17s)
  • 17:46 otto@tin: Started deploy [eventstreams/deploy@eb8698e]: T159200
  • 17:43 mobrovac@tin: Finished deploy [changeprop/deploy@614cb4b]: Canary deploy for switching to librdkafka 0.9.4 T159200 (duration: 00m 53s)
  • 17:43 ppchelko@tin: Finished deploy [trending-edits/deploy@85be190]: Trending: Update to node-rdkafka 0.8.0. Canary on scb2001. T159200 (duration: 01m 21s)
  • 17:42 mobrovac@tin: Started deploy [changeprop/deploy@614cb4b]: Canary deploy for switching to librdkafka 0.9.4 T159200
  • 17:41 ppchelko@tin: Started deploy [trending-edits/deploy@85be190]: Trending: Update to node-rdkafka 0.8.0. Canary on scb2001. T159200
  • 17:21 demon@tin: Synchronized wmf-config/CommonSettings.php: Stop calling an idiot user an idiot (duration: 00m 42s)
  • 17:03 demon@tin: Synchronized wmf-config/: pruning old extensionmessages files (duration: 00m 49s)
  • 15:58 moritzm: upgraded jessie systems running HHVM in deployment-prep to 3.18.1+dfsg-1+wmf1
  • 15:47 moritzm: uploaded new HHVM 3.18 package with backported patch for stat_cache regression (T158176)
  • 15:45 marostegui: For the record: deployed schema change on s2 and s6 for image table (add an index) - T160415
  • 14:22 moritzm: installing chromium security update on osmium
  • 14:05 moritzm: uploaded python-phabricator 0.6.1-1~bpo8~trusty1 for trusty-wikimedia to apt.wikimedia.org (required for Phabricator support in offboarding script running on terbium (trusty))
  • 13:48 phuedx@tin: Synchronized wmf-config/InitialiseSettings.php: T160403: Add d to enwikisource's import list (duration: 00m 42s)
  • 13:37 phuedx@tin: Synchronized wmf-config/InitialiseSettings.php: T157111: pagePreviews: Enable perf instrumentation (duration: 00m 42s)
  • 13:18 phuedx@tin: Synchronized wmf-config/InitialiseSettings.php: 342456: Remove "editusercssjs". (duration: 02m 50s)
  • 13:14 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Enable Cognate on beta wiktionary sites T156241 Beta Only (again) (duration: 02m 45s)
  • 13:04 gehel: syncing puppet git repo on wdqs-puppet.wikidata-query.eqiad.wmflabs
  • 12:13 godog: deploy thumbor 0.1.36-1 on thumbor100*
  • 10:41 Dereckson: Run namespaceDupes.php for pnb.wiktionary (T159976): all looks good for this one
  • 10:37 Dereckson: Run namespaceDupes.php for pnb.wikipedia (T159976)
  • 10:34 ema: upgrade cp4001 (misc) and cp4011 (maps) to linux 4.9 T154934
  • 09:11 marostegui: Disable parallel replication on dbstore2002, dbstore2001, dbstore1002, dbstore1001 - T160407
  • 09:02 marostegui: Disable parallel replication on x1 slaves (db1029, db2033) - T160407
  • 08:27 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Enable Cognate on beta wiktionary sites T156241 Beta Only (duration: 02m 48s)
  • 08:26 moritzm: removed imagemagick 6.8.9.9-5+deb8u7+wmf1 from apt.wikimedia.org (the sharpen patch is folded into the new 6.8.9.9-5+deb8u8 security update)
  • 08:22 marostegui: Deploy alter table x1 testing parallel replication - T160407
  • 08:11 moritzm: installing imagemagick security updates
  • 07:26 marostegui: Enable parallel replication on x1 slaves - T160407
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 13m 33s)
  • 00:55 eileen: update civicrm from 31f19d6 to 639eb68
  • 00:41 maxsem@tin: Synchronized wmf-config/logging.php: https://gerrit.wikimedia.org/r/342778 (duration: 02m 46s)
  • 00:32 maxsem@tin: Synchronized php-1.29.0-wmf.16/extensions/RelatedSites/: Hide DMOZ links with https://gerrit.wikimedia.org/r/#/c/342753/ + https://gerrit.wikimedia.org/r/#/c/342768/ (duration: 02m 48s)
  • 00:27 maxsem@tin: Synchronized php-1.29.0-wmf.15/extensions/RelatedSites/: Hide DMOZ links with https://gerrit.wikimedia.org/r/#/c/342753/ + https://gerrit.wikimedia.org/r/#/c/342768/ (duration: 02m 48s)
  • 00:19 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/340697/2 (duration: 02m 53s)
  • 00:08 mutante: depooled mw2256 because it's down again (T155180)
  • 00:08 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet
  • 00:05 dzahn@puppetmaster1001: conftool action : get/pooled; selector: dc=eqiad,name=mw2256.codfw.wmnet

2017-03-14

  • 23:59 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/342148/ (duration: 02m 47s)
  • 23:55 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/342148/ (duration: 02m 47s)
  • 23:42 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: - (duration: 02m 50s)
  • 23:31 tgr@tin: Finished scap: T125917: Deploy PageViewInfo to testwiki (duration: 48m 58s)
  • 22:42 tgr@tin: Started scap: T125917: Deploy PageViewInfo to testwiki
  • 21:29 ebernhardson: reindexed search in group0 for Monday's codfw search downtime/upgrade
  • 20:45 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Beta Cluster only (duration: 02m 50s)
  • 20:17 twentyafterfour: scap was unable to connect to mw2256.codfw.wmnet
  • 20:14 twentyafterfour@tin: Finished scap: full scap of new branch, move test wikis to 1.29.0-wmf.16 refs T158997 (duration: 56m 05s)
  • 19:18 twentyafterfour@tin: Started scap: full scap of new branch, move test wikis to 1.29.0-wmf.16 refs T158997
  • 19:14 ema: restarting pybal on lvs1010-11 T160405
  • 19:13 Reedy: Delete 2FA for User:Conny per request on IRC. Identity verified via Lydia_WMDE
  • 18:42 nuria@tin: Finished deploy [eventlogging/analytics@417c40f]: (no justification provided) (duration: 00m 02s)
  • 18:42 nuria@tin: Started deploy [eventlogging/analytics@417c40f]: (no justification provided)
  • 18:39 gehel: removing swap from elasticsearch servers - T158884
  • 18:37 ottomata: upgrading librdkafka to 0.9.4 and restarting varnishkafka on cache text hosts
  • 18:19 ottomata: upgrading librdkafka to 0.9.4 and restarting varnishkafka on cache upload hosts
  • 18:13 ottomata: upgrading librdkafka to 0.9.4 and restarting varnishkafka on cache misc hosts
  • 18:11 nuria@tin: Finished deploy [eventlogging/analytics@c3ccb4a]: (no justification provided) (duration: 00m 03s)
  • 18:11 nuria@tin: Started deploy [eventlogging/analytics@c3ccb4a]: (no justification provided)
  • 17:07 ottomata: upgrading librdkafka to 0.9.4 on cache misc and restarting varnishkafka
  • 16:29 jynus: no response from db1057 after powercycle - trying to hard reset it
  • 16:10 urandom: T111113: Restart Cassandra in RESTBase Staging to enable optional client encryption
  • 15:39 godog: shut ms-be2002 for idrac / bios troubleshooting T155689
  • 15:24 chasemp: silence toolschecker precise job start check in anticipation of removal
  • 15:18 twentyafterfour: preparing to branch 1.29.0-wmf.16 refs T158997
  • 14:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1080 - T132416 (duration: 00m 40s)
  • 14:36 marostegui: Enabled parallel replication (5 threads) on db2033 (x1) - T160407
  • 14:20 chasemp: labsdb100[9|10|11] 'maintain-views --all-databases --table page --replace-all --debug'
  • 14:18 chasemp: labsdb1003 time maintain-views --all-databases --table page --replace-all --debug
  • 14:01 Dereckson: Purged portals URL
  • 13:56 dereckson@tin: Synchronized portals: Resync portals/ directory after touch (duration: 00m 42s)
  • 13:56 chasemp: labsdb1001 maintain-views --all-databases --table page --replace-all --debug
  • 13:46 dereckson@tin: Synchronized portals: Bump to e576c18522ff (duration: 00m 41s)
  • 13:45 dereckson@tin: Synchronized portals/prod/wikipedia.org/assets: Bump to e576c18522ff (duration: 00m 41s)
  • 13:18 elukey: started redis-cli --bigkeys -i 0.1 on rdb1008 (eqiad jobqueue slave)
  • 13:15 dereckson@tin: Synchronized portals: (no justification provided) (duration: 00m 41s)
  • 13:14 dereckson@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 41s)
  • 13:00 gehel: restarting elasticsearch on relforge1001 to test gelf appender
  • 12:41 elukey: reimage analytics1043 to Debian Jessie
  • 12:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1054 with full weight after warmup (duration: 00m 40s)
  • 12:28 jynus: stopping mariadb on db1057, preparing to backup and reimage
  • 12:24 addshore@tin: Synchronized dblists/: T150183 wmgUseInterwikiSorting true for all wikidata clients #1 #2 PT 4/4 (duration: 00m 41s)
  • 12:23 addshore@tin: Synchronized docroot/: T150183 wmgUseInterwikiSorting true for all wikidata clients #1 #2 PT 3/4 NOOP (duration: 00m 44s)
  • 12:19 addshore@tin: Synchronized wmf-config/CommonSettings.php: T150183 wmgUseInterwikiSorting true for all wikidata clients #1 #2 PT 2/4 (duration: 00m 41s)
  • 12:18 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T150183 wmgUseInterwikiSorting true for all wikidata clients #1 #2 PT 1/4 (duration: 00m 52s)
  • 09:15 addshore@tin: Synchronized dblists/interwikisorting.dblist: wmgUseInterwikiSorting true for wikidata clients, excluding wikipedias T150183 (duration: 00m 42s)
  • 08:38 elukey: moved some log files from /var/log/upstart/$logname.log.1 to /var/log/upstart/$logname.log.1.bis on labvirt1014, labtestvirt2001, labtestnet2001, labnet1001 to reduce cronspam
  • 08:15 moritzm: installing icu security updates on trusty (jessie already fixed)
  • 08:07 moritzm: installing icoutils security update on trusty (jessie already fixed)
  • 07:26 moritzm: installing python-imaging/pillow security updates on trusty (jessie already fixed)
  • 07:07 marostegui: Deploy alter table enwiki.revision db1080 - T132416
  • 07:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 - T132416 (duration: 00m 41s)
  • 07:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1083 - T132416 (duration: 00m 41s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 12m 36s)

2017-03-13

  • 23:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1054 after upgrade with low weight (duration: 00m 41s)
  • 22:29 bawolff@tin: Synchronized php-1.29.0-wmf.15/extensions/SemanticForms/includes/SF_ValuesUtils.php: Backport bb42c6f401b9 (duration: 00m 48s)
  • 21:40 bawolff: Deployed fix for T160266
  • 20:45 addshore: InterwikiSorting deploy (to group0) done
  • 20:43 addshore@tin: Synchronized wmf-config/CommonSettings.php: T150183 Enable InterwikiSorting on group0 #1 #2 PT 4/4 (duration: 00m 40s)
  • 20:42 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T150183 Enable InterwikiSorting on group0 #1 #2 PT 3/4 (duration: 00m 41s)
  • 20:41 addshore@tin: Synchronized docroot/noc/conf/interwikisorting.dblist: T150183 Enable InterwikiSorting on group0 #1 #2 PT 2/4 NOOP (duration: 00m 42s)
  • 20:39 addshore@tin: Synchronized dblists/interwikisorting.dblist: T150183 Enable InterwikiSorting on group0 #1 #2 PT 1/4 (duration: 00m 51s)
  • 18:37 dcausse@tin: Synchronized php-1.29.0-wmf.15/extensions/CirrusSearch/: Make incoming link counting compatible with 5.x (duration: 00m 53s)
  • 18:06 jynus: chowning /var/lib/git/operations/puppet to gitpuppet on labscontrol1002
  • 18:03 jynus: chowning /var/lib/git/operations/puppet to gitpuppet on labscontrol1001
  • 17:46 reedy@tin: Synchronized wmf-config/throttle.php: Throttle rule for event currently ongoing (duration: 00m 43s)
  • 17:29 gehel: re-configuring cluster settings after elasticsearch upgrade - T158680
  • 17:29 dcausse: done re-enabling writes to elastic@codfw (elastic5 upgrade)
  • 17:28 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [es5 upgrade] step 2: repool codfw and send wmf16 to codfw 3/3 (duration: 00m 41s)
  • 17:26 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: [es5 upgrade] step 2: repool codfw and send wmf16 to codfw 2/3 (duration: 00m 44s)
  • 17:24 dcausse@tin: Synchronized wmf-config/CommonSettings.php: [es5 upgrade] step 2: repool codfw and send wmf16 to codfw 1/3 (duration: 00m 46s)
  • 17:23 gehel@tin: Finished deploy [wdqs/wdqs@202a106]: (no justification provided) (duration: 01m 46s)
  • 17:22 gehel@tin: Started deploy [wdqs/wdqs@202a106]: (no justification provided)
  • 17:19 jynus: stopping mariadb at db1054 and preparing for backup and reimage
  • 17:18 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1003.eqiad.wmnet
  • 16:57 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1054 for upgrade (duration: 00m 53s)
  • 16:55 godog: outdated swift rings pushed in eqiad-prod, pushed again updated rings from git repo - T158337
  • 16:35 godog: add ms-be2028/29/30 to swift codfw-prod, initial add - T158337
  • 16:25 gehel: restarting elasticsearch on all codfw cluster after upgrade - T158680
  • 16:23 gehel: restarting elasticsearch on elastic2001 after upgrade - T158680
  • 16:06 gehel: upgrading plugins to 5.1.2 on elasticsearch codfw - T158680
  • 15:41 gehel: shutting down elasticsearch on codfw for v5.1.2 upgrade - T158680
  • 15:21 dcausse: elastic@codfw stopped receiving writes
  • 15:21 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [es5 upgrade] step 1: depool codfw for writes 2/2 (duration: 00m 44s)
  • 15:19 dcausse@tin: Synchronized wmf-config/CommonSettings.php: [es5 upgrade] step 1: depool codfw for writes 1/2 (duration: 00m 45s)
  • 14:33 marostegui: Deploy alter table enwiki.revision db1083 - T132416
  • 14:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083 - T132416 (duration: 00m 41s)
  • 14:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T132416 (duration: 00m 41s)
  • 13:01 hashar@tin: Synchronized wmf-config/CommonSettings.php: +$wgAvailableRights[] = autoreviewrestore; (duration: 00m 41s)
  • 12:08 ema: restart pybal on lvs1003 to add swift-https_443
  • 12:05 moritzm: install libevent security updates
  • 11:56 elukey: reimage analytics1042 (Hadoop worker node) to Debian Jessie
  • 11:15 godog: bounce pybal on lvs1006 to try picking up swift https changes
  • 11:06 zeljkof: purge bswiki logo - T158815
  • 10:44 Dereckson: Update site statistics on gu.wikipedia (T160328)
  • 09:23 gehel: downgrading elasticsearch to v5.1.2 on relforge, a full reindex will be needed - T156150
  • 08:40 marostegui: Compress dewiki - db1070 - T153743
  • 08:31 marostegui: Stop replication on labsdb1009,10 and 11 - T153743
  • 08:30 marostegui: Stop MySQL on db1095 (sanitarium2) to take a backup - T153743
  • 08:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1070 - T153743 (duration: 00m 41s)
  • 08:08 marostegui: Deploy alter table s6 - db1050 (master) - T159414
  • 08:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1030 - T159414 (duration: 00m 41s)
  • 07:46 moritzm: upgrading apache on remaining mediawiki servers in eqiad
  • 07:24 marostegui: Deploy alter table enwiki.revision db1089 - T132416
  • 07:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T132416 (duration: 00m 41s)
  • 07:13 marostegui: Deploy alter table s6 revision table on db1030 - T159414
  • 07:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1030 - T159414 (duration: 00m 52s)
  • 06:52 elukey: powercycle mw2256, stuck in boot (looked in the console)
  • 02:28 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Mar 13 02:28:18 UTC 2017 (duration 5m 21s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 08m 57s)

2017-03-12

  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Mar 12 02:25:37 UTC 2017 (duration 5m 32s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 07m 25s)

2017-03-11

  • 08:39 jynus: powercycle es2015 - unresponsive
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 07m 41s)
  • 00:19 smalyshev@tin: Finished deploy [wdqs/wdqs@UNKNOWN]: Deploy new updater on 1003 for potential connection drop fix (duration: 02m 15s)
  • 00:16 smalyshev@tin: Started deploy [wdqs/wdqs@UNKNOWN]: Deploy new updater on 1003 for potential connection drop fix
  • 00:08 smalyshev@tin: Finished deploy [wdqs/wdqs@UNKNOWN]: Deploy new updater on 1003 for potential connection drop fix (duration: 00m 16s)
  • 00:08 smalyshev@tin: Started deploy [wdqs/wdqs@UNKNOWN]: Deploy new updater on 1003 for potential connection drop fix
  • 00:07 SMalyshev: going to deploy updater patch on wdqs1003. The host is in maintenance, not a production deployment.

2017-03-10

  • 21:42 hashar: restarted Zuul
  • 20:06 gehel: restart kartotherian / tilerator(ui) on maps-test*
  • 20:06 gehel@tin: Finished deploy [kartotherian/deploy@76adf21]: (no justification provided) (duration: 00m 54s)
  • 20:05 gehel@tin: Started deploy [kartotherian/deploy@76adf21]: (no justification provided)
  • 20:03 gehel@tin: Finished deploy [tilerator/deploy@b501046]: (no justification provided) (duration: 00m 16s)
  • 20:03 gehel@tin: Started deploy [tilerator/deploy@b501046]: (no justification provided)
  • 19:57 gehel: restarting tilerator(ui) on maps-test2004
  • 19:57 gehel@tin: Finished deploy [tilerator/deploy@b501046]: (no justification provided) (duration: 00m 04s)
  • 19:57 gehel@tin: Started deploy [tilerator/deploy@b501046]: (no justification provided)
  • 19:47 gehel: restarting tilerator(ui) on maps-test2004
  • 19:47 gehel@tin: Finished deploy [tilerator/deploy@b501046]: (no justification provided) (duration: 00m 03s)
  • 19:47 gehel@tin: Started deploy [tilerator/deploy@b501046]: (no justification provided)
  • 19:45 gehel: failed tilerator deploy on maps-test2004
  • 19:45 gehel@tin: Finished deploy [tilerator/deploy@b501046]: (no justification provided) (duration: 01m 20s)
  • 19:44 gehel@tin: Started deploy [tilerator/deploy@b501046]: (no justification provided)
  • 19:36 ejegg: ran wmf_civicrm db updates through 7500 - Add benevity as a financial type for benevity imports.
  • 19:34 gehel: restart kartotherian on maps-test2004
  • 19:28 gehel@tin: Finished deploy [kartotherian/deploy@76adf21]: (no justification provided) (duration: 00m 23s)
  • 19:27 gehel@tin: Started deploy [kartotherian/deploy@76adf21]: (no justification provided)
  • 19:19 gehel: upgrading kartotherian on maps-test2004 - T150354
  • 19:07 MaxSem: Unmasked kartotherian on maps-test2004
  • 18:28 smalyshev@tin: Finished deploy [wdqs/wdqs@1f2973c]: Deploy new updater on 1003 for potential connection drop fix (duration: 00m 03s)
  • 18:28 smalyshev@tin: Started deploy [wdqs/wdqs@1f2973c]: Deploy new updater on 1003 for potential connection drop fix
  • 17:28 ottomata: installed librdkafka 0.9.4 via dpkg -i on cp1052 (cache text) and restarted varnishkafka in preparation for fleet upgrade next week
  • 17:24 ottomata: installed librdkafka 0.9.4 via dpkg -i on cp1058 (cache misc) and restarted varnishkafka in preparation for fleet upgrade next week
  • 16:44 papaul: oresrdb2002 - signing puppet certs, salt-key, initial run
  • 16:25 elukey: reboot mw22(5[1-9]|60) to enable mw-cgroup mountpoint
  • 15:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1030 - T159414 (duration: 02m 42s)
  • 15:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1022 - T159414 (duration: 00m 45s)
  • 15:03 marostegui: Stop slave db2033 for maintenance - T159707
  • 14:05 hashar: contint1001 and contint2001: Migrating git-daemon to systemd. Will stop zuul merger briefly
  • 13:58 elukey: added 3 new MW api-appservers (mw2251-53) and 7 new appservers (mw2254-60) to codfw
  • 13:35 hashar: Restarting Jenkins. Deadlocks in ssh connections. T160168
  • 07:28 moritzm: upgrading libarchive on trusty systems (jessie already fixed)
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Added weight 1 for db1061 - T159414 (duration: 00m 40s)
  • 07:13 marostegui: Deploy alter table s6 revision table on db1022 - T159414
  • 07:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1022 - T159414 (duration: 00m 41s)
  • 04:29 mutante: codfw mw jobrunner: they start but then fail again shortly after: mw2248 jobrunner[67314]: [Fri Mar 10 04:23:07 2017] [hphp] [67314:7f6a34b746c0:0:000024] [] LightProcess::closeShadow failed due to exception: Failed in afdt::sendRaw: Broken pipe
  • 04:12 mutante: more codfw appservers ... - systemctl start jobchron, systemctl start jobrunner (both were failed but are now active (running))
  • 04:09 mutante: mw2155 - systemctl start jobchron, systemctl start jobrunner (both were failed but are now active (running))
  • 04:02 mutante: mw2249 systemctl start jobrunner - now Active: active (running)
  • 03:56 mutante: codfw appserver jobrunner service fail related to https://gerrit.wikimedia.org/r/#/c/259660/ ?
  • 03:54 mutante: codfw appservers showing "systemd degraded" alerts are failed jobrunner service unit. after puppet-agent "Mediawiki::Jobrunner/Package[jobrunner]/ensure) ensure changed..." ..then jobrunner.service: main process exited, code=exited, status=143/n/a
  • 02:51 AaronSchulz: Restarted job services for 5101424 (statsd batching) after monitoring mw1161
  • 02:39 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Mar 10 02:39:25 UTC 2017 (duration 5m 28s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 12m 17s)
  • 00:54 ppchelko@tin: Finished deploy [trending-edits/deploy@1673068]: Replayed events are purged based on current timestamp T160136 (duration: 06m 24s)
  • 00:48 ppchelko@tin: Started deploy [trending-edits/deploy@1673068]: Replayed events are purged based on current timestamp T160136
  • 00:39 ppchelko@tin: Finished deploy [trending-edits/deploy@a5716b9]: Replayed events are purged based on current timestamp T160136 (duration: 02m 23s)
  • 00:38 dereckson@tin: Synchronized php-1.29.0-wmf.15/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.ArticleTargetLoader.js: ArticleTargetLoader: wikitext switch shouldn't require FullRestbaseURL (T158692) (duration: 00m 41s)
  • 00:37 ppchelko@tin: Started deploy [trending-edits/deploy@a5716b9]: Replayed events are purged based on current timestamp T160136
  • 00:31 ppchelko@tin: Finished deploy [trending-edits/deploy@a5716b9]: Replayed events are purged based on current timestamp T160136 (duration: 07m 17s)
  • 00:30 eileen: update CiviCRM from d20ed40 to 31f19d6
  • 00:24 ppchelko@tin: Started deploy [trending-edits/deploy@a5716b9]: Replayed events are purged based on current timestamp T160136
  • 00:22 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Move NavigationTiming config to EventLogging section + Remove setting of unused $wgPercentHHVM (Gerrit:342147 and Gerrit:342149, no-op) (duration: 00m 40s)
  • 00:19 maxsem@tin: Finished deploy [tilerator/deploy@160f314]: https://gerrit.wikimedia.org/r/#/c/342153/ - revert submodule updates due to broken manik->libc dependency (duration: 00m 16s)
  • 00:19 maxsem@tin: Started deploy [tilerator/deploy@160f314]: https://gerrit.wikimedia.org/r/#/c/342153/ - revert submodule updates due to broken manik->libc dependency

2017-03-09

  • 22:50 mutante: prometheus1003/1004 - systemctl stop prometheus (as opposed to /etc/init.d/prometheus), as they are low on disk but are not in production yet
  • 22:49 maxsem@tin: Finished deploy [tilerator/deploy@fb06c99]: https://gerrit.wikimedia.org/r/#/c/342140/ (duration: 00m 05s)
  • 22:48 maxsem@tin: Started deploy [tilerator/deploy@fb06c99]: https://gerrit.wikimedia.org/r/#/c/342140/
  • 22:46 mutante: prometheus1003 - stopping service: [....] Stopping monitoring system and time series database: prometheus Invalid --pidfile argument: '/var/run/prometheus/prometheus.pid' (Parent directory does not exist)
  • 22:46 maxsem@tin: Finished deploy [tilerator/deploy@fb06c99]: https://gerrit.wikimedia.org/r/#/c/342140/ (duration: 00m 21s)
  • 22:45 maxsem@tin: Started deploy [tilerator/deploy@fb06c99]: https://gerrit.wikimedia.org/r/#/c/342140/
  • 22:18 maxsem@tin: Finished deploy [tilerator/deploy@367df80]: no-op (duration: 00m 22s)
  • 22:18 maxsem@tin: Started deploy [tilerator/deploy@367df80]: no-op
  • 22:00 mobrovac@tin: Finished deploy [trending-edits/deploy@57a654e]: Bump max_pages for T156411 (duration: 06m 07s)
  • 21:54 mobrovac@tin: Started deploy [trending-edits/deploy@57a654e]: Bump max_pages for T156411
  • 21:37 mutante: fluorine - puppet node clean, puppet node deactivate, salt-key -d, remove from Icinga.. (T159996)
  • 21:35 mutante: fluorine - shutdown -h now (decom) T159996
  • 20:09 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.15
  • 20:02 mutante: cobalt: remove crontab entry of user gerrit2 that created reviewer counts, gzip /var/www/reviewer-counts.json and moved to /root/ for backup (re: gerrit:341592) T54329
  • 19:53 reedy@tin: Synchronized php-1.29.0-wmf.15/extensions/ConfirmEdit: Fixup maintenance script (duration: 00m 43s)
  • 19:22 legoktm: foreachwiki extensions/WikimediaMaintenance/createExtensionTables.php linter
  • 18:21 moritzm: rebooting cp1008 for upgrade to Linux 4.9
  • 17:50 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1003.eqiad.wmnet
  • 17:45 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1004.eqiad.wmnet
  • 17:11 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1003.eqiad.wmnet
  • 17:11 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1002.eqiad.wmnet
  • 17:11 bblack: reboot lvs1001 (post-incident cleanup reboot)
  • 17:02 bblack: reboot lvs1004 (post-incident cleanup reboot)
  • 16:58 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1001.eqiad.wmnet
  • 16:10 elukey: remove Piwik/bohrium health check from Varnish cache misc (https://gerrit.wikimedia.org/r/#/c/342007/)
  • 15:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1085 - T159414 (duration: 00m 41s)
  • 15:07 reedy@tin: Synchronized php-1.29.0-wmf.15/extensions/ConfirmEdit: Fixup maintenance script (duration: 00m 43s)
  • 15:02 moritzm: installing nettle security updates
  • 14:42 zeljkof: EU SWAT finished
  • 14:39 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add HD logos for several projects (T150618) (duration: 00m 41s)
  • 14:38 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Add HD logos for several projects (T150618) (duration: 00m 42s)
  • 14:35 moritzm: removed cn=svn group from LDAP directory (Bug: T129788)
  • 14:25 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: [throttle] Add new throttle rule + remove expired rules (T159957) (duration: 00m 45s)
  • 14:15 addshore@tin: Synchronized wmf-config/CommonSettings.php: Don't show rdf2latex table hint with ElectronPdfService enabled T157432 (duration: 00m 49s)
  • 13:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 with normal weight after warmup (duration: 00m 40s)
  • 13:52 moritzm: removed cn=svnadm group from LDAP directory (Bug: T129788)
  • 13:46 moritzm: removed cn=trebuchet group from LDAP directory (Bug: T129788)
  • 13:43 gehel: invalidating Tasmania zoom level 10 tiles in varnish - T159631
  • 13:21 marostegui: Deploy alter table s6 revision table on db1085 - T159414
  • 13:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1085 - T159414 (duration: 00m 41s)
  • 13:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 - T159414 (duration: 00m 43s)
  • 12:34 moritzm: rebooting multatuli to Linux 4.9
  • 12:23 jynus: purging old rc rows from non-production database replicas
  • 11:24 marostegui: Stop replication db2033 - T159707
  • 10:49 marostegui: Deploy alter table s6 revision table on db1088 - T159414
  • 10:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 - T159414 (duration: 00m 41s)
  • 10:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 - T159414 (duration: 00m 42s)
  • 10:25 ema: service systemd-sysctl restart on lvs hosts
  • 08:21 marostegui: Deploy alter table s6 revision table on db1093 - T159414
  • 08:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 - T159414 (duration: 00m 49s)
  • 08:10 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 after maintenance with low weight (duration: 00m 43s)
  • 05:24 bblack: poweroff lvs1001 from idrac
  • 03:15 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Mar 9 03:15:39 UTC 2017 (duration 5m 53s)
  • 03:09 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 14m 35s)
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 14m 34s)
  • 01:08 twentyafterfour: phabricator update complete.
  • 01:06 twentyafterfour: updating phabricator to tag release/2017-03-08/1
  • 00:54 mutante: iridium - tested stop/start of phd service with upstart, unlink /etc/init.d/phd which was the formerly used symlink to a phab php script
  • 00:41 mutante: iridium - re-enable puppet, convert to base::service_unit, phd restarting
  • 00:36 mutante: iridium - temp. disable puppet | phab1001 - converting service to base::service_unit (T137928)
  • 00:18 catrope@tin: Synchronized php-1.29.0-wmf.15/extensions/Echo/modules/styles/mw.echo.ui.NotificationBadgeWidget.less: Fix RTL popup alignment (T159999) (duration: 00m 42s)

2017-03-08

  • 22:10 legoktm: resuming running refreshLinks.php on small wikis
  • 21:43 arlolra@tin: Started restart [parsoid/deploy@0c22f72]: (no justification provided)
  • 21:41 legoktm@tin: Synchronized wmf-config/CommonSettings.php: Enable Linter on testwiki - T148609 (2/2) (duration: 00m 41s)
  • 21:39 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Linter on testwiki - T148609 (1/2) (duration: 00m 44s)
  • 21:38 legoktm: mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=testwiki linter
  • 21:31 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.15
  • 21:31 twentyafterfour@tin: Synchronized php-1.29.0-wmf.15/extensions/CodeReview/backend/CodeCommentLinker.php: deploy https://gerrit.wikimedia.org/r/#/c/341857/ (duration: 00m 46s)
  • 21:27 arlolra: Updated Parsoid to dec47257 (T59603)
  • 21:19 arlolra@tin: Finished deploy [parsoid/deploy@0c22f72]: Updating Parsoid to dec47257 (duration: 08m 19s)
  • 21:11 arlolra@tin: Started deploy [parsoid/deploy@0c22f72]: Updating Parsoid to dec47257
  • 19:54 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Reenable Collection on srn.wikipedia (T158467) (duration: 00m 46s)
  • 19:43 madhuvishy: Upgraded nslcd and libnss-ldapd in labstore100[1,2,4,5]
  • 19:36 reedy@tin: Synchronized php-1.29.0-wmf.14/extensions/ConfirmEdit: Maintenance script updates (duration: 00m 50s)
  • 17:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: db1070 ROW based replication comments - T153743 (duration: 00m 41s)
  • 17:28 Pchelolo: update RESTBase to 20e2c44c
  • 17:25 Pchelolo: update RESTBase to 20e2c44c: canary on restbase1007
  • 17:23 Pchelolo: update RESTBase to 20e2c44c: staging
  • 17:21 moritzm: installing Ubuntu imagemagick security updates (jessie already fixed)
  • 16:13 marostegui: Deploy alter table s6 revision table on dbstore1002 - T159414
  • 16:06 mobrovac@tin: Finished deploy [eventstreams/deploy@78e248c]: Deploy for T159486 (duration: 01m 48s)
  • 16:04 mobrovac@tin: Started deploy [eventstreams/deploy@78e248c]: Deploy for T159486
  • 15:37 moritzm: uploaded firmware-nonfree 20161130 for jessie-wikimedia/experimental to apt.wikimedia.org
  • 15:33 reedy@tin: Synchronized wmf-config/CommonSettings.php: Remove EducationProgram config back compat (duration: 00m 41s)
  • 15:32 reedy@tin: Synchronized wmf-config/flaggedrevs.php: Whitespace (duration: 00m 41s)
  • 15:29 moritzm: uploaded linux 4.9.13 for jessie-wikimedia/experimental to apt.wikimedia.org
  • 15:19 elukey: rebooting mw22(5[4-9]|60) as part of sanity check for T155180
  • 15:08 elukey: rebooting mw225[123] as part of sanity check for T155180
  • 14:42 zeljkof: EU SWAT finished
  • 14:42 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add HD logos for several projects (T150618) (duration: 00m 41s)
  • 14:41 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Add HD logos for several projects (T150618) (duration: 00m 44s)
  • 14:27 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update logo for bswiki (Bosnian Wikipedia) (T158815) (duration: 00m 41s)
  • 14:26 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Update logo for bswiki (Bosnian Wikipedia) (T158815) (duration: 00m 41s)
  • 14:16 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule (T159803) (duration: 00m 41s)
  • 13:42 marostegui: Deploy alter table s6 revision table on db1023 - T159414
  • 13:11 godog: make mwlog1001 the primary logging host, deprecate fluorine
  • 12:35 godog: add mwlog[12]001 to analytics-in4 term rsync-http-https - T123728
  • 11:35 moritzm: installing texlive-base security updates
  • 10:34 jynus: restarting labsdb1004's mariadb T159572
  • 10:31 marostegui: Shutdown postgresql on labsdb1007 for maintenance - T157359
  • 10:12 elukey: reimage analytics1041 to Debian Jessie
  • 09:51 gehel: re-enabled waterline import on maps[12]001 - T159631
  • 09:39 marostegui: Stop replication on db2033 - T159707
  • 09:07 ariel@tin: Finished deploy [dumps/dumps@e30fbd0]: run monitor.py relative to cwd, to pick up default config files (duration: 00m 02s)
  • 09:07 ariel@tin: Started deploy [dumps/dumps@e30fbd0]: run monitor.py relative to cwd, to pick up default config files
  • 09:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1070 - T153743 (duration: 00m 41s)
  • 08:36 moritzm: upgrading apache on mw1161-mw1208
  • 08:36 marostegui: Restart mysql on db1070 to change binlog to ROW - T153743
  • 08:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1070 - T153743 (duration: 00m 41s)
  • 07:27 marostegui: Start pt-table-checksum on plwiki (s2) - T154485
  • 07:19 marostegui: Deploy alter table s6 revision table on db1061 - T159414
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1060 original weight - T158193 (duration: 00m 47s)
  • 03:40 krinkle@tin: Synchronized docroot/noc/: Fix conftool link (I2f34be0a5), Remove IE6 css (Iae8a356e2), add db-codfw.php (I9f02dee3c) (duration: 00m 42s)
  • 03:17 bblack: authdns back to normal (puppet enabled, do normal things!)
  • 03:09 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Mar 8 03:09:21 UTC 2017 (duration 5m 49s)
  • 03:03 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 15m 08s)
  • 02:46 bblack: disabling puppet on production authdns caches (testing dns lint related bits)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 07m 53s)
  • 01:33 demon@tin: Synchronized scap/plugins/clean.py: no-op (duration: 00m 41s)
  • 00:35 mobrovac@tin: Finished deploy [electron-render/deploy@5ec5614]: (no justification provided) (duration: 00m 59s)
  • 00:34 mobrovac@tin: Started deploy [electron-render/deploy@5ec5614]: (no justification provided)
  • 00:33 mobrovac@tin: Finished deploy [electron-render/deploy@5ec5614]: (no justification provided) (duration: 04m 08s)
  • 00:29 mobrovac@tin: Started deploy [electron-render/deploy@5ec5614]: (no justification provided)
  • 00:27 mobrovac@tin: Finished deploy [electron-render/deploy@5ec5614]: Deploy for T159486 (duration: 04m 46s)
  • 00:27 mobrovac@tin: Finished deploy [mobileapps/deploy@d6202e4]: Deploy for T159486 (duration: 03m 52s)
  • 00:26 catrope@tin: Synchronized php-1.29.0-wmf.15/extensions/Echo/modules/ui/: Fix regression in Echo popup (duration: 00m 42s)
  • 00:23 mobrovac@tin: Started deploy [mobileapps/deploy@d6202e4]: Deploy for T159486
  • 00:23 mobrovac@tin: Started deploy [electron-render/deploy@5ec5614]: Deploy for T159486
  • 00:22 mobrovac@tin: Finished deploy [mathoid/deploy@83f80ee]: Deploy for T159486 (duration: 04m 53s)
  • 00:22 mobrovac@tin: Finished deploy [graphoid/deploy@485ca11]: Deploy for T159486 (duration: 04m 45s)
  • 00:20 mobrovac@tin: Finished deploy [electron-render/deploy@51cff8a]: Deploy for T159486 (duration: 03m 29s)
  • 00:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Modify add/remove groups for flood group on wikitech (duration: 00m 42s)
  • 00:18 mobrovac@tin: Started deploy [mathoid/deploy@83f80ee]: Deploy for T159486
  • 00:17 mobrovac@tin: Finished deploy [cxserver/deploy@7e22281]: Deploy for T159486 (duration: 02m 24s)
  • 00:17 mobrovac@tin: Started deploy [graphoid/deploy@485ca11]: Deploy for T159486
  • 00:17 mobrovac@tin: Started deploy [electron-render/deploy@51cff8a]: Deploy for T159486
  • 00:16 mobrovac@tin: Finished deploy [changeprop/deploy@99280e3]: Deploy for T159486 (duration: 01m 09s)
  • 00:16 mobrovac@tin: Finished deploy [trending-edits/deploy@88e2f74]: Deploy changes for T156666 T156680 T159486 T156411 (duration: 06m 58s)
  • 00:15 mobrovac@tin: Started deploy [cxserver/deploy@7e22281]: Deploy for T159486
  • 00:15 mobrovac@tin: Started deploy [changeprop/deploy@99280e3]: Deploy for T159486
  • 00:13 Reedy: Clear 2FA for "User:Steven Walling"; identity confirmed via facebook
  • 00:09 mobrovac@tin: Started deploy [trending-edits/deploy@88e2f74]: Deploy changes for T156666 T156680 T159486 T156411
  • 00:08 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Disable wgCiteResponsiveReferences by default for back-compat (T33597) (duration: 00m 41s)

2017-03-07

  • 23:38 mutante: gerrit restarting for config changes 341701, 341587
  • 22:45 papaul: ms-be2028-ms-be2039 - signing puppet certs, salt-key, initial run
  • 22:11 mobrovac@tin: Finished deploy [citoid/deploy@5a7e053]: Deploy for T158675 T103478 T159486 (duration: 02m 36s)
  • 22:08 mobrovac@tin: Started deploy [citoid/deploy@5a7e053]: Deploy for T158675 T103478 T159486
  • 22:02 mobrovac@tin: Finished deploy [zotero/translators@35da336]: Update translators for T158675 (duration: 00m 06s)
  • 22:01 mobrovac@tin: Started deploy [zotero/translators@35da336]: Update translators for T158675
  • 21:59 mobrovac@tin: Finished deploy [trending-edits/deploy@f855460]: (no justification provided) (duration: 04m 48s)
  • 21:54 mobrovac@tin: Started deploy [trending-edits/deploy@f855460]: (no justification provided)
  • 21:40 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.15 refs T158996
  • 21:30 twentyafterfour@tin: Finished scap: bump test wikis to 1.29.0-wmf.5 refs T158996 (duration: 53m 17s)
  • 21:23 mutante: mw1177 - service hhvm restart
  • 20:37 twentyafterfour@tin: Started scap: bump test wikis to 1.29.0-wmf.5 refs T158996
  • 20:29 mutante: iridium - re-enabling puppet, ssh-phab service converted to base::service_unit, upstart template moved but unchanged, service restarted just fine.
  • 20:27 mutante: phab2001 - phab-ssh service converted to base::service_unit and with working systemd unit file. 'systemctl ssh-phab status' is active (running) (T158434)
  • 20:26 ottomata: installing librdkafka 0.9.4 on cp1045 (cache misc host) via .deb package to try it with varnishkafka in prod (ping bblack, ema, just in case)
  • 20:23 mutante: iridium - temp disabled puppet - converting phab-ssh service to base::service_unit, systemd on phab2001, upstart on iridium
  • 19:23 twentyafterfour: branching 1.29.0-wmf15 refs T158996
  • 19:20 bblack: rebooting baham (ns1) AGAIN - low cpu frequencies issues like T147905 - checking bios/idrac stuff
  • 19:08 bblack: rebooting baham (ns1) - low cpu frequencies issues like T147905
  • 18:52 volans: rmmod acpi_pad on baham, was using 100% CPU T137647
  • 18:37 mobrovac: restbase deploy start of cd53670b
  • 16:58 akosiaris: temporarily re-increase the client-output-buffer-limit for rdb1007, phab task filing to follow
  • 16:40 akosiaris: decrease client-output-buffer-limit soft-limit back to normal values
  • 16:22 filippo@puppetmaster1001: conftool action : set/weight=40; selector: name=ms-fe1008.eqiad.wmnet
  • 16:22 filippo@puppetmaster1001: conftool action : set/weight=40; selector: name=ms-fe1007.eqiad.wmnet
  • 16:22 filippo@puppetmaster1001: conftool action : set/weight=40; selector: name=ms-fe1006.eqiad.wmnet
  • 16:22 filippo@puppetmaster1001: conftool action : set/weight=40; selector: name=ms-fe1005.eqiad.wmnet
  • 15:28 joal@tin: Finished deploy [analytics/aqs/deploy@e0da1bd]: (no justification provided) (duration: 06m 08s)
  • 15:22 joal@tin: Started deploy [analytics/aqs/deploy@e0da1bd]: (no justification provided)
  • 15:15 akosiaris: increase client-output-buffer-limit soft-limit to 500MB temporarily on rdb1007
  • 14:46 jynus: restart labsdb1004 for config and data check
  • 14:32 moritzm: uploaded HHVM 3.18 builds of hhvm-tidy, hhvm-luasandbox and hhvm-wikidiff2 to the experimental section of apt.wikimedia.org (Bug: T158176)
  • 14:03 reedy@tin: Synchronized docroot/: Fixup filebackend symlinks (duration: 00m 41s)
  • 13:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1060 weight - T158193 (duration: 00m 58s)
  • 12:53 marostegui: Just for the sake of having it logged: gtid_domain_id has been deployed in all the database servers - T149418
  • 12:53 elukey: analytics1040 back in service - testing the new Debian configuration
  • 12:39 marostegui: Deploy ALTER table on db2028 (codfw s6 master) on the revision table - T159414
  • 12:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 with less weight - T158193 (duration: 00m 40s)
  • 12:19 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2053 - T159414 (duration: 00m 43s)
  • 12:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2034 - T132416 (duration: 00m 50s)
  • 11:41 gehel: cleaning empty log file on elastic2001 (cronspam)
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=trendingedits'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=pdfrender'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=eventstreams'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=ores'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=cxserver'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=apertium'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=citoid'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=graphoid'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mobileapps'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=trendingedits'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=pdfrender'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=eventstreams'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=ores'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=cxserver'])
  • 11:32 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=apertium'])
  • 11:32 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=citoid'])
  • 11:32 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=graphoid'])
  • 11:32 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 11:32 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mobileapps'])
  • 11:27 elukey: end of hacking on install1002 (puppet re-enabled)
  • 09:23 ema: cache_text, cache_upload: upgrading to varnish 4.1.5 T159424
  • 09:10 elukey: temporary live hacking analytics-flex.cfg partman config on install1002
  • 08:25 moritzm: installing systemd bugfix updates from jessie point release
  • 07:39 marostegui: Stop MySQL db1067 to clone db1060 from it - T158193
  • 07:16 marostegui: Deploy ALTER table on db2053 (s6) for the revision table - T159414
  • 07:16 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2053 - T159414 (duration: 00m 41s)
  • 05:22 Krinkle: foreachwikiindblist 'all - closed - private' deleteEqualMessages.php (T45917) - purge upstreamed translations from remaining wikis
  • 03:28 Krinkle: foreachwikiindblist closed deleteEqualMessages.php (T45917) - purge upstreamed translations from closed wikis
  • 02:28 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Mar 7 02:28:59 UTC 2017 (duration 5m 32s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 08m 19s)
  • 00:49 RainbowSprinkles: gerrit: coming back online now
  • 00:43 RainbowSprinkles: gerrit: taking offline for a minute or two for case-insensitive login conversion
  • 00:39 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: In CSP policy for foundationwiki, wikidata.org -> www.wikidata.org (duration: 00m 40s)
  • 00:19 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Add other WMF domains to foundationwiki CSP policy for Special:HideBanners (duration: 00m 40s)
  • 00:00 mobrovac: restbase restarting in labs for T158628

2017-03-06

  • 22:14 awight: update payments-wiki config to a591e4c
  • 21:51 mutante: bast3001 - powerdown (T159480), decom in progress
  • 21:48 mutante: bast3001 - schedule downtime for host and all services in Icinga, remove from puppet, salt .. (T159480)
  • 21:36 hashar@tin: Synchronized static/images/project-logos: [fixup] Fix up wrongly updated sr.wikibooks and bs.wiktionary logos - T159542 T159534 (duration: 00m 42s)
  • 21:02 matt_flaschen: populateContentModel.php --wiki=cawiki --ns=103 run for revision, archive, page . T159047 complete
  • 21:00 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Flow for Viquiprojecte Discussió on cawiki (duration: 00m 40s)
  • 20:46 ottomata: removing old cdh packages from thirdparty component in apt
  • 20:34 gehel: reimport waterlines data on maps1001.eqiad.wmnet - T159631
  • 20:34 matt_flaschen: For T159047
  • 20:34 matt_flaschen: Ran (time mwscript extensions/Flow/maintenance/convertNamespaceFromWikitext.php --wiki=cawiki 'Viquiprojecte_Discussió') 2>&1|tee --append ~/2017-03-02_cawiki_convertNamespacesFromWikitext_Viquiprojecte_Discussió.log
  • 20:26 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Disable Cognate on beta wiktionary sites T156241 Beta Only (duration: 00m 46s)
  • 20:11 thcipriani@tin: Synchronized wmf-config: SWAT: Enable Cognate for beta wiktionaries T156241 beta-only change (duration: 00m 43s)
  • 20:05 ejegg: updated payments-wiki from 66d8125 to f991f15
  • 20:05 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create "flood" flag for labswiki (duration: 00m 40s)
  • 19:53 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Add "flow-create-board" to CommonSettings.php for global groups (duration: 00m 40s)
  • 19:52 gehel: restarting wdqs-updater on wdqs* servers to activate GC logs - T159248
  • 19:43 thcipriani: mwscript migrateUserGroup.php --wiki=trwiki 'technician' 'interface-editor' on terbium for T159636
  • 19:43 thcipriani@tin: Synchronized wmf-config: SWAT: Rename "technician" to "interface-editor" on trwiki T144638 (duration: 00m 46s)
  • 19:41 gehel@tin: Finished deploy [wdqs/wdqs@1f2973c]: (no justification provided) (duration: 01m 25s)
  • 19:39 gehel@tin: Started deploy [wdqs/wdqs@1f2973c]: (no justification provided)
  • 19:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 (duration: 00m 40s)
  • 18:22 elukey: analytics1040 has been silenced and it is not ready to work, need to fix its partman recipe
  • 18:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2060 - T159414 (duration: 00m 44s)
  • 18:04 gehel@tin: Finished deploy [wdqs/wdqs@7b77735]: (no justification provided) (duration: 01m 46s)
  • 18:03 demon@tin: Synchronized wmf-config/interwiki.php: Sync interwiki list, T159680 (duration: 00m 41s)
  • 18:02 gehel@tin: Started deploy [wdqs/wdqs@7b77735]: (no justification provided)
  • 15:01 hashar: restarting Jenkins
  • 14:59 addshore: EU SWAT done
  • 14:50 chasemp: labnet1001 'service nova-fullstack restart'
  • 14:44 addshore@tin: Synchronized wmf-config/extension-list-labs: Remove InterwikiSorting and add Cognate to extension-list-labs T150183 T156241 BETA ONLY (duration: 00m 39s)
  • 14:42 addshore@tin: Synchronized wmf-config/extension-list: Add InterwikiSorting extension to prod extension-list T150183 NOOP (duration: 00m 38s)
  • 14:39 addshore@tin: Synchronized wmf-config/db-labs.php: SWAT: Create extension1 db cluster for beta T156241 BETA ONLY (duration: 00m 39s)
  • 14:37 addshore@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Add a CSP policy to foundationwiki to prevent privacy breach T159386 (duration: 00m 39s)
  • 14:23 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change account creation throttle for idwiki to default (6) (duration: 00m 39s)
  • 14:15 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Translation memories multi-DC support T132076 2/2 (NOOP) (duration: 00m 42s)
  • 14:13 addshore@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Enable Translation memories multi-DC support T132076 1/2 (duration: 00m 50s)
  • 14:05 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Bs.wiktionary namespace changes T159538 (duration: 00m 40s)
  • 14:00 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: srwikibooks & bswiktionary logos T159534 T159542 2/2 (duration: 00m 39s)
  • 13:58 addshore@tin: Synchronized static/images/project-logos/: SWAT: srwikibooks & bswiktionary logos T159534 T159542 1/2 (duration: 00m 39s)
  • 13:23 godog: reenable puppet on graphite2001
  • 13:07 marostegui: Deploy ALTER table on db2060 (s6) for the revision table - T159414
  • 13:07 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2060 - T159414 (duration: 00m 39s)
  • 13:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2046 - T159414 (duration: 00m 50s)
  • 12:45 moritzm: upgrading apache on mw1209-mw1235
  • 12:44 moritzm: upgrading apache on graphite*
  • 11:49 moritzm: installing imagemagick security updates
  • 11:36 moritzm: upgrading apache on krypton
  • 11:30 moritzm: upgrading apache on planet.wikimedia.org
  • 11:05 elukey: reimage the first Hadoop worker node (an1040) to Debian Jessie
  • 10:46 moritzm: upgrading apache on mediawiki servers in codfw
  • 10:36 gehel: upgrade to elasticsearch 5.2.2 on relforge cluster - T156150
  • 10:24 elukey: (shamefully) replaced /etc/init.d/hadoop-hdfs-datanode script with "exit 0" to prevent the HDFS datanode daemon from starting on analytics1028 (broken disk) and leave the rest running (puppet included) - T159632
  • 10:12 gehel: postgresql upgrade on maps* (postgresql-9.4 postgresql-9.4-postgis-2.3 postgresql-9.4-postgis-2.3-scripts postgresql-client-9.4 postgresql-client-common postgresql-common postgresql-contrib-9.4)
  • 10:06 ariel@tin: Finished deploy [dumps/dumps@8521be0]: fix: retries of broken runs could except on uninited var (duration: 00m 01s)
  • 10:06 ariel@tin: Started deploy [dumps/dumps@8521be0]: fix: retries of broken runs could except on uninited var
  • 09:46 gehel: postgresql upgrade on maps-test* (postgresql-9.4 postgresql-9.4-postgis-2.3 postgresql-9.4-postgis-2.3-scripts postgresql-client-9.4 postgresql-client-common postgresql-common postgresql-contrib-9.4)
  • 09:14 ariel@tin: Finished deploy [dumps/dumps@04794df]: move default config into a file and clean up (duration: 00m 02s)
  • 09:14 ariel@tin: Started deploy [dumps/dumps@04794df]: move default config into a file and clean up
  • 09:09 gehel: killing stuck tilerator notification on maps-test2001 - T145534
  • 07:22 marostegui: Resume pt-table-checksum on plwiki (s2) - T154485
  • 06:59 marostegui: Deploy ALTER table on db2046 (s6) for the revision table - T159414
  • 06:46 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2046 - T159414 (duration: 00m 51s)
  • 02:24 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Mar 6 02:24:24 UTC 2017 (duration 5m 19s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 07m 15s)
  • 01:29 cwd: updated staging civicrm database and triggers

2017-03-05

  • 22:23 Reedy: Generating some more captchas again T159581
  • 10:19 elukey: disabled puppet on analytics1028 to prevent puppet from starting the HDFS daemon (T159632)
  • 02:24 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Mar 5 02:24:02 UTC 2017 (duration 5m 20s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 07m 07s)

2017-03-04

  • 16:43 Reedy: Manually generating even more captchas (going up to 10k total) in screen as reedy on terbium T159581
  • 16:35 Reedy: Manually generating some more captchas T159581
  • 03:28 legoktm: pausing refreshLinks.php run due to increase in job queue
  • 03:05 mutante: planet2001 - and this time it just worked and i can't reproduce the issue. install finished. re-adding to puppet, signing certs...
  • 03:00 mutante: planet2001 - reinstalling once more (T159432)
  • 02:36 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Mar 4 02:36:25 UTC 2017 (duration 5m 19s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 12m 10s)
  • 00:52 mutante: conf2002 - ran "systemctl reset-failed" to fix Icinga alert about broken systemd state due to formerly existing but failed service etcdmirror-eqiad-wmnet. turns out you need this to remove missing units. found on http://serverfault.com/questions/606520/how-to-remove-missing-systemd-units (T131959)

2017-03-03

  • 23:23 RainbowSprinkles: phabricator: restarted apache 1 last time, removed hack
  • 23:19 mutante: icinga: for special external hosts benefactorevents and eventdonations, "submit passive check result for this host" -> "check_tcp -p 80" to avoid "crit hosts" that just don't respond to ICMP (http://www.htmlgraphic.com/nagios-check-host-without-ping/)
  • 23:12 RainbowSprinkles: phabricator: restarting apache real quick
  • 22:03 hashar: rebooting contint2001
  • 21:54 hashar: restarting Jenkins
  • 21:51 hashar: enabling puppet on contint1001 and puppet-run
  • 21:05 hashar: disabled puppet on contint1001
  • 20:26 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Beta Cluster only (duration: 00m 40s)
  • 19:35 ebernhardson: restart elasticsearch on relforge1002 to update remote reindex whitelist
  • 19:33 ebernhardson: restart elasticsearch on relforge1001 to update remote reindex whitelist
  • 19:11 legoktm: running refreshLinks.php across small wikis
  • 18:43 addshore@tin: Synchronized php-1.29.0-wmf.14/extensions/RevisionSlider/modules/ext.RevisionSlider.css: T159428 Quick fix for misplaced tooltips on RTL wikis (duration: 00m 42s)
  • 17:35 hashar: CI is mostly recovered. It could not spawn instances anymore. The queue is being processed and will take a while to complete. Check status on https://integration.wikimedia.org/zuul/ | T159543
  • 16:17 hashar: Stopped Jenkins from processing builds while instances are being recycled
  • 13:37 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2067 - T159414 (duration: 00m 50s)
  • 13:12 elukey: removed apache2 (rc state) and apache2-utils from analytics1027
  • 11:11 elukey@tin: Finished deploy [analytics/refinery@1440646]: (no justification provided) (duration: 00m 14s)
  • 11:11 elukey@tin: Started deploy [analytics/refinery@1440646]: (no justification provided)
  • 11:09 elukey@tin: Finished deploy [analytics/refinery@1440646]: (no justification provided) (duration: 00m 02s)
  • 11:09 elukey@tin: Started deploy [analytics/refinery@1440646]: (no justification provided)
  • 11:05 jynus: stopping mariadb and restarting db1051 for maintenance
  • 11:03 joal@tin: Finished deploy [analytics/refinery@1440646]: (no justification provided) (duration: 01m 23s)
  • 11:02 joal@tin: Started deploy [analytics/refinery@1440646]: (no justification provided)
  • 10:53 marostegui: Start pt-table-checksum on plwiki (s2) - T154485
  • 10:48 joal@tin: Finished deploy [analytics/refinery@1440646]: (no justification provided) (duration: 15m 33s)
  • 10:33 joal@tin: Started deploy [analytics/refinery@1440646]: (no justification provided)
  • 09:28 hashar: Restarting Jenkins (2)
  • 09:03 hashar: Restarting Jenkins
  • 08:27 moritzm: upgrading apache on bromine
  • 08:22 marostegui: Run pt-table-checksum on s2 (nowiki) - T154485
  • 08:20 marostegui: Deploy alter table s6 on db2067 - T159414
  • 08:13 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2067 - T159414 (duration: 00m 40s)
  • 07:30 moritzm: installing w3m security updates on trusty (jessie already fixed)
  • 04:39 mutante: planet2001 last log message was for T159432
  • 04:38 mutante: planet2001 - reinstall, boot into installer, scheduled downtime (T15943)
  • 04:16 legoktm: running refreshLinks.php on aawiki
  • 04:13 legoktm@tin: Synchronized php-1.29.0-wmf.14/maintenance/refreshLinks.php: Queue non-recursive updates - https://gerrit.wikimedia.org/r/340920 (duration: 00m 40s)
  • 03:27 awight: rerunning schema_update wmf_civicrm:7480
  • 03:26 awight: update civicrm from 133bde2 to d20ed40
  • 02:38 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Mar 3 02:38:40 UTC 2017 (duration 5m 19s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 13m 28s)
  • 01:45 awight: rerun schema change wmf_civicrm:7480
  • 01:34 Krinkle: terbium$ foreachwiki purgeModuleDeps.php (T158105)
  • 01:34 Krinkle: terbium$ foreachwikiindblist group0 purgeModuleDeps.php (T158105)
  • 01:33 Krinkle: terbium$ mwscript purgeModuleDeps.php --wiki test2wiki (T158105)
  • 01:28 awight: update civicrm from 0cab193 to 133bde2
  • 01:12 MaxSem: Restarted tilerator on codfw tileservers to catch latest code changes
  • 01:11 mattflaschen@tin: Synchronized php-1.29.0-wmf.14/autoload.php: resourceloader: Add purgeModuleDeps.php maintenance script (duration: 00m 39s)
  • 01:10 mattflaschen@tin: Synchronized php-1.29.0-wmf.14/maintenance/cleanupRemovedModules.php: resourceloader: Add purgeModuleDeps.php maintenance script (duration: 00m 40s)
  • 01:09 mattflaschen@tin: Synchronized php-1.29.0-wmf.14/maintenance/purgeModuleDeps.php: resourceloader: Add purgeModuleDeps.php maintenance script (duration: 00m 40s)
  • 01:02 ejegg: re-running fix for missing names
  • 00:42 ejegg: re-enabled CiviCRM de-dupe jobs
  • 00:41 ejegg: CiviCRM geocoding update finished, name fix failed on badly formatted comment
  • 00:35 mattflaschen@tin: Synchronized wmf-config/CirrusSearch-common.php: CirrusSearch: Enable super_detect_noop (duration: 00m 39s)
  • 00:16 mattflaschen@tin: Synchronized php-1.29.0-wmf.14/extensions/Flow/: Fix autoload data and script (duration: 00m 59s)

2017-03-02

  • 23:49 ejegg: running batched geocoding update and donor name fixes
  • 23:43 ejegg: updated civicrm from d012767 to 2d1de87
  • 23:42 ejegg: disabled dedupe jobs for civi update
  • 23:07 bblack: all authdns servers puppet re-enabled
  • 23:05 bblack@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=appservers-rw,name=eqiad
  • 23:05 bblack@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=appservers-rw
  • 22:55 Krinkle: Stopped statsd-mw-js-deprecate service on hafnium per https://gerrit.wikimedia.org/r/338929
  • 22:46 catrope@tin: Synchronized dblists/: T63729: disable Flow on metawiki (duration: 00m 58s)
  • 22:36 MaxSem: killed stuck updates on maps-test2001
  • 22:09 mutante: bast3002 - stop rsyncd, remove rsyncd config snippets (T156506)
  • 20:05 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.14
  • 19:58 demon@tin: Synchronized wmf-config/CommonSettings.php: Stacktraces are useful when cli scripts fail (duration: 00m 56s)
  • 19:58 bblack@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=appservers-rw,named=eqiad
  • 19:57 bblack@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=appservers-rw
  • 19:53 maxsem@tin: Finished deploy [tilerator/deploy@edb97c5]: Trying https://gerrit.wikimedia.org/r/#/c/340607/ once again (duration: 00m 04s)
  • 19:53 maxsem@tin: Started deploy [tilerator/deploy@edb97c5]: Trying https://gerrit.wikimedia.org/r/#/c/340607/ once again
  • 19:49 maxsem@tin: Finished deploy [tilerator/deploy@0fe5a1d]: Reverting to previous version
  • 19:49 maxsem@tin: Started deploy [tilerator/deploy@0fe5a1d]: Reverting to previous version
  • 19:46 maxsem@tin: Finished deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/ (duration: 00m 05s)
  • 19:46 maxsem@tin: Started deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/
  • 19:43 maxsem@tin: Finished deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/ (duration: 00m 03s)
  • 19:43 maxsem@tin: Started deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/
  • 19:42 maxsem@tin: Finished deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/ (duration: 00m 03s)
  • 19:42 maxsem@tin: Started deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/
  • 19:42 maxsem@tin: Finished deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/ (duration: 00m 23s)
  • 19:42 maxsem@tin: Started deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/
  • 19:16 addshore@tin: Synchronized dblists/all-labs.dblist: Add beta hewiktionary T158628 2/2 NOOP (duration: 00m 39s)
  • 19:15 addshore@tin: Synchronized wikiversions-labs.json: Add beta hewiktionary T158628 1/2 NOOP (duration: 00m 42s)
  • 19:06 awight: reenabling donation and recurring queue consumers
  • 19:05 addshore@tin: Synchronized wmf-config/throttle.php: Add new rules for WMUK T159454 T159461 (duration: 00m 43s)
  • 19:04 awight: update civicrm from fb91fa8 to d012767
  • 18:22 demon@tin: Synchronized php-1.29.0-wmf.14/includes/changes/EnhancedChangesList.php: T159466 (duration: 00m 40s)
  • 17:51 bblack: disabling puppet on authdns prod machines for hacky discovery testing
  • 17:44 bblack@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=appservers-ro,name=codfw
  • 17:44 bblack@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=appservers-rw,name=eqiad
  • 17:38 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: name=eqiad
  • 16:52 bblack: puppet re-enabled on authdns production boxes
  • 16:27 bblack: puppet disabled on authdns production boxes, for hacky testing of discovery-related commits
  • 16:00 jynus: restarting db1001 for kernel and mariadb upgrade
  • 15:49 moritzm: uploaded 6.8.9.9-5+deb8u7+wmf1 to apt.wikimedia.org (CMYK sharpen bugfix rebased on latest Debian update)
  • 15:42 moritzm: installing libfcgi-perl security updates
  • 14:47 phuedx@tin: Synchronized wmf-config/InitialiseSettings.php: T157700: Re-enable Page Previews instrumentation (duration: 00m 40s)
  • 14:37 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1008.eqiad.wmnet
  • 14:37 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1007.eqiad.wmnet
  • 14:37 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1006.eqiad.wmnet
  • 14:32 phuedx@tin: Synchronized portals: (no justification provided) (duration: 00m 41s)
  • 14:32 phuedx@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 40s)
  • 14:26 jynus: running alter table on db2040 T147747
  • 14:22 elukey@tin: Finished deploy [analytics/refinery@c3dd129]: (no justification provided) (duration: 02m 18s)
  • 14:20 elukey@tin: Started deploy [analytics/refinery@c3dd129]: (no justification provided)
  • 14:12 phuedx@tin: Synchronized wmf-config/InitialiseSettings.php: Remove Page Previews experiment config (duration: 00m 40s)
  • 14:10 phuedx@tin: Synchronized wmf-config/CommonSettings.php: Remove Page Previews experiment config (duration: 01m 06s)
  • 13:47 moritzm: removed obsolete kernels on ocg1002
  • 13:46 moritzm: removed obsolete kernels on eventlog1001
  • 13:03 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1005.eqiad.wmnet
  • 12:52 moritzm: installing shadow security updates on jessie hosts
  • 12:43 jynus: running ANALYZE table on revision at db1051 (depooled) T159319
  • 12:36 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 for maintenance (duration: 00m 42s)
  • 11:58 hashar: CI composer based builds are now ok. Only operations/mediawiki-config was impacted as far as I can tell.
  • 11:10 kartik@tin: Finished deploy [cxserver/deploy@5101090]: (no justification provided) (duration: 02m 24s)
  • 11:07 kartik@tin: Started deploy [cxserver/deploy@5101090]: (no justification provided)
  • 10:51 hashar: CI composer based builds are sometimes broken since composer got upgraded to 1.1.0. See https://phabricator.wikimedia.org/T159431
  • 10:23 moritzm: installing bind updates (we're using client-side libs/tools)
  • 10:04 moritzm: installing tiff security updates on trusty hosts (jessie already fixed)
  • 09:55 elukey: increased PHP memory_limit on bohrium for Piwik (T154558)
  • 09:26 moritzm: installing glibc updates from jessie point release
  • 09:24 hashar: Upgrading composer to 1.1.0 on CI instances
  • 09:08 moritzm: installing apache2 security updates on mw1262-mw1265
  • 08:51 jynus: running alter table on db2039 T147747
  • 08:45 jynus: running alter table on db2035 T147747
  • 08:27 marostegui: Start pt-table-checksum on itwiki (s2) - T154485
  • 07:20 marostegui: Deploy alter table enwiki.revision db2016 (codfw master) - T132416
  • 07:09 marostegui: Resume pt-table-checksum on idwiki (s2) - T154485
  • 03:04 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Mar 2 03:04:16 UTC 2017 (duration 5m 49s)
  • 02:58 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 14m 52s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 09m 30s)
  • 01:52 eileen1: civicrm changed...
  • 00:48 mutante: tin/mira - you will notice in the output of keyholder status you will not see the paths in the "comment" column anymore. this is due to newer versions of openssh-client and caused our problem last time I attempted this. thanks to thcipriani's fix https://gerrit.wikimedia.org/r/#/c/312947/ we don't rely on this anymore and all is good, keyholder stays armed even after re-encrypting the
  • 00:44 mutante: tin - disarm/rearm keyholder after changing passphrases of all deployment keys to new passphrase (T154943)
  • 00:41 mutante: mira - disarm/rearm keyholder after changing passphrases of all other deployment keys (T154943)
  • 00:37 dereckson@tin: Synchronized wmf-config/interwiki.php: Update interwiki map (ref T159103) (duration: 00m 41s)
  • 00:23 mutante: mira - disarming keyholder, changed password of analytics deploy key - rearming to test changes for T154943

2017-03-01

  • 23:28 mutante: contint1002, contint2001: rm /usr/lib/ganglia/python_modules/diskstat.py*; rm /etc/ganglia/conf.d/diskstat.pyconf (re: gerrit 340657)
  • 21:44 arlolra@tin: Finished deploy [parsoid/deploy@32ca3fb]: (no justification provided) (duration: 00m 15s)
  • 21:44 arlolra@tin: Started deploy [parsoid/deploy@32ca3fb]: (no justification provided)
  • 21:44 arlolra@tin: Finished deploy [parsoid/deploy@32ca3fb]: (no justification provided) (duration: 00m 15s)
  • 21:44 arlolra@tin: Started deploy [parsoid/deploy@32ca3fb]: (no justification provided)
  • 21:43 arlolra@tin: Finished deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0 (duration: 02m 00s)
  • 21:41 arlolra@tin: Started deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0
  • 21:41 arlolra@tin: Finished deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0 (duration: 03m 50s)
  • 21:40 demon@tin: Synchronized php-1.29.0-wmf.14/extensions/Echo/includes/model/Event.php: better logging and such (duration: 00m 40s)
  • 21:37 arlolra@tin: Started deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0
  • 21:37 arlolra@tin: Finished deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0 (duration: 05m 14s)
  • 21:32 arlolra@tin: Started deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0
  • 21:32 arlolra@tin: Finished deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0 (duration: 07m 39s)
  • 21:28 demon@tin: Synchronized php-1.29.0-wmf.14/extensions/CentralAuth/: Unbreak pending real fix (duration: 00m 49s)
  • 21:24 arlolra@tin: Started deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0
  • 21:04 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.14
  • 21:03 demon@tin: Synchronized php: Symlink swap (duration: 00m 39s)
  • 20:41 mutante: netmon1001, labsdb1006,labsdb1007, fluorine, helium same fix as above, were not covered by salt targeting as they are precise. this is all now. ubuntu.wikimedia.org does not appear in sources when checking *
  • 20:35 mutante: [neodymium:~] $ sudo salt --out=txt -b 10 -C 'G@lsb_distrib_codename:trusty' cmd.run "sed -i 's/ubuntu.wikimedia/mirrors.wikimedia/g' /etc/apt/sources.list && apt-get update" (https://phabricator.wikimedia.org/rOPUPe9da17d739233a4db197e947e627cf2a47ce6e6f#2080366)
  • 20:27 mutante: all trusty hosts via salt - fix APT sources list. replace ubuntu.wikimedia (deleted) with mirrors.wikimedia, apt-get update (re: https://phabricator.wikimedia.org/rOPUPe9da17d739233a4db197e947e627cf2a47ce6e6f)
  • 20:02 smalyshev@tin: Finished deploy [wdqs/wdqs@2b8ffef]: Bump memory limit for Java to 16g (duration: 03m 36s)
  • 19:59 smalyshev@tin: Started deploy [wdqs/wdqs@2b8ffef]: Bump memory limit for Java to 16g
  • 19:40 mutante: ocg1001, db1047, californium, db1051, rcs1002, db1041, iridium - fix APT sources list. replace ubuntu.wikimedia (deleted) with mirrors.wikimedia, apt-get update
  • 19:30 mutante: labsdb1001, labtestcontrol2001, labtestvirt2001 - fix APT sources list. replace ubuntu.wikimedia (deleted) with mirrors.wikimedia
  • 19:19 awight: applying civicrm db migration wmf_civicrm:7465
  • 19:18 awight: update civicrm from b3f6eef to 58c8c06
  • 19:07 mutante: terbium - install multiple pending package upgrades
  • 19:04 mutante: terbium - uses ubuntu.wikimedia.org in APT sources but that does not exist anymore. replaced 'ubuntu' with 'mirrors' globally, apt-get update
  • 18:35 thcipriani@tin: Synchronized README: test sync for scap 3.5.3-1 (duration: 00m 46s)
  • 17:54 jynus: autoremoving old kernels on terbium to make room on /boot
  • 17:52 jynus: running alter table on db2044 T147747
  • 15:47 joal@tin: Finished deploy [analytics/refinery@f4a5020]: (no justification provided) (duration: 02m 33s)
  • 15:45 marostegui: Resume pt-table-checksum on idwiki (s2) - T154485
  • 15:45 joal@tin: Started deploy [analytics/refinery@f4a5020]: (no justification provided)
  • 15:44 joal@tin: Finished deploy [analytics/refinery@b4a8fcc]: (no justification provided) (duration: 00m 13s)
  • 15:44 joal@tin: Started deploy [analytics/refinery@b4a8fcc]: (no justification provided)
  • 15:35 jynus: running alter table on db1034 T147747
  • 15:28 gehel: deploying on eqiad completed - T158782
  • 15:26 elukey@tin: Finished deploy [analytics/refinery@b4a8fcc]: (no justification provided) (duration: 02m 15s)
  • 15:23 elukey@tin: Started deploy [analytics/refinery@b4a8fcc]: (no justification provided)
  • 15:18 gehel: testing a few hosts on codfw looks good, deploying on eqiad - T158782
  • 15:10 gehel: mw1209 looks good, deploying on codfw - T158782
  • 15:05 gehel: mwdebug1001 looks good, deploying on mw1209 - T158782
  • 14:54 gehel: starting deployment of mediawiki apache config - T158782
  • 14:31 elukey@tin: Finished deploy [analytics/refinery@33db287]: (no justification provided) (duration: 01m 13s)
  • 14:30 elukey@tin: Started deploy [analytics/refinery@33db287]: (no justification provided)
  • 14:29 dcausse: EU SWAT Done
  • 14:27 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: [cirrus] cleanup old A/B test (duration: 00m 40s)
  • 14:27 elukey@tin: Finished deploy [analytics/refinery@33db287]: (no justification provided) (duration: 01m 24s)
  • 14:26 elukey@tin: Started deploy [analytics/refinery@33db287]: (no justification provided)
  • 14:12 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: [cirrus] Test disable super_detect_noop script (duration: 00m 47s)
  • 13:16 marostegui: run pt-table-checksum on idwiki - T154485
  • 12:43 moritzm: installing apache2 security updates on mw1261
  • 12:22 godog: upgrade thumbor to 0.1.13 on thumbor100[12]
  • 11:32 jynus: running alter table on db2037 T147747
  • 11:27 moritzm: upgrading nginx on meitnerium/archiva.wikimedia.org to 1.11.4 (using openssl 1.1)
  • 11:02 moritzm: uploaded lz4 0.0~r131 for jessie-wikimedia to apt.wikimedia.org (required by HHVM 3.18)
  • 09:33 jynus: running alter table on db1037 T147747
  • 09:20 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 after maintenance (duration: 00m 41s)
  • 09:14 marostegui: Deploy alter table s3 (all wikis) user_groups table - T155605
  • 08:49 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1026 after maintenance (duration: 00m 40s)
  • 08:18 moritzm: installing libgd2 security updates on trusty (jessie already fixed)
  • 07:05 marostegui: Deploy alter table enwiki.revision - dbstore2002 - T132416
  • 03:06 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Mar 1 03:06:24 UTC 2017 (duration 5m 46s)
  • 03:00 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 13m 51s)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 08m 03s)
  • 01:00 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Remove DonationInterface loading as gone from master (primarily to unbreak beta) (duration: 00m 42s)
  • 00:59 eileen1: Update CiviCRM from 04b49b0 to b3f6eef
  • 00:59 reedy@tin: Synchronized wmf-config/CommonSettings.php: Remove DonationInterface loading as gone from master (primarily to unbreak beta) (duration: 00m 40s)

2017-02-28

  • 22:29 mutante: (T157675) - delete salt keys - [neodymium:~] $ for mcnode in $(seq 2001 2016); do sudo salt-key -d mc${mcnode}.codfw.wmnet; done
  • 22:26 mutante: (T157675) - revoke puppet certs, deactivate nodes, rm from icinga. [puppetmaster1001:~] $ for mcnode in $(seq 2001 2016); do puppet node clean mc${mcnode}.codfw.wmnet && puppet node deactivate mc${mcnode}.codfw.wmnet ; done
  • 21:58 awight: update payments from 2a0c3b2 to 66d8125
  • 21:51 eileen1: update CiviCRM from a2875c5 to 04b49b0
  • 21:44 urandom: Updating RESTBase mobileapps tables (all remaining) to use time-windowed compaction
  • 21:40 maxsem@tin: Finished deploy [kartotherian/deploy@81db48c]: Second attempt at 81db48c (duration: 06m 39s)
  • 21:34 maxsem@tin: Started deploy [kartotherian/deploy@81db48c]: Second attempt at 81db48c
  • 21:23 MaxSem: Completely disabled kartotherian on maps-test2004, it just logs errors
  • 21:05 _joe_: manually installing nodejs on wasat T156922
  • 20:50 maxsem@tin: Synchronized wmf-config/wikitech.php: https://gerrit.wikimedia.org/r/#/c/340357/2 (duration: 00m 40s)
  • 20:33 urandom: Updating RESTBase mobileapps tables (phase0) to use time-windowed compaction
  • 20:30 demon@tin: Synchronized wmf-config/wikitech.php: no moar forms on wikitech (duration: 00m 39s)
  • 20:03 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.14
  • 19:34 demon@tin: Synchronized php-1.29.0-wmf.14/extensions/WikimediaEvents/modules/ext.wikimediaEvents.geoFeatures.js: Roan made me do it (duration: 00m 39s)
  • 19:26 demon@tin: Finished scap: testwiki to wmf.14 + l10n bootstrap (duration: 55m 14s)
  • 19:04 urandom: Updating RESTBase mobileapps tables (wikimedia) to use time-windowed compaction
  • 18:31 demon@tin: Started scap: testwiki to wmf.14 + l10n bootstrap
  • 17:43 urandom: Updating RESTBase mobileapps tables (wikipedia) to use time-windowed compaction
  • 17:11 elukey: Analytics Hadoop cluster upgraded to CDH 5.10
  • 17:09 jynus: disabling replication lag alerts on db1026 (depooled)
  • 17:05 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1001.eqiad.wmnet
  • 17:04 gehel: restarting blazegraph on wdqs1001 - T159245
  • 15:47 jynus: running alter table on db1056 T147747
  • 15:30 gehel: depooling wdqs1001 due to instability
  • 15:29 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1001.eqiad.wmnet
  • 15:23 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1056 for maintenance (duration: 00m 40s)
  • 14:51 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1003.eqiad.wmnet
  • 14:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1053 after maintenance (duration: 00m 39s)
  • 14:35 elukey: start the Analytics Hadoop cluster upgrade (https://etherpad.wikimedia.org/p/analytics-cdh5.10)
  • 14:32 marostegui: run pt-table-checksum on eowiki (s2) - T154485
  • 14:08 phuedx@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Make Page Previews use RESTBase on Beta Cluster (duration: 00m 42s)
  • 14:02 reedy@tin: Synchronized php-1.29.0-wmf.13/extensions/Dashiki/extension.json: Register JsonConfigModels (duration: 00m 42s)
  • 13:57 jynus: running alter table on db1036 T147747
  • 13:23 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 after maintenance (duration: 00m 39s)
  • 13:03 Reedy: ran namespaceDupes on meta to fix some Config pages
  • 12:48 _joe_: flushed memcached in codfw, restarting hhvm on appserver to flush APC in order to test warmup script
  • 11:47 gehel: restarting wdqs-blazegraph on wdqs1003
  • 11:40 gehel: depooling wdqs1003 for investigation (high 5xx rate)
  • 11:40 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1003.eqiad.wmnet
  • 11:22 jynus: running alter table on db1053 T147747
  • 11:18 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1053 for maintenance (duration: 00m 40s)
  • 10:56 elukey: restart zookeeper on conf1002
  • 10:53 marostegui: run pt-table-checksum on enwiktionary (s2) - T154485
  • 10:35 elukey: restart zookeeper on conf1003
  • 10:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1026 for maintenance (duration: 00m 39s)
  • 10:23 marostegui: run pt-table-checksum on enwikiquote (s2) - T154485
  • 10:09 marostegui: Deploy alter table s2 on all wikis for table user_groups - T155605
  • 10:00 elukey: restart zookeeper on conf1001
  • 09:47 jynus: running alter table on db1055 T147747
  • 09:43 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 for maintenance (duration: 00m 40s)
  • 09:38 marostegui: Deploy alter table s7 on all wikis for table user_groups - T155605
  • 09:06 jynus: running alter table on db2042 T147747
  • 09:03 marostegui: Deploy alter table s1 (enwiki).user_groups - T155605
  • 08:59 marostegui: run pt-table-checksum on cswiki (s2) - T154485
  • 08:43 moritzm: installing python-crypto security updates
  • 08:38 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 after maintenance (duration: 00m 40s)
  • 08:37 hashar: nodepool deleted alien instances 541585 541586 and 541587
  • 08:35 marostegui: Deploy alter table s6 (frwiki,jawiki,ruwiki).user_groups - T155605
  • 08:24 marostegui: run pt-table-checksum on bgwiktionary (s2) - T154485
  • 08:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 after maintenance (duration: 00m 41s)
  • 08:18 marostegui: Deploy alter table s5 wikidatawiki.user_groups - T155605
  • 08:15 marostegui: Deploy alter table s5 dewiki.user_groups - T155605
  • 07:41 marostegui: Deploy alter table s4.user_groups - T155605
  • 07:12 marostegui: run pt-table-checksum on bgwiki (s2) - T154485
  • 07:00 marostegui: Deploy alter table enwiki.revision db2034 - T132416
  • 02:35 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Feb 28 02:35:56 UTC 2017 (duration 5m 20s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 11m 40s)
  • 02:18 mutante: rsyncing prometheus metrics data from bast3001 to bast3002 (T156506)
  • 01:42 mutante: mw1198 - restart hhvm
  • 01:01 demon@tin: Synchronized scap/plugins/clean.py: No-op, more cleanups for clean.py (duration: 00m 42s)
  • 00:33 ebernhardson: restart elasticsearch on relforge1002, putting too much load on the machine got it stuck in a GC spiral with 1minute+ collections
  • 00:29 ebernhardson: restart elasticsearch on relforge1001, putting too much load on the machine got it stuck in a GC spiral with 1minute+ collections
  • 00:15 demon@tin: Synchronized php-1.29.0-wmf.13/extensions/MobileFrontend/resources/skins.minerva.base.styles/ui.less: Fix the incorrect magnify glass icon position in lang search (duration: 00m 39s)
  • 00:13 demon@tin: Synchronized php-1.29.0-wmf.13/extensions/Nuke/Nuke_body.php: Move back to old caller names (duration: 00m 43s)
  • 00:09 demon@tin: Synchronized wmf-config/CommonSettings.php: Enable editmyoptions right for all users on loginwiki (duration: 00m 41s)

2017-02-27

  • 23:50 demon@tin: Finished scap: Enabling Dashiki on meta (duration: 20m 46s)
  • 23:29 demon@tin: Started scap: Enabling Dashiki on meta
  • 23:17 demon@tin: Synchronized scap/plugins/clean.py: no-op (duration: 00m 48s)
  • 22:37 otto@tin: Finished deploy [eventstreams/deploy@76c763e]: Deploying swagger-ui /?doc endpoint (duration: 01m 45s)
  • 22:36 otto@tin: Started deploy [eventstreams/deploy@76c763e]: Deploying swagger-ui /?doc endpoint
  • 22:34 otto@tin: Finished deploy [eventstreams/deploy@76c763e]: Deploying /?doc swagger-ui endpoint only to scb2001 (duration: 00m 17s)
  • 22:34 otto@tin: Started deploy [eventstreams/deploy@76c763e]: Deploying /?doc swagger-ui endpoint only to scb2001
  • 22:10 otto@tin: Finished deploy [eventstreams/deploy@2f73b52]: Deploying /?doc swagger-ui endpoint only to scb2001 (duration: 00m 18s)
  • 22:10 otto@tin: Started deploy [eventstreams/deploy@2f73b52]: Deploying /?doc swagger-ui endpoint only to scb2001
  • 21:42 bsitzmann@tin: Finished deploy [mobileapps/deploy@872a615]: Update mobileapps to c924126 (duration: 03m 14s)
  • 21:39 bsitzmann@tin: Started deploy [mobileapps/deploy@872a615]: Update mobileapps to c924126
  • 21:16 mutante: ganglia - switching esams aggregator to bast3002 - expect short gaps in esams graphs
  • 20:51 robh: disabled puppet on einsteinium for icinga update of config
  • 18:26 gehel: restarting wdqs-updater on all wdqs servers
  • 18:25 gehel@tin: Finished deploy [wdqs/wdqs@daca9b3]: (no justification provided) (duration: 01m 39s)
  • 18:24 gehel: redeploying wdqs (previous deploy was not latest version)
  • 18:24 gehel@tin: Started deploy [wdqs/wdqs@daca9b3]: (no justification provided)
  • 18:18 awight: update civicrm from 20660c4 to a2875c5
  • 18:14 gehel: restarting wdqs-updater on all wdqs servers
  • 18:14 gehel@tin: Finished deploy [wdqs/wdqs@62354ed]: (no justification provided) (duration: 00m 52s)
  • 18:13 gehel@tin: Started deploy [wdqs/wdqs@62354ed]: (no justification provided)
  • 18:12 gehel@tin: Finished deploy [wdqs/wdqs@62354ed]: log (duration: 00m 12s)
  • 18:12 ema: temporarily bumping timeout_idle to 120s on cache_misc T154558
  • 18:12 gehel@tin: Started deploy [wdqs/wdqs@62354ed]: log
  • 18:04 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1001.eqiad.wmnet
  • 16:08 jynus: starting schema change on db1051 T147747
  • 16:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 for maintenance (duration: 00m 40s)
  • 15:55 jynus: starting schema change on db2038 T147747
  • 14:58 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: name=eqiad
  • 14:42 Dereckson: Fix namespace dupes pages on ext.wikipedia (T158914)
  • 14:30 hashar: European SWAT done. Pushed https://gerrit.wikimedia.org/r/#/c/339446/ and https://gerrit.wikimedia.org/r/#/c/339348/
  • 14:29 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: New namespace aliases for itwikiversity - T158775 (duration: 00m 43s)
  • 14:13 moritzm: installed apache2 security updates on mwdebug*
  • 14:10 aude@tin: Synchronized wmf-config/Wikibase-production.php: Disable geo-shape datatype on wikidata for now (duration: 00m 41s)
  • 13:58 marostegui: Manually deploy gtid_domain_id on s2 - T149418
  • 13:06 elukey: restart zookeeper on conf2003
  • 12:39 elukey: restart zookeeper on conf2002
  • 12:14 _joe_: reissuing the certificate for etcd.codfw.wmnet due to a previous error
  • 12:00 elukey: rebooting mw2092 due to puppet errors for mw-cgroup - T151427
  • 11:58 volans: re-enabled icinga-wm
  • 11:37 ema: cp1052 repooled T148891
  • 11:19 elukey: zookeeper status report - new changes rolled out to druid nodes and conf2001 - conf1* and conf200[23] still pending, waiting for more metrics before proceeding
  • 11:09 volans: temporarily stopped ircecho (icinga-wm)
  • 11:04 ema: rebooting cp1052 into kernel 4.4.2-3+wmf8 T148891
  • 10:49 moritzm: uploaded apache2 2.4.10-10+deb8u8+wmf1 to apt.wikimedia.org (rebase of local patches on top of latest DSA)
  • 10:34 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: wme: Set ReadingDepth sampling rate to 0.1% - T155639 (duration: 00m 40s)
  • 10:31 elukey: limiting the Zookeeper Maximum heap size to 1G (https://gerrit.wikimedia.org/r/#/c/337797/) - setting applied gradually to Zookeeper on Druid and Conf* hosts
  • 10:11 _joe_: upgrading conftool to 0.4.0 across the cluster T149617
  • 10:03 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1045 for maintenance (duration: 00m 43s)
  • 09:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1026 after maintenance with full weight (duration: 00m 39s)
  • 08:42 _joe_: upload conftool 0.4.0 to trusty-wikimedia
  • 08:42 _joe_: promote conftool 0.4.0 to jessie-wikimedia main
  • 07:59 marostegui: Run pt-table-checksum on s2 (nlwiki) on revision table - T154485
  • 07:29 marostegui: Deploy alter table enwiki.revision - db2034 - T132416
  • 07:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2034 - T132416 (duration: 00m 40s)
  • 07:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2070 - T132416 (duration: 00m 40s)
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Feb 27 02:25:10 UTC 2017 (duration 5m 21s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 07m 23s)

2017-02-26

  • 17:10 Reedy: ran namespaceDupes for extwiki
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Feb 26 02:25:07 UTC 2017 (duration 5m 21s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 07m 26s)

2017-02-25

  • 20:06 elukey: depooled cp2017 (via local sudo -i depool command) since the host froze (it got back after a powercycle)
  • 19:54 elukey: powercycled cp2017, mgmt console stuck
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Feb 25 02:25:10 UTC 2017 (duration 5m 21s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 07m 20s)
  • 01:43 mutante: bast3002 - sign puppet cert, initial run with basic "bastion" role, to replace broken bast3001, but WIP, ganglia/prometheus roles not moved yet (T156506)

2017-02-24

  • 22:46 Krinkle: (terbium) sql --write mediawikiwiki 'DELETE FROM module_deps' (in batches of 500; 42292 rows affected) - per T158105.
  • 22:28 smalyshev@tin: Finished deploy [wdqs/wdqs@62354ed]: Deploy new updater on 1001 for timeout increase (duration: 00m 16s)
  • 22:27 smalyshev@tin: Started deploy [wdqs/wdqs@62354ed]: Deploy new updater on 1001 for timeout increase
  • 22:23 smalyshev@tin: Finished deploy [wdqs/wdqs@62354ed]: Deploy new updater on 2001 for testing (duration: 00m 26s)
  • 22:23 smalyshev@tin: Started deploy [wdqs/wdqs@62354ed]: Deploy new updater on 2001 for testing
  • 20:50 ebernhardson: restart elasticsearch on logstash1002
  • 20:05 demon@tin: Synchronized wmf-config/wikitech.php: (no justification provided) (duration: 00m 48s)
  • 19:30 Pchelolo: restarting RESTBase on xenon.eqiad.wmnet in staging
  • 17:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1026 after maintenance with low load (duration: 00m 40s)
  • 16:55 volans: manually cleaning ferm leftovers on dbproxy1011 - T158798
  • 15:35 ema: temporarily bumping timeout_idle to 60s on cache_misc T154558
  • 14:27 volans: re-started and re-armed keyholder after upgrade on: mira.codfw.wmnet,neodymium.eqiad.wmnet,sarin.codfw.wmnet,tin.eqiad.wmnet T158660 T158659
  • 10:41 ema: cache_misc: upgrading to varnish 4.1.5
  • 10:30 moritzm: installing imagemagick regression update for security update on trusty (the Debian update seems unaffected)
  • 10:23 moritzm: installing spice updates on trusty
  • 09:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1036 - T154485 (duration: 00m 40s)
  • 09:39 elukey: stop Redis and Memcached on mc2001->mc2016 as extra precautionary step before decom - T157675
  • 08:44 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1001.eqiad.wmnet
  • 08:16 volans: temporarily disabled puppet on neodymium and sarin to deploy Gerrit 339183 - T158753
  • 07:32 marostegui: Deploy alter table enwiki.revision on db2070 - T132416
  • 07:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2069 and depool db2070 - T132416 (duration: 00m 45s)
  • 02:32 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Feb 24 02:32:21 UTC 2017 (duration 5m 22s)
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 07m 02s)
  • 00:26 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Store goodfaith scores in the ORES tables T137966 (duration: 00m 40s)
  • 00:17 mobrovac: restbase deploying b477ab46

2017-02-23

  • 21:11 dereckson@tin: Finished scap: Full scap to deploy new l10n keys on wikitech (gerrit:339456), take two (duration: 22m 55s)
  • 20:49 dereckson@tin: Started scap: Full scap to deploy new l10n keys on wikitech (gerrit:339456), take two
  • 20:48 dereckson@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap"; owner is "dereckson"; reason is "Full scap to deploy new l10n keys on wikitech (gerrit:339456)" (duration: 00m 00s)
  • 20:46 dereckson@tin: Started scap: Full scap to deploy new l10n keys on wikitech (gerrit:339456)
  • 20:04 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.13
  • 19:47 dereckson@tin: Synchronized php-1.29.0-wmf.13/extensions/WikimediaMessages/extension.json: Create user group messages for wikitech.wikimedia.org (T158417) (duration: 00m 39s)
  • 19:45 dereckson@tin: Synchronized php-1.29.0-wmf.13/extensions/WikimediaMessages/i18n/wikitech/: (no justification provided) (duration: 00m 43s)
  • 18:29 chasemp: labnodepool1001:~# service nodepool restart
  • 17:40 gehel: removing old prod indices from relforge1002 - T156150
  • 17:37 gehel: removing old prod indices from relforge1002 (jawikiprod_content, enprodwiki_content, ruwikiprod_content) - T156150
  • 16:33 paravoid: cleaning up openstack packages from einsteinium & tegmen
  • 16:19 gehel: starting upgrade relforge cluster to elasticsearch 5.2.1 - expect significant downtime - T156150
  • 16:19 gehel: unban relforge1001 - T156150
  • 15:45 gehel: banning relforge1001 from cluster to prepare for ES5 upgrade - T156150
  • 15:18 godog: roll-restart pybal in codfw to pick up swift https service
  • 15:08 marostegui: Power off dbstore1001 to change its disks and reimage - T153768
  • 14:42 addshore: addshore@tin scap clean 1.29.0-wmf.6 && scap clean 1.29.0-wmf.7 (to remove warning on scap pull on mwdebug1002, T157030)
  • 14:39 addshore: EU SWAT done
  • 14:39 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT T158832 Enable TwoColConflict on hewiki (duration: 00m 40s)
  • 14:29 addshore@tin: Synchronized php-1.29.0-wmf.13/extensions/ContentTranslation/ContentTranslation.hooks.php: SWAT T158297 Really disable europeana2802016 campaign (duration: 00m 39s)
  • 14:26 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT T156794 Enable v2 of Minerva's header on cawiki and itwiki (duration: 00m 42s)
  • 14:18 paravoid: upgrading grafana to 4.1 on krypton
  • 13:52 gehel: restart logstash on relforge1001 to test logging configuration - T158664
  • 13:03 ema: cache_maps: upgrading to varnish 4.1.5
  • 12:40 moritzm: installing libssh security updates on trusty (jessie already fixed)
  • 12:40 moritzm: installing libssh security updates (jessie already fixed)
  • 12:35 moritzm: installing tomcat updates
  • 09:39 elukey: increase cassandra system_auth replication from 6 to 12 on AQS
  • 09:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1036 - T154485 (duration: 00m 40s)
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1036 - T154485 (duration: 00m 40s)
  • 09:06 _joe_: uploaded conftool 0.4.0 to jessie-wikimedia experimental
  • 08:54 marostegui: Stop pt-table-checksum on nlwiki.revision - T154485
  • 08:51 marostegui: Run pt-table-checksum on s2 (nlwiki) on revision table - T154485
  • 07:59 marostegui: Run pt-table-checksum on s2 (nlwiki) on logging table - T154485
  • 07:16 marostegui: Deploy alter table enwiki.revision db2069 - T132416
  • 07:14 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2062 and depool db2069 - T132416 (duration: 00m 42s)
  • 07:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1060 original load - T158194 (duration: 00m 40s)
  • 03:02 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Feb 23 03:02:10 UTC 2017 (duration 5m 47s)
  • 02:56 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 14m 38s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 08m 40s)
  • 00:19 dereckson@tin: Synchronized wmf-config/throttle.php: Throttle rule for it.wikiversity (T158767) (duration: 00m 40s)
  • 00:18 Krinkle: mwscript deleteEqualMessages.php --wiki simplewikibooks (T45917)

2017-02-22

  • 23:46 ejegg: turned off 3DS requirement for Denmark on payments-wiki
  • 23:17 matt_flaschen: Exported https://meta.wikimedia.org/wiki/Talk:Flow/Developer_test_page to https://meta.wikimedia.org/wiki/Talk:Flow/Developer_test_page/Wikitext using extensions/Flow/maintenance/convertToText.php
  • 23:17 matt_flaschen: Migrated https://meta.wikimedia.org/wiki/Research_talk:ORES_paper to https://www.mediawiki.org/wiki/Talk:ORES/Paper using extensions/Flow/maintenance/dumpBackup.php and importDump.php
  • 22:53 Pchelolo: update RESTBase to 3340714f0
  • 22:52 jynus: stopping dbstore1001 mariadb in preparation for tomorrow's reimage T153768
  • 22:50 Pchelolo: update RESTBase to 3340714f0: canary on restbase1007
  • 22:46 Pchelolo: update RESTBase to 3340714f0: staging
  • 21:57 maxsem@tin: Finished deploy [kartotherian/deploy@81db48c]: Deploying https://gerrit.wikimedia.org/r/#/c/339093/ (duration: 15m 05s)
  • 21:42 maxsem@tin: Started deploy [kartotherian/deploy@81db48c]: Deploying https://gerrit.wikimedia.org/r/#/c/339093/
  • 20:30 demon@tin: Finished scap: group1 to wmf.13 (duration: 25m 39s)
  • 20:04 demon@tin: Started scap: group1 to wmf.13
  • 20:02 gehel@tin: Finished deploy [wdqs/wdqs@7768422]: (no justification provided) (duration: 02m 04s)
  • 19:59 gehel@tin: Started deploy [wdqs/wdqs@7768422]: (no justification provided)
  • 19:56 gehel: deploying latest wdqs version
  • 19:46 godog: roll-HUP rsyslog on mw1* to pick up DNS udplog change - T123728
  • 19:45 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Finish removing "shellmanagers" on Wikitech T158482 (duration: 00m 40s)
  • 19:37 thcipriani@tin: Synchronized php-1.29.0-wmf.13/extensions/Flow: SWAT: Import dump: support importing a board that exist in the farm T154830 (duration: 00m 56s)
  • 19:34 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Removing the "shellmanagers" group from Wikitech T158482 (duration: 00m 49s)
  • 19:14 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Configuration changes for wikitech.wikimedia.org T158516 T158554 T158482 (duration: 00m 40s)
  • 18:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 with less weight - T158194 (duration: 00m 39s)
  • 18:17 Dereckson: Last two deployment entries were to rollback portals/ to last known state (T158782)
  • 18:17 dereckson@tin: Synchronized portals: (no justification provided) (duration: 00m 39s)
  • 18:17 dereckson@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 40s)
  • 16:29 gehel: reimage of relforge1001 starting
  • 16:21 marostegui: Shutdown db1060 for BBU replacement - T158194
  • 16:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 - T158194 (duration: 00m 40s)
  • 16:19 ema: cp3006 upgraded to varnish 4.1.5
  • 16:15 ema: cp4019 upgraded to varnish 4.1.5
  • 15:48 moritzm: installing tcpdump security updates on ubuntu systems (jessie already fixed for a while)
  • 15:43 jynus: stopping mariadb replication on db1026 for maintenance T147747
  • 15:21 marostegui: Restart MySQL on db1095 to apply new replication filters - https://phabricator.wikimedia.org/T154485
  • 15:16 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1026 for maintenance (duration: 00m 41s)
  • 15:11 marostegui: Restart MySQL on db1069 to apply new replication filters - https://phabricator.wikimedia.org/T154485
  • 14:50 zeljkof: finished EU SWAT
  • 14:49 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T158762) (duration: 00m 41s)
  • 14:30 gehel: resetting to usual values for low/high watermark on elasticsearch eqiad (75% / 80%)
  • 14:17 hashar: Nuked Jenkins workspaces for the job operations-puppet-typos
  • 14:17 zfilipin@tin: Synchronized dblists/compact-language-links.dblist: SWAT: Deploy Compact Language Links in Swedish Wikipedia (T157114) (duration: 00m 50s)
  • 14:17 gehel: temporarily raising high/low watermarks on elasticsearch eqiad to allow allocation of all shards
  • 14:04 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic1047.eqiad.wmnet
  • 12:38 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 with low load (3rd time a charm) (duration: 00m 39s)
  • 12:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 with low load (again) (duration: 02m 47s)
  • 12:18 dcausse: rebuild of translation memories index is done
  • 12:05 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 with low load (duration: 02m 49s)
  • 12:03 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(45|46).eqiad.wmnet
  • 11:48 paravoid: upgrading labmon1001 to grafana 4.1
  • 10:55 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(45|46|47).eqiad.wmnet
  • 10:54 moritzm: upgrading remaining mediawiki servers to HHVM 3.12.14
  • 10:54 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(35|37|39|43|44).eqiad.wmnet
  • 10:42 elukey: reinstall mw211[89] as MW videoscalers (trusty) and mw2243 as MW jobrunner
  • 10:05 filippo@tin: Synchronized wmf-config/ProductionServices.php: Move udp2log from fluorine to mwlog1001 - T123728 (duration: 00m 41s)
  • 10:01 hashar: enabling puppet on contint1001 and running it
  • 09:56 volans: restarting salt-master on neodymium after openssl upgrade
  • 09:37 ema: cache_text, cache_upload: libssl1.1 upgraded to 1.1.0e-1+wmf1, libevent-2.0-5 upgraded to 2.0.21-stable-2+deb8u1
  • 09:28 hashar: disable puppet on contint1001. Will use contint2001 as a canary
  • 09:14 marostegui: Run pt-table-checksum on s2.nlwiki over some tables - T154485
  • 09:04 dcausse: rebuilding translation memories index - ETA ~4hours (from terbium, logs in ~dcausse/ttm-refresh)
  • 09:02 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(35|39|43|44).eqiad.wmnet
  • 08:07 moritzm: upgrading openssl on redis clusters / various base service restarts
  • 07:44 gehel: restart elasticsearch on elastic1035
  • 07:43 gehel: truncating logs on elastic10(35|39|44)
  • 07:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2062 - T132416 (duration: 00m 40s)
  • 07:23 marostegui: Deploy alter table enwiki.revision db2062 - T132416
  • 07:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2055 - T132416 (duration: 00m 40s)
  • 03:10 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Feb 22 03:10:13 UTC 2017 (duration 5m 46s)
  • 03:04 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 13m 54s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 11m 48s)
  • 01:03 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable ORES review tool in cswiki T151611 (duration: 00m 39s)
  • 00:48 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Send "exception" channel to logstash Do not send "exception-json" channel to logstash T136849 (duration: 00m 40s)
  • 00:34 thcipriani@tin: Synchronized wmf-config: SWAT: Set $wgSoftBlockRanges T154698 PART II (duration: 00m 42s)
  • 00:33 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgSoftBlockRanges T154698 PART I (duration: 00m 40s)
  • 00:20 thcipriani@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: Fix SiteConfiguration array merge syntax T157656 Fix Sentry URL scheme on beta Fix PageViewInfo config T158698 (beta-only changes) (duration: 00m 39s)
  • 00:17 thcipriani@tin: Synchronized php-1.29.0-wmf.12/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: SWAT: Turn off sister search AB test. T157942 (duration: 00m 39s)
  • 00:16 thcipriani@tin: Synchronized php-1.29.0-wmf.13/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: SWAT: Turn off sister search AB test. T157942 (duration: 00m 43s)
  • 00:13 smalyshev@tin: Finished deploy [wdqs/wdqs@7768422]: Deploy 2.1.5RC WAR on 2001 for testing (duration: 00m 25s)
  • 00:13 smalyshev@tin: Started deploy [wdqs/wdqs@7768422]: Deploy 2.1.5RC WAR on 2001 for testing
  • 00:05 demon@tin: Synchronized scap/plugins/clean.py: More code cleanup (duration: 00m 40s)

2017-02-21

  • 23:30 MaxSem: Kartotherian deploy did not happen
  • 23:22 demon@tin: Synchronized scap/plugins/clean.py: Code cleanup (duration: 00m 46s)
  • 23:21 demon@tin: scap aborted: scap/plugins/clean.py Code cleanup (duration: 00m 10s)
  • 23:21 demon@tin: Started scap: scap/plugins/clean.py Code cleanup
  • 22:01 mutante: carbon - removed from icinga, shutdown -h now (T158020)
  • 21:31 mutante: carbon - puppet node clean, node deactivate (T158020)
  • 21:10 demon@tin: Synchronized scap/plugins/prep.py: Completeness (duration: 00m 42s)
  • 20:48 Krinkle: (terbium) sql --write test2wiki 'DELETE FROM module_deps' (3687 rows affected, 0.01 sec) - per T158105.
  • 20:47 Krinkle: (terbium) sql --write testwiki 'DELETE FROM module_deps' (per T158105)
  • 20:44 mutante: carbon - backup /root data to install1002:/root/root-carbon/ before shutdown (T158020)
  • 20:36 mutante: rsyncing /home/ dirs excl. dot files, from carbon to install1002 (T158020)
  • 20:15 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(35|39|43|44).eqiad.wmnet
  • 20:08 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.13
  • 19:50 demon@tin: Finished scap: prime wmf.13 - testwiki plus l10n build (pt 3 because ugh) (duration: 17m 17s)
  • 19:32 demon@tin: Started scap: prime wmf.13 - testwiki plus l10n build (pt 3 because ugh)
  • 19:32 demon@tin: scap failed: RuntimeError 2 test canaries had check failures (rerun with --force to override this check) (duration: 15m 00s)
  • 19:17 demon@tin: Started scap: prime wmf.13 - testwiki plus l10n build (pt 2 because T156851)
  • 19:16 demon@tin: Finished scap: prime wmf.13 - testwiki plus l10n build (duration: 26m 15s)
  • 18:49 demon@tin: Started scap: prime wmf.13 - testwiki plus l10n build
  • 18:45 moritzm: installing PHP security updates on iridium (phabricator.wikimedia.org)
  • 18:36 ppchelko@tin: Finished deploy [changeprop/deploy@4706f9d]: Change-Prop: Make ORES return minified responses T157693 (duration: 00m 55s)
  • 18:35 ppchelko@tin: Started deploy [changeprop/deploy@4706f9d]: Change-Prop: Make ORES return minified responses T157693
  • 18:34 Pchelolo: changeprop deploy 4706f9da
  • 18:14 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(27|32|34|38|41).eqiad.wmnet
  • 18:12 godog: roll-restart nodepool on labnodepool1001 to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 18:12 godog: roll-restart zuul on cont1001 to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 18:04 godog: roll-restart eventstreams in codfw/eqiad to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 18:03 godog: roll-restart trendingedits in codfw/eqiad to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 17:58 demon@tin: Synchronized tests/multiversion/MWMultiVersionTest.php: No op in prod, completeness, etc (duration: 00m 40s)
  • 17:57 demon@tin: Synchronized multiversion/MWMultiVersion.php: Shut up dumb invalid hostname errors (duration: 00m 52s)
  • 17:50 godog: roll-restart ocg in codfw/eqiad to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 17:47 godog: roll-restart jmxtrans in codfw/eqiad on conf* to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 17:35 godog: roll-restart parsoid in codfw/eqiad to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 17:35 Amir1: done restarting ores services
  • 17:20 Amir1: restarting ores uwsgi and celery services in scb nodes
  • 16:59 ema: cache_misc, cache_maps: libssl1.1 upgraded to 1.1.0e-1+wmf1, libevent-2.0-5 upgraded to 2.0.21-stable-2+deb8u1
  • 16:37 gehel: restarting elasticsearch on elastic1030
  • 16:34 gehel: truncating elasticsearch logs on elastic1023
  • 16:31 gehel: truncating elasticsearch logs on elastic1030
  • 16:18 dcausse: truncated main elastic log, daemon.log and syslog on elastic1023
  • 16:08 moritzm: restarting apache on uranium for openssl update
  • 16:06 dcausse: truncated main log file on elastic1030
  • 15:50 gehel: restarting wdqs-updater on wdqs1002
  • 15:40 elukey: restart eventlogging on kafka200[123] for openssl upgrades
  • 15:40 godog: restart navtiming ve asset-check statsd-mw-js-deprecate on hafnium to pick up statsd.eqiad.wmnet change - T157022
  • 15:39 elukey: restart jmxtrans on kafka[12]00[123] for T157022
  • 15:34 mobrovac@tin: Started restart [mobileapps/deploy@cd3b897]: Restarting for Graphite DNS switch T157022
  • 15:32 elukey: correction on my previous entry: restart eventlogging on kafka100[123] for openssl upgrades
  • 15:30 mobrovac@tin: Started restart [graphoid/deploy@da37386]: Restarting for Graphite DNS switch T157022
  • 15:22 elukey: restart eventlogging on kafka200[123] for openssl upgrades
  • 15:21 mobrovac@tin: Started restart [cxserver/deploy@0e4ae4f]: Restarting for Graphite DNS switch T157022
  • 15:20 moritzm: rolling restart of swift frontend servers to pick up openssl update
  • 15:19 mobrovac@tin: Started restart [citoid/deploy@95df861]: Restarting for Graphite DNS switch T157022
  • 15:18 mobrovac@tin: Started restart [mathoid/deploy@ba3217e]: Restarting for Graphite DNS switch T157022
  • 15:17 hashar: European SWAT complete
  • 15:17 hashar@tin: Synchronized php-1.29.0-wmf.12/extensions/UniversalLanguageSelector/UniversalLanguageSelector.hooks.php: Fix site picks: missing from globals (duration: 01m 00s)
  • 15:12 gehel: restarting kartotherian / tilerator(ui) on maps1*
  • 15:09 gehel: restarting kartotherian / tilerator(ui) on maps2*
  • 15:06 gehel: restarting kartotherian / tilerator(ui) on maps-test*
  • 15:06 godog: roll-restart restbase after statsd move to graphite1001 - T157022
  • 15:06 elukey: Manually increased maximum httpd keep alive requests and timeout on bohrium (piwik) - T154558
  • 14:56 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(33|34|38|42).eqiad.wmnet
  • 14:43 hashar@tin: Synchronized php-1.29.0-wmf.12/extensions/UniversalLanguageSelector/maintenance/ULSCompactLinksDisablePref.php: Add a maintenance script for opt-in T133031 (duration: 00m 41s)
  • 14:35 moritzm: upgrading openssl on logstash cluster / various base service restarts
  • 14:29 dcausse: truncated main log file on elastic1030
  • 14:29 hashar@tin: Synchronized portals: (no justification provided) (duration: 00m 40s)
  • 14:29 moritzm: restarting NTP servers on dns_recursors to pick up openssl update (one by one)
  • 14:28 hashar@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 40s)
  • 14:25 moritzm: upgrading openssl on memcached clusters / various base service restarts
  • 14:22 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TwoColConflict extension on arwiki - T158493 (duration: 00m 40s)
  • 14:13 hashar@tin: Synchronized portals: (no justification provided) (duration: 00m 41s)
  • 14:12 hashar@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 40s)
  • 14:10 hashar@tin: Synchronized php-1.29.0-wmf.12/extensions/UniversalLanguageSelector/: Fix broken site picks feature for compact language links (duration: 01m 04s)
  • 14:08 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ReadingDepth logging on Wikipedias - T148262 T155639 (duration: 00m 45s)
  • 14:00 moritzm: upgrading openssl on maps clusters / various base service restarts
  • 13:41 elukey: restarting nodejs on aqs1* to pick up openssl security upgrades
  • 13:21 moritzm: upgrading openssl on aqs cluster / various base service restarts
  • 13:06 moritzm: upgrading openssl on parsoid clusters / various base service restarts
  • 12:55 moritzm: upgrading openssl on database servers / various base service restarts
  • 12:53 volans: re-enabled puppet on neodymium and puppetmaster1001 after Gerrit 330436 was merged T154588
  • 12:51 volans: re-enabled puppet on planet2001, had been disabled for a week without reason
  • 12:39 volans: reenabled ircecho after fixing ferm issue and ran puppet on affected hosts
  • 12:08 volans: stopped ircecho temporarily while fixing ferm
  • 12:01 volans: temporarily disabled puppet on neodymium and puppetmaster1001 to merge Gerrit 330436 T154588
  • 11:32 moritzm: upgrading openssl on kafka clusters / various base service restarts
  • 11:15 moritzm: upgrading openssl on restbase clusters / various base service restarts
  • 11:05 moritzm: upgrading openssl on hadoop cluster / various base service restarts
  • 11:02 elukey: rolling restart of cassandra-metrics-collector on aqs1* for T157022
  • 10:55 elukey: rolling restart of the analytics jmxtrans daemons for T157022
  • 10:29 moritzm: restarting base services on mw2* after openssl update
  • 10:14 godog: downgrade carbon-c-relay on graphite1001 to trusty's version and bounce daemons
  • 09:58 moritzm: upgrading mira/tin to HHVM 3.12.14
  • 09:46 godog: upgrade graphite on graphite1001 and bounce carbon daemons
  • 09:26 ema: cp3030: libssl1.1 upgraded to 1.1.0e-1+wmf1, libevent-2.0-5 upgraded to 2.0.21-stable-2+deb8u1
  • 08:53 godog: switch statsd/graphite DNS to graphite1001 - T157022
  • 08:32 moritzm: upgrading mw1170-mw1208 to HHVM 3.12.14
  • 08:30 gehel: increasing concurrent recoveries / relocations to 8 on elasticsearch eqiad
  • 08:24 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(27|32|37|41).eqiad.wmnet
  • 07:31 marostegui: Deploy alter table enwiki.revision db2055 - T132416
  • 07:29 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2048 and depool db2055 - T132416 (duration: 00m 51s)
  • 02:24 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Feb 21 02:24:37 UTC 2017 (duration 5m 20s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 07m 20s)
  • 01:17 tstarling@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 42s)

2017-02-20

  • 20:31 gehel: taking threaddumps and restarting elastic1017 (high load)
  • 20:20 gehel: reducing concurrent recoveries / relocations to 4 on elasticsearch eqiad
  • 19:07 ariel@tin: Finished deploy [dumps/dumps@9757356]: fix retries of page content dumps with checkpoint, no dup ranges (duration: 00m 02s)
  • 19:07 ariel@tin: Started deploy [dumps/dumps@9757356]: fix retries of page content dumps with checkpoint, no dup ranges
  • 18:30 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(27|32|37|41).eqiad.wmnet
  • 18:29 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(26|31|36|40).eqiad.wmnet
  • 17:56 ppchelko@tin: Finished deploy [changeprop/deploy@30873eb]: Update change-prop to 30873ebd5: enabling DNS caching for T158338 (duration: 01m 41s)
  • 17:54 ppchelko@tin: Started deploy [changeprop/deploy@30873eb]: Update change-prop to 30873ebd5: enabling DNS caching for T158338
  • 17:52 Pchelolo: update change-prop to 30873ebd5
  • 16:40 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(26|31|36|40).eqiad.wmnet
  • 14:55 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1002.eqiad.wmnet
  • 14:49 ema: cp2002, cp4008: libssl1.1 upgraded to 1.1.0e-1+wmf1 and libevent-2.0-5 upgraded to 2.0.21-stable-2+deb8u1
  • 14:32 ema: upgrading pinkunicorn to varnish 4.1.5-1wm1
  • 14:30 ema: varnish 4.1.5-1wm1 uploaded to apt.w.o
  • 14:10 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(25|28|29|30).eqiad.wmnet
  • 13:52 gehel: resetting ownership of new .wsp files for wdqs1002 on graphite[12]001
  • 13:49 moritzm: installing remaining lcms security updates
  • 13:41 hashar@tin: Synchronized wmf-config/throttle.php: [throttle] New rule - T158312 (duration: 00m 42s)
  • 13:35 marostegui: Transferring dbstore1001:/srv/backups (the last 2 backups) to dbstore2001:/srv/backup/dbstore1001 - T153768
  • 13:17 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(25|28|29|30).eqiad.wmnet
  • 13:04 moritzm: installing jasper security updates
  • 12:20 godog: remove syslog from graphite1001, bump max open files for carbon-c-relay
  • 11:00 godog: switch diamond traffic to graphite1001 - T157022
  • 10:54 moritzm: rolling restart of nginx on remaining mediawiki servers in eqiad to pick up openssl update
  • 10:26 ariel@tin: Finished deploy [dumps/dumps@dee43ca]: fix prefetch on retries of partially complete page content dumps (duration: 00m 02s)
  • 10:26 ariel@tin: Started deploy [dumps/dumps@dee43ca]: fix prefetch on retries of partially complete page content dumps
  • 10:24 hashar@tin: Synchronized wmf-config/throttle.php: Add new throttle rule - T158432 (duration: 00m 49s)
  • 09:46 moritzm: upgrading mediawiki servers in codfw to HHVM 3.12.14
  • 09:33 marostegui: Manually deploy gtid_domain_id on s6 hosts - T149418
  • 08:47 gehel: restarting diamond on wdqs1002 after initial data import
  • 08:41 marostegui: Increase 100G dbstore1002 lv /dev/mapper/tank-data
  • 08:17 ariel@tin: Finished deploy [dumps/dumps@d50e129]: cleanup tmp files before checkpoint file rerun (duration: 00m 02s)
  • 08:17 ariel@tin: Started deploy [dumps/dumps@d50e129]: cleanup tmp files before checkpoint file rerun
  • 07:39 marostegui@tin: Synchronized wmf-config/db-codfw.php: Update ticket number for db2048 depool reason (duration: 00m 44s)
  • 07:29 marostegui: Deploy alter table on db2048 enwiki.revision - T132416
  • 07:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2048 - T132416 (duration: 00m 41s)
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Feb 20 02:25:07 UTC 2017 (duration 5m 19s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 07m 53s)

2017-02-19

  • 20:08 ariel@tin: Finished deploy [dumps/dumps@364470e]: fix private table dumping, report failed runs correctly (duration: 00m 03s)
  • 20:08 ariel@tin: Started deploy [dumps/dumps@364470e]: fix private table dumping, report failed runs correctly
  • 15:54 hashar: Restarted Zuul to clear out a stalled function in Gearman server.
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Feb 19 02:37:07 UTC 2017 (duration 5m 18s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 13m 08s)

2017-02-18

  • 18:00 reedy@tin: Synchronized wmf-config/CommonSettings.php: Rv reservedusernames addition from CS (duration: 00m 42s)
  • 17:59 reedy@tin: Synchronized php-1.29.0-wmf.12/includes/DefaultSettings.php: Unknown user to reserved usernames in defaultsettings (duration: 00m 45s)
  • 17:29 reedy@tin: Synchronized wmf-config/CommonSettings.php: Add Unknown user to reservedusernames (duration: 00m 48s)
  • 02:23 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Feb 18 02:23:52 UTC 2017 (duration 5m 20s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 06m 41s)

2017-02-17

  • 21:34 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic1023.eqiad.wmnet
  • 19:20 gehel: upgrading maps-test2004 to nodejs6 for testing - T150354
  • 18:34 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(21|22|24).eqiad.wmnet
  • 17:17 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(21|22|23|24).eqiad.wmnet
  • 17:17 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(17|18|19|20).eqiad.wmnet
  • 16:18 ejegg: updated 3DS rules for DK, SE, and NO
  • 16:15 bblack: restarting cp1074 varnish backend (cron due in 24h, but mb lag looks pretty bad)
  • 15:26 urandom: T155120: Restarting Cassandra on restbase1007-a.eqiad.wmnet to disable Prometheus exporter agent
  • 14:48 _joe_: uploaded clustershell 1.7.3, tqdm, pyparsing to jessie-wikimedia in preparation for cumin
  • 13:47 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2070 - T156478 (duration: 00m 45s)
  • 12:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2070 - T156478 (duration: 00m 41s)
  • 12:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2070 - T156478 (duration: 00m 41s)
  • 11:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Clean up db1028 old comments - T153300 (duration: 00m 41s)
  • 11:11 moritzm: restarting nginx on sodium (mirrors.wikimedia.org) to pick up openssl update
  • 10:01 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(17|18|19|20).eqiad.wmnet
  • 09:54 moritzm: rolling restart of nginx on mediawiki servers in codfw to pick up openssl update
  • 09:33 moritzm: rolling restart of nginx on mw canaries to pick up openssl update
  • 09:32 gehel: upgrade nginx on elasticsearch codfw for ssl upgrade
  • 09:30 gehel: upgrade nginx on elastic1049-1052 for ssl upgrade
  • 09:22 moritzm: restarting nginx on install1002/2002 to pick up new openssl
  • 08:35 moritzm: upgrading mw1262-mw1265 to HHVM 3.12.14
  • 08:35 godog: restart nginx on prometheus in eqiad/codfw to pick up openssl update
  • 08:19 moritzm: restarted nginx/prometheus in esams/ulsfo to pick up openssl update
  • 08:00 moritzm: installing openssl 1.1.0e updates
  • 07:58 moritzm: upgrading mw1261 to HHVM 3.12.14
  • 07:41 moritzm: installing spice security updates
  • 06:59 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2070 - T156478 (duration: 00m 48s)
  • 02:33 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Feb 17 02:33:21 UTC 2017 (duration 5m 32s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 06m 49s)

2017-02-16

  • 23:18 maxsem@tin: Finished scap: Another time, just to make sure some files synced because last time there were some mid-air collisions (duration: 15m 44s)
  • 23:02 maxsem@tin: Started scap: Another time, just to make sure some files synced because last time there were some mid-air collisions
  • 22:58 maxsem@tin: Finished scap: Update messages for https://gerrit.wikimedia.org/r/#/c/338013/ (duration: 24m 29s)
  • 22:46 mutante: tin - apt-get clean - 4.6G avail (T158359)
  • 22:33 maxsem@tin: Started scap: Update messages for https://gerrit.wikimedia.org/r/#/c/338013/
  • 22:32 maxsem@tin: Synchronized php-1.29.0-wmf.12/extensions/JsonConfig/: https://gerrit.wikimedia.org/r/#/c/338013/ (duration: 00m 42s)
  • 22:31 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/338208/ (duration: 00m 53s)
  • 22:17 XenoRyet: updated paymentswiki from 4466b9d to 2a0c3b2
  • 22:04 mutante: phab2001 - start/stop phd, testing gerrit 338163
  • 21:39 Reedy: make that 2017
  • 21:39 Reedy: Deleted around 9500 pre 2013 captchas
  • 21:08 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.12
  • 20:13 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.12
  • 19:13 chasemp: clean out /var/log/ on labnet1001 as it filled up
  • 19:12 gehel: restarting kartotherian / tilerator on maps-test*
  • 19:07 chasemp: bump up nodepool allocated fixed ips set (I think it exhausted them errantly somehow?)
  • 18:52 chasemp: clean out nodepool instances
  • 18:50 chasemp: stop nodepool to reset state on pool
  • 18:03 halfak@tin: Started deploy [ores/deploy@e9bbda3]: (no justification provided)
  • 18:03 halfak: deploying ores:e9bbda3
  • 17:08 hashar: reenable puppet on contint1001
  • 17:03 hashar: stopped puppet on contint1001 for https://gerrit.wikimedia.org/r/#/c/336978/
  • 16:30 moritzm: uploaded HHVM 3.12.14 to apt.wikimedia.org
  • 16:25 jynus: SET GLOBAL thread_pool_size=64; on db1074's mariadb
  • 16:01 moritzm: upgrading mwdebug1002 to HHVM 3.12.14
  • 15:53 moritzm: upgrading mwdebug1001 to HHVM 3.12.14
  • 14:27 moritzm: uploaded openssl 1.1.0e to apt.wikimedia.org
  • 13:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2070 IP as it goes to another rack - T156478 (duration: 00m 41s)
  • 13:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2070 IP as it goes to another rack - T156478 (duration: 00m 56s)
  • 13:19 marostegui: Shutdown db2070 for maintenance - T156478
  • 10:18 godog: roll-restart hhvm in eqiad to pick up fluorine -> mwlog1001 changes - T123728
  • 10:17 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(48|49|50|51|52).eqiad.wmnet
  • 10:15 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(48|49|50|51|52).codfw.wmnet
  • 09:39 moritzm: installing libgc security updates on trusty systems
  • 09:33 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2070 - T156478 (duration: 00m 41s)
  • 08:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original db1082 weight - T158188 (duration: 00m 41s)
  • 08:39 godog: roll-restart jobrunner in codfw/eqiad to pick up fluorine -> mwlog1001 redis change - T123728
  • 07:54 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2060 - T156161 (duration: 00m 44s)
  • 07:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase load db1082 - T158188 (duration: 00m 42s)
  • 03:11 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Feb 16 03:11:01 UTC 2017 (duration 5m 42s)
  • 03:05 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 14m 27s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 11m 46s)
  • 00:59 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 with low load (duration: 00m 41s)
  • 00:49 eileen1: Update CiviCRM from 1ffc090 to 20660c4
  • 00:12 maxsem@tin: Synchronized php-1.29.0-wmf.12/extensions/Gadgets: https://gerrit.wikimedia.org/r/#/c/338004/ (duration: 00m 42s)

2017-02-15

  • 22:59 eileen1: update CiviCRM from da6ba1b to 1ffc090
  • 22:55 thcipriani@tin: Synchronized php-1.29.0-wmf.12/includes/libs/rdbms/ChronologyProtector.php: Add version to ChronologyProtector key T158217 (duration: 00m 41s)
  • 22:52 demon@tin: Synchronized php-1.29.0-wmf.12/extensions/Dashiki: prep-type stuff (duration: 00m 50s)
  • 21:41 ejegg: updated civicrm from 0fb289f to da6ba1b
  • 20:06 mutante: contint1001 - logrotate --force /etc/logrotate.d/jenkins to test gerrit:337383
  • 19:50 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Popups by default on se.wikimedia (T68374) (duration: 00m 41s)
  • 18:59 demon@tin: Synchronized multiversion/submodules.json: no-op (duration: 00m 50s)
  • 18:30 marostegui: Stop MySQL and shutdown db2062 for maintenance - T156478
  • 17:58 jynus: stopping labsdb1005 mariadb + puppet in preparation for reimage
  • 17:41 thcipriani@tin: Synchronized php-1.29.0-wmf.11/includes/libs/rdbms/ChronologyProtector.php: init() use instanceof instead of empty() T158127 (duration: 00m 41s)
  • 17:17 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: Group0 to 1.29.0-wmf.12 T155527
  • 17:08 thcipriani@tin: Synchronized php-1.29.0-wmf.12/includes/libs/rdbms/ChronologyProtector.php: init() use instanceof instead of empty() T158127 (duration: 00m 43s)
  • 17:05 thcipriani: starting wmf.12 to group0
  • 16:39 godog: flip xenon redis and apache from fluorine to mwlog1001 - T123728
  • 16:23 Jeff_Green: authdns-update to deploy fundraising host rename db1008->frav1001
  • 16:16 urandom: T155120: restarting Cassandra on restbase1007-a to enable Prometheus exporter (canary)
  • 16:01 dcausse@tin: Synchronized php-1.29.0-wmf.11/extensions/CirrusSearch/: T156234: Fold some problematic whitespaces with completion (duration: 01m 01s)
  • 15:57 marostegui: (Old action but for the sake of getting it logged) Force RAID controller to work on WriteBack even with the broken BBU it has now on db1060 so it can keep up with the replication thread - T158194
  • 15:51 filippo@tin: Synchronized wmf-config/StartProfiler.php: Switch xenon redis to mwlog1001.eqiad.wmnet (duration: 00m 42s)
  • 15:44 hashar: Zuul reducing gate-and-submit minimum amount of changes to process from the wrong 12 down to 2. In case of repeating failures it would end up running jobs for only two changes, which would prevent cancelling jobs for up to 11 changes!
  • 15:37 jynus: stopping slave and repartitioning db1045
  • 14:45 jynus: offlined 2 disks with media + other errors on db1060
  • 14:34 marostegui: Stop MySQL and shutdown db1082 - T158188
  • 14:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 (duration: 00m 44s)
  • 14:04 hashar@tin: Synchronized php-1.29.0-wmf.12/extensions/CirrusSearch/includes/Maintenance/SuggesterAnalysisConfigBuilder.php: Fold some problematic whitespaces with completion - T156234 (duration: 00m 48s)
  • 13:59 elukey: disabled mod_deflate on bohrium (piwik) and disabled puppet. Testing 503 reduction.
  • 13:16 dereckson@tin: Synchronized wmf-config/throttle.php: Throttle rule for Royal College of Nursing event (T158171) (duration: 00m 43s)
  • 12:56 elukey: restart of jmxtrans on all the analytics kafka brokers
  • 12:33 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1045 (duration: 00m 42s)
  • 11:38 marostegui: Running pt-table-checksum on db1043 (m3 - phabricator master) - T154485
  • 11:35 bblack@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp1067.eqiad.wmnet
  • 11:20 godog: upgrade git on tin/mira - T140927
  • 10:59 moritzm: installing PHP security updates on californium (running horizon)
  • 10:49 moritzm: installing PHP security updates on uranium (running ganglia)
  • 10:46 moritzm: installing PHP security updates on silver (running wikitech)
  • 08:21 moritzm: installing PHP security updates on Ubuntu systems
  • 07:33 marostegui: Deploy alter table on x1 master (db1031) for the echo_notification tables - T136428
  • 07:27 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2062 - T156478 (duration: 00m 42s)
  • 02:40 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Feb 15 02:40:31 UTC 2017 (duration 5m 23s)
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 12m 50s)
  • 00:22 ebernhardson@tin: Synchronized php-1.29.0-wmf.11/extensions/CirrusSearch/: Provide per-index settings from configuration for elasticsearch 5 (duration: 00m 55s)
  • 00:16 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: Configure cirrus per-index settings for elasticsearch 5 (duration: 00m 43s)
  • 00:07 ebernhardson@tin: Synchronized php-1.29.0-wmf.11/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: (no justification provided) (duration: 00m 50s)

2017-02-14

  • 22:23 thcipriani@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Cleanup popups beta cluster config (beta-only-change) (duration: 00m 41s)
  • 22:15 chasemp: start staged nova-fullstack testing daemon on labnet1002 for metric inspection
  • 22:11 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.11 for T158127
  • 22:07 otto@tin: Finished deploy [analytics/refinery@4cd6305]: Deploying refinery with another update to drop hourly partitions script (duration: 01m 13s)
  • 22:06 otto@tin: Started deploy [analytics/refinery@4cd6305]: Deploying refinery with another update to drop hourly partitions script
  • 22:05 otto@tin: Finished deploy [analytics/refinery@4cd6305]: Deploying refinery with another update to drop hourly partitions script (duration: 01m 41s)
  • 22:03 otto@tin: Started deploy [analytics/refinery@4cd6305]: Deploying refinery with another update to drop hourly partitions script
  • 22:01 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.12
  • 21:47 thcipriani@tin: Synchronized php-1.29.0-wmf.11/includes/libs/rdbms/loadbalancer/LoadBalancer.php: doWait() (duration: 00m 50s)
  • 21:10 thcipriani@tin: Finished scap: testwiki to php-1.29.0-wmf.12 and rebuild l10n cache (wikiversions.json not updated previously) (duration: 09m 25s)
  • 21:00 thcipriani@tin: Started scap: testwiki to php-1.29.0-wmf.12 and rebuild l10n cache (wikiversions.json not updated previously)
  • 20:51 thcipriani@tin: Finished scap: testwiki to php-1.29.0-wmf.12 and rebuild l10n cache (duration: 48m 04s)
  • 20:38 eileen1: update civicrm from 8d04e75 to 0fb289f
  • 20:36 otto@tin: Finished deploy [analytics/refinery@67c3924]: Deploying refinery with update to drop hourly partitions script (duration: 02m 25s)
  • 20:33 otto@tin: Started deploy [analytics/refinery@67c3924]: Deploying refinery with update to drop hourly partitions script
  • 20:22 Dereckson: Update site statistics for pam.wikipedia (T158110, now 454 images)
  • 20:03 thcipriani@tin: Started scap: testwiki to php-1.29.0-wmf.12 and rebuild l10n cache
  • 19:23 krinkle@tin: Synchronized docroot/noc/conf: I67194f (duration: 00m 48s)
  • 19:22 krinkle@tin: Synchronized dblists/: I67194f (duration: 01m 37s)
  • 18:39 arlolra: Updated Parsoid to 79ccfb93 (T58381, T108216)
  • 18:28 thcipriani: starting branch cut for 1.29.0-wmf.12
  • 18:27 arlolra@tin: Finished deploy [parsoid/deploy@1bfb86b]: Updating Parsoid to 79ccfb93 (duration: 09m 58s)
  • 18:17 arlolra@tin: Started deploy [parsoid/deploy@1bfb86b]: Updating Parsoid to 79ccfb93
  • 18:11 MaxSem: Purged https://he.wikipedia.org/static/images/mobile/copyright/wikipedia-wordmark-he.svg with purgeList.php
  • 17:34 bblack@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp1067.eqiad.wmnet
  • 16:18 andrewbogott: rebooting californium
  • 16:15 andrewbogott: dist-upgrade californium (as part of the liberty->mitaka upgrade)
  • 16:06 Niharika: Updated wikimania app to 5c44d06 Removed stale translations
  • 16:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2062 IP after its move to another rack - T156478 (duration: 00m 39s)
  • 16:01 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2062 IP after its move to another rack - T156478 (duration: 00m 40s)
  • 15:11 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Allow sysops to add/revoke account creator on it.wikiversity (T158062) (duration: 00m 41s)
  • 15:05 marostegui: Shutdown mysql (and later the whole host) on db2062 for maintenance - T156478
  • 15:04 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2062 to change its rack - T156478 (duration: 00m 41s)
  • 15:01 ema: lvs10*: upgrade to pybal 1.13.5 T147425
  • 14:48 moritzm: installing php security updates on einsteinium
  • 14:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073,89,90,91,92 (duration: 00m 41s)
  • 14:38 ema: lvs300[12]: upgrade to jessie 8.7, pybal 1.13.5, reboot into kernel 4.4.2-3+wmf8 T155401 T147425
  • 14:35 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073,89,90,91,92 (duration: 00m 40s)
  • 14:20 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055,72,83,56,84,76,78,87,71 (duration: 00m 40s)
  • 14:17 hashar: European SWAT is complete
  • 14:16 hashar@tin: Synchronized php-1.29.0-wmf.11/extensions/CirrusSearch/profiles/SimilarityProfiles.php: Explicitly use BM25 as default for wmf_defaults similarity profile (duration: 00m 47s)
  • 14:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055,72,83,56,84,76,78,87,71 (duration: 00m 41s)
  • 14:11 ema: lvs300[34]: upgrade to jessie 8.7, pybal 1.13.5, reboot into kernel 4.4.2-3+wmf8 T155401 T147425
  • 14:09 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TwoColConflict on dewiki - T155721 (duration: 00m 40s)
  • 14:06 hashar@tin: Synchronized wmf-config/throttle.php: Throttle rule for cswiki - T158040 (duration: 00m 40s)
  • 13:50 ema: lvs400[12]: upgrade to jessie 8.7, pybal 1.13.5, reboot into kernel 4.4.2-3+wmf8 T155401 T147425
  • 13:32 ema: lvs400[34]: upgrade to jessie 8.7, pybal 1.13.5, reboot into kernel 4.4.2-3+wmf8 T155401 T147425
  • 13:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051,66,80,74,77,56,81,70,82 (duration: 00m 44s)
  • 13:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051,66,80,74,77,56,81,70,82 (duration: 00m 40s)
  • 12:59 jynus: reloading/restarting gerrit on cobalt, too slow
  • 11:57 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2221.codfw.wmnet
  • 11:30 moritzm: manual fix up of exim spool permissions on krypton (used to run the heavy exim variant)
  • 11:28 jynus: performing schema change on all mariadb servers T150474
  • 10:51 ema: lvs200[123]: upgrade to jessie 8.7, pybal 1.13.5, reboot into kernel 4.4.2-3+wmf8 T155401 T147425
  • 10:48 moritzm: installing tomcat security updates
  • 10:23 ema: lvs200[456]: upgrade to jessie 8.7, pybal 1.13.5, reboot into kernel 4.4.2-3+wmf8 T155401 T147425
  • 10:18 ema: uploading pybal 1.13.5 to apt.w.o T147425
  • 09:05 moritzm: upgrading firejail on sca cluster
  • 08:45 moritzm: installing vim security updates
  • 08:33 moritzm: restarting zookeeper on conf1003
  • 08:23 moritzm: restarting zookeeper on conf1002 to pick up OpenJDK update (restarts were stopped yesterday to further investigate gc behaviour)
  • 06:56 marostegui: Deploy alter table on x1 echo_notification tables - T136428
  • 02:40 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Feb 14 02:40:22 UTC 2017 (duration 5m 19s)
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 11m 50s)
  • 00:40 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/337442/ (duration: 00m 40s)
  • 00:39 eileen1: update civicrm from 7b36996 to 8d04e75
  • 00:38 maxsem@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-he.svg: https://gerrit.wikimedia.org/r/#/c/337442/ (duration: 00m 40s)
  • 00:22 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/337441/ (duration: 00m 40s)
  • 00:21 maxsem@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-hi.svg: https://gerrit.wikimedia.org/r/#/c/337441/ (duration: 00m 42s)
  • 00:16 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/337064/ (duration: 00m 48s)

2017-02-13

  • 23:45 Pchelolo: update RESTBase to 0e9106ab8
  • 23:35 Pchelolo: update RESTBase to 0e9106ab8 - canary on restbase1007
  • 23:29 Pchelolo: update RESTBase to 0e9106ab8 - staging
  • 23:27 bsitzmann@tin: Finished deploy [mobileapps/deploy@cd3b897]: Update mobileapps to 776211b (duration: 03m 19s)
  • 23:24 bsitzmann@tin: Started deploy [mobileapps/deploy@cd3b897]: Update mobileapps to 776211b
  • 22:39 Pchelolo: rollback RESTBase to ea980cc5d - staging
  • 22:26 Pchelolo: update RESTBase to 0e9106ab8 - staging
  • 22:14 kaldari@tin: Synchronized dblists/: Turning on Echo for loginwiki (duration: 00m 41s)
  • 22:14 kaldari: scap sync-dir dblists/ 'Turning on Echo for loginwiki'
  • 21:43 bsitzmann@tin: Finished deploy [mobileapps/deploy@f6b4435]: Update mobileapps to 3af473f (duration: 03m 44s)
  • 21:39 bsitzmann@tin: Started deploy [mobileapps/deploy@f6b4435]: Update mobileapps to 3af473f
  • 20:51 mutante: carbon/install - adjusted Letsencrypt cert creation, deactivated reprepro to protect from accidental use, switching rsync direction from install1002->install2002, disabled cron on carbon (T132757)
  • 20:33 Reedy: dropped old out of date echo tables from extension1.loginwiki T157105
  • 19:38 mutante: carbon - synced /srv/ data to install1002/2002 for the last time, switching apt.wikimedia.org CNAME to install1002 - carbon deprecated (T132757)
  • 19:22 legoktm: running namespaceDupes.php on fiwiki (T103786)
  • 19:11 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: properly set wgCirrusSearchUseIcuFolding T155515 (duration: 00m 41s)
  • 19:00 moritzm: upgrading mw1236 with the security updates it missed while it was powered off
  • 18:33 chasemp: labstore1005 service maintain-dbusers restart
  • 18:18 mutante: scandium - shutdown -h now (T150936)
  • 18:05 mutante: scandium - ex-zuul merger - removing from puppet, revoking puppet cert, salt key..
  • 16:08 ema: lvs1003: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 15:58 ema: lvs1002: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 15:52 gehel: copy old blazegraph metrics to new path (wikidata.query.(triple|lag).* -> servers.<server_name>...)
  • 15:36 ema: lvs1001: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 15:18 moritzm: switched krypton to exim4-daemon-light (the -heavy variant was installed from an earlier role it carried)
  • 15:06 addshore: EU SWAT done
  • 15:06 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T101634 Update Wikiquote talk namespace in Sanskrit Wikisource and Support legacy Wikiquote talk namespace in Sanskrit Wikisource (duration: 00m 40s)
  • 14:37 addshore@tin: Synchronized portals: Updating portal stats Gerrit (duration: 00m 40s)
  • 14:36 addshore@tin: Synchronized portals/prod/wikipedia.org/assets: Updating portal stats Gerrit (duration: 00m 40s)
  • 14:28 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155721 Enable TwoColConflict on metawiki (duration: 00m 40s)
  • 14:23 addshore@tin: Synchronized php-1.29.0-wmf.11/extensions/TwoColConflict/includes/TwoColConflictHooks.php: gerrit:337186 Change beta feature info and talk links (duration: 00m 40s)
  • 14:19 addshore@tin: Synchronized php-1.29.0-wmf.11/extensions/RevisionSlider/modules/ext.RevisionSlider.lazy.css: T157800 Dont set min-height and min-width for oo-ui buttons 2/2 (duration: 00m 55s)
  • 14:18 addshore@tin: Synchronized php-1.29.0-wmf.11/extensions/RevisionSlider/modules/ext.RevisionSlider.css: T157800 Dont set min-height and min-width for oo-ui buttons 1/2 (duration: 01m 07s)
  • 13:57 moritzm: rolling restart of zookeeper in eqiad to pick up Java security updates
  • 13:57 marostegui: Shutdown db2060 for maintenance - T156161
  • 13:47 ema: lvs100[56]: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 13:37 ema: lvs1004: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 13:33 moritzm: rolling restart of zookeeper in codfw to pick up Java security updates
  • 12:32 elukey: updating elastic search ACLs on cr1/cr2 for the analytics-ip4 filter
  • 11:59 moritzm: removing unneeded PHP packages from mw1261-mw1265 (these were installed before we changed puppet to trim most PHP packages in favour of HHVM)
  • 11:21 moritzm: installing PHP security updates
  • 11:18 elukey: stopped ircecho on einsteinium
  • 11:11 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1002.eqiad.wmnet
  • 10:47 godog: remove big/spammy log files from thumbor100[12] - T157949
  • 10:35 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1001.eqiad.wmnet
  • 09:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 - T136428 (duration: 00m 41s)
  • 09:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 - T136428 (duration: 00m 40s)
  • 09:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077 - T136428 (duration: 00m 40s)
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 - T136428 (duration: 00m 40s)
  • 09:00 marostegui: Deploy alter table s3 officewiki and mediawikiwiki for echo_notification tables on eqiad - T136428
  • 08:10 elukey: removed empty log files from elastic1022,1024,2001,1026,1040 to fix logrotate cronspam
  • 07:55 moritzm: upgrade HHVM on remaining mw servers in eqiad
  • 07:03 marostegui: Compressing commonswiki tables on labsdb1010 and labsdb1011 - T153743
  • 02:38 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Feb 13 02:38:43 UTC 2017 (duration 5m 18s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 13m 18s)

2017-02-12

  • 15:18 reedy@tin: Synchronized php-1.29.0-wmf.11/extensions/ConfirmEdit/maintenance: Instrumentation to script (duration: 00m 41s)
  • 02:24 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Feb 12 02:24:56 UTC 2017 (duration 5m 20s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 07m 12s)

2017-02-11

  • 10:02 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(33|34|35|36).codfw.wmnet
  • 09:53 elukey: mw1236 back in production (scap pull executed before pooled=yes) - T156610
  • 09:52 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1236.eqiad.wmnet
  • 09:35 elukey: rebooting mw1236 to make sure that it comes up cleanly - T156610
  • 09:15 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(33|34|35|36).codfw.wmnet
  • 09:09 gehel: cleanup logs on elastic20(01|25) - T139043
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Feb 11 02:37:10 UTC 2017 (duration 5m 19s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 11m 31s)
  • 00:00 mutante: switching apt.wikimedia.org from carbon to install1002 - there might be a short time until the LE SSL cert is also adjusted

2017-02-10

  • 23:41 RainbowSprinkles: gerrit: Restarting to pick up config changes
  • 23:12 RainbowSprinkles: gerrit: restarting service
  • 22:42 godog: start rsync of whisper metrics graphite2001 -> graphite1001 - T157022
  • 22:29 mutante: carbon - stopping puppet and most services, adding deprecation warning to motd, rsyncing data one last time (T132757)
  • 22:17 mutante: install1001 - shutdown ganeti instance and deleting it and its disk (T132757)
  • 21:27 mutante: install1001, install2001 - removed from Icinga, shutting down (T84380, T132757)
  • 21:18 mutante: install1001, install2001 - revoke puppet certs, puppet node deactivate, delete salt keys (T84380, T132757)
  • 21:03 ladsgroup@tin: Synchronized php-1.29.0-wmf.11/extensions/Nuke/Nuke_body.php: gerrit:337076 Fixing Special:Nuke (T156112, T156949, T156314) (duration: 00m 58s)
  • 21:03 Amir1: ladsgroup@tin:/srv/mediawiki-staging$ scap sync-file php-1.29.0-wmf.11/extensions/Nuke/Nuke_body.php 'gerrit:337076 Fixing Special:Nuke (T156112, T156949, T156314)'
  • 20:38 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(29|30|31|32).codfw.wmnet
  • 20:11 godog: silence graphite1001 for ssd reinstall - T157022
  • 18:27 brion: brion running throttled version of requeueTranscodes.php for low-res transcodes. expect increased load on video scalers but should remain responsive.
  • 18:23 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(29|30|31|32).codfw.wmnet
  • 18:08 jynus: reenabling delayed replication for dbstore2001 T130128
  • 17:57 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(27|28).codfw.wmnet
  • 16:15 mutante: scandium - stopping zuul-merger service (T150936)
  • 15:15 jynus: temporarily disabling mariadb replication lag checks to deploy new version of the icinga check script
  • 15:09 godog: bounce cassandra-a on xenon after https://gerrit.wikimedia.org/r/335826
  • 14:26 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(25|26).codfw.wmnet
  • 12:33 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,service=nginx,name=mw1227.eqiad.wmnet
  • 12:32 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,service=nginx,name=mw1228.eqiad.wmnet
  • 12:32 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,service=nginx,name=mw1229.eqiad.wmnet
  • 12:11 elukey: updating firewall rules for analytics on cr1/cr2
  • 12:00 godog: bounce mwerrors on eventlog1001 to pick up statsd cname change - T157022
  • 11:46 godog: roll-restart tileratorui in codfw/eqiad to pick up new statsd.eqiad.wmnet - T157022
  • 11:41 godog: roll-restart trendingedits on scb in eqiad/codfw to pick up new statsd.eqiad.wmnet - T157022
  • 11:36 godog: roll-restart mathoid/citoid/mobileapps/cxserver/eventstreams/graphoid on scb in eqiad/codfw to pick up new statsd.eqiad.wmnet - T157022
  • 11:30 godog: roll-restart changeprop on scb in eqiad/codfw to pick up new statsd.eqiad.wmnet - T157022
  • 11:25 godog: roll-restart nodepool on labnodepool1001 to pick up new statsd.eqiad.wmnet - T157022
  • 11:23 godog: roll-restart parsoid on ruthenium to pick up new statsd.eqiad.wmnet - T157022
  • 11:19 godog: roll-restart jmxtrans on conf* in codfw/eqiad to pick up new statsd.eqiad.wmnet - T157022
  • 11:16 godog: roll-restart tilerator in codfw/eqiad to pick up new statsd.eqiad.wmnet - T157022
  • 11:10 godog: restart navtiming ve asset-check statsd-mw-js-deprecate on hafnium to pick up statsd.eqiad.wmnet change - T157022
  • 10:54 godog: roll-restart kartotherian in codfw/eqiad to pick up new statsd.eqiad.wmnet - T157022
  • 10:39 godog: roll-restart parsoid in codfw/eqiad to pick up new statsd.eqiad.wmnet - T157022
  • 10:37 elukey: roll-restart of aqs to pick up new statsd.eqiad.wmnet - T157022
  • 10:34 godog: roll-restart ocg to pick up new statsd.eqiad.wmnet - T157022
  • 10:30 godog: roll-restart restbase in eqiad to pick up new statsd.eqiad.wmnet - T157022
  • 10:20 godog: restart of jmxtrans on analytics by elukey - T157022
  • 10:10 ema: lvs1007-10: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 10:06 godog: roll-restart restbase in codfw to pick up new statsd.eqiad.wmnet - T157022
  • 10:03 hashar: rebooting contint2001
  • 09:51 hashar: Reenabling puppet and zuul-merger on contint1001 and contint2001. The git-daemon is running now T140297 T150936. The 'systemctl status git-daemon' thought that the service was running when it was not (filed T157785)
  • 09:26 hashar: stopped zuul-merger process on contint1001 and contint2001. They lack the git-daemon service to expose the merges.
  • 08:45 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(25|26|27|28).codfw.wmnet
  • 08:44 moritzm: upgrading hhvm on mw1200-mw1229
  • 08:41 elukey: restarting kafka mirror maker and jmxtrans of kafka[12]00[123] for java security upgrades
  • 08:30 marostegui: Deploy alter table s3 officewiki.echo_notification and mediawikiwiki.echo_notification tables only on codfw - T136428
  • 04:06 mutante: ganglia - switching aggregators from 1001 to 1002 and 2001 to 2002, there might be minor gaps in the graphs, but hey, it's deprecated anyways
  • 03:48 mutante: icinga - live hack fixing config - due to partially removed decom hosts mc2001-mc2016
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Feb 10 02:37:59 UTC 2017 (duration 5m 26s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 11m 53s)
  • 01:48 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: Setup sister search prefix display types T149806 (duration: 00m 40s)
  • 01:43 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: Setup sister search prefix display types T149806 (duration: 00m 48s)
  • 01:01 brion: transcode queue back to normal
  • 00:53 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgPageAssessmentsSubprojects to true on English Wikipedia T157654 (duration: 00m 43s)
  • 00:53 brion: transcode high-prio queue may be briefly blocked by an influx of low-res transcodes queued in bulk. should return to normal in a bit.
  • 00:32 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/WikimediaEvents: SWAT: Enable Sister project search AB test T149806 (duration: 00m 45s)

2017-02-09

  • 22:58 bblack@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2006.codfw.wmnet
  • 22:57 bblack: cp2006: unresponsive control, powercycled from racadm, normal boot, no evidence in logs - repooling for now
  • 22:48 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.11
  • 22:44 thcipriani@tin: Synchronized php-1.29.0-wmf.11/skins/CologneBlue/SkinCologneBlue.php: Revert "Remove warning suppression" (duration: 00m 59s)
  • 22:42 bblack: cp2006 depooled due to icinga report of host-down
  • 22:42 bblack@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2006.codfw.wmnet
  • 22:02 brion: brion running tests of requeueTranscodes.php on terbium to restart subsets of video scaler work
  • 21:54 mutante: cp3014,cp3020,cp3022 - puppet node deactivate - cp3020 delete salt key (T130883)
  • 20:45 thcipriani@tin: Synchronized php-1.29.0-wmf.11/skins/CologneBlue/SkinCologneBlue.php: Fix a bunch of undefined indexes T157619 (sync actual skin file) (duration: 00m 40s)
  • 20:41 thcipriani@tin: Synchronized php-1.29.0-wmf.11/skins/CologneBlue/CologneBlue.php: Fix a bunch of undefined indexes T157619 (duration: 00m 41s)
  • 20:24 otto@tin: Finished deploy [analytics/refinery@9e689f3]: (no justification provided) (duration: 00m 20s)
  • 20:23 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:23 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:21 otto@tin: Finished deploy [analytics/refinery@9e689f3]: (no justification provided) (duration: 00m 04s)
  • 20:21 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:21 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:20 otto@tin: Finished deploy [analytics/refinery@9e689f3]: (no justification provided) (duration: 00m 07s)
  • 20:20 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:20 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:20 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:19 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:05 thcipriani@tin: Synchronized php-1.29.0-wmf.11/resources/src/mediawiki.special/mediawiki.special.search.interwikiwidget.styles.less: SWAT: Temporary hax to hide cawiki hacked in search sidebar T149806 (duration: 00m 40s)
  • 20:02 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/ConfirmEdit: SWAT: Add script for counting captchas Use an accurate number of captchas (duration: 00m 43s)
  • 19:57 ottomata: restarting main kafka brokers in codfw and then eqiad to pick up jvm updates
  • 19:50 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/TimedMediaHandler: SWAT: TMH job queue split into low and high priority PART III T155098 (duration: 00m 44s)
  • 19:49 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/TimedMediaHandler/TimedMediaHandler.hooks.php: SWAT: TMH job queue split into low and high priority PART II T155098 (duration: 00m 40s)
  • 19:47 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/TimedMediaHandler/TimedMediaHandler.php: SWAT: TMH job queue split into low and high priority PART I T155098 (duration: 00m 41s)
  • 19:37 smalyshev@tin: Finished deploy [wdqs/wdqs@1a7cd32]: Deploy new WAR build on 1003 (duration: 00m 16s)
  • 19:37 smalyshev@tin: Started deploy [wdqs/wdqs@1a7cd32]: Deploy new WAR build on 1003
  • 19:34 smalyshev@tin: Finished deploy [wdqs/wdqs@1a7cd32]: Deploy new WAR build on 2003 (duration: 00m 26s)
  • 19:34 smalyshev@tin: Started deploy [wdqs/wdqs@1a7cd32]: Deploy new WAR build on 2003
  • 19:29 ladsgroup@tin: Started deploy [ores/deploy@10fa16b]: (no justification provided)
  • 19:28 ladsgroup@tin: Started deploy [ores/deploy@a3a410b]: (no justification provided)
  • 19:26 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/TimedMediaHandler/SpecialTimedMediaHandler.php: SWAT: Only load necessary fields on Special:TimedMediaHandler lists (T157621) (duration: 00m 41s)
  • 19:19 ladsgroup@tin: Started deploy [ores/deploy@a3a410b]: (no justification provided)
  • 19:17 smalyshev@tin: Started deploy [wdqs/wdqs@1a7cd32]: Deploy new WAR build
  • 19:15 smalyshev@tin: Started deploy [wdqs/wdqs@1a7cd32]: (no justification provided)
  • 19:14 bd808: Restarted logstash on logstash1001. Dead since 2017-02-09T06:39:46 with "java.lang.UnsupportedOperationException" crash in worker thread.
  • 19:13 smalyshev@tin: Started deploy [wdqs/wdqs@1a7cd32]: (no justification provided)
  • 19:08 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Setting $wgPageAssessmentsSubprojects to true on testwiki T157654 (duration: 00m 54s)
  • 19:05 ladsgroup@tin: Started deploy [ores/deploy@4fdaf7d]: (no justification provided)
  • 18:50 godog: test bouncing jmxtrans on kafka1012 to pick up statsd changes
  • 18:35 godog: bounce zuul to pick up statsd DNS change - T157022
  • 18:34 ladsgroup@tin: Finished deploy [ores/deploy@e27e845]: (no justification provided) (duration: 04m 33s)
  • 18:30 ladsgroup@tin: Started deploy [ores/deploy@e27e845]: (no justification provided)
  • 18:29 Amir1: starting deploy of ores:e27e845 to canary node
  • 17:55 elukey: proactively restarted statsv on hafnium after the kafka broker restarts
  • 17:42 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(18|21|22|23|24).codfw.wmnet
  • 16:38 godog: flip dns records for statsd/carbon to graphite2001 - T157022
  • 16:37 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2040 (duration: 00m 52s)
  • 16:22 ema: lvs1011: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 16:17 marostegui: Shutdown db2060 for maintenance - T156161
  • 16:15 marostegui: Compressing commonswiki on labsdb1009 - T153743
  • 16:08 ema: lvs1012: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 16:06 jynus: rolling restart of replication threads for dbstore1002/2001/2002 T111654
  • 15:49 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1009.eqiad.wmnet
  • 15:42 godog: roll-restart diamond to pick up graphite2001 changes
  • 15:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 - T136428 (duration: 00m 44s)
  • 15:23 ema: shutdown cp3020 T130883
  • 15:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 - T136428 (duration: 00m 40s)
  • 15:19 elukey: restarting all Analytics Kafka brokers for Java security upgrades
  • 15:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T136428 (duration: 00m 40s)
  • 15:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T136428 (duration: 00m 43s)
  • 14:53 moritzm: upgrading hhvm on mw1189-mw1199 and mw1293/mw1294
  • 14:48 godog: move diamond traffic to graphite2001 - T157022
  • 14:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1062 - T136428 (duration: 00m 41s)
  • 14:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1062 - T136428 (duration: 00m 45s)
  • 14:01 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(21|22|23|24).codfw.wmnet
  • 13:45 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2020.codfw.wmnet
  • 13:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1028 - T153300 (duration: 00m 41s)
  • 13:11 moritzm: upgrading firejail on sca cluster
  • 12:52 gehel: killing salt runs stuck on failing reimage of elastic2018
  • 12:37 mforns@tin: Finished deploy [analytics/refinery@9e689f3]: (no justification provided) (duration: 03m 05s)
  • 12:34 mforns@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 11:56 moritzm: upgrading hhvm on mw1170-mw1188 (also effecting updates of openssl, libgd, lcms, gnutls, sqlite, libxpm and glibc)
  • 11:39 gehel: failed reimage on elastic201[89], restarting
  • 10:54 moritzm: deploy exim and openssh bugfix updates from jessie point release
  • 10:51 moritzm: upgrading java on kafka clusters and druid
  • 10:49 elukey: restarting Java daemons on druid100[123] for security upgrades
  • 10:42 jynus: preparing to reimage db2040 T111654
  • 10:37 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2040 (duration: 00m 40s)
  • 10:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1034 - T111654 (duration: 00m 41s)
  • 10:09 hashar: Restarted Jenkins on contint1001
  • 10:04 hashar: Running package upgrades on contint2001
  • 10:03 elukey: restore Hadoop master to an1001
  • 09:57 elukey: failover Hadoop masters from an1001 to an1002 to allow Java upgrades
  • 09:52 gehel: cleaning up logs on elastic20(01|16) - T139043
  • 09:50 elukey: restarting oozie and hive on analytics1003 for java security upgrades
  • 09:39 marostegui: Deploy alter table on eqiad hosts for s7 metawiki and wiki on the echo_notification tables - T136428
  • 09:38 jynus: upgrading and restarting db1034 T111654
  • 09:34 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1034 (duration: 00m 44s)
  • 09:32 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(17|18|19|20).codfw.wmnet
  • 09:20 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2057 (duration: 00m 41s)
  • 09:17 gehel: restarting blazegraph on wdqs1003 to ensure proper war is loaded
  • 09:10 marostegui: Deploy alter table on codfw hosts for s7 metawiki and wiki on the echo_notification tables - T136428
  • 09:08 moritzm: restarting archiva on meitnerium for java security update
  • 09:07 elukey: Executing Cassandra nodetool cleanup on aqs1006-{a,b} (one at a time) and aqs1009-a
  • 09:01 elukey: restarting java daemons on all the Hadoop nodes for security upgrades
  • 08:59 gehel: cleaning empty logs on elastic10(22|24|40) - thanks elukey !
  • 08:51 moritzm: installing Java security updates on Hadoop cluster
  • 08:45 moritzm: installing Java security updates on stat* and contint1001
  • 08:17 marostegui: Compressing commonswiki tables on db1095 - T153743
  • 07:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Added comment for db1064 being master of db1095 - T153743 (duration: 00m 40s)
  • 07:46 elukey: Renamed some logs in /var/log (adding _renamed) on aluminum, elastic102[46]/1040 to avoid cronspam and logrotate failures
  • 07:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1064 - T153743 (duration: 00m 40s)
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 - T156126 (duration: 00m 42s)
  • 03:14 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Feb 9 03:14:55 UTC 2017 (duration 5m 44s)
  • 03:09 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 15m 19s)
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 15m 05s)
  • 01:10 twentyafterfour: phabricator upgrade finished.
  • 01:07 dereckson@tin: Synchronized wmf-config/throttle.php: Update throttle rules (Gerrit:336552 for it.wikiversity + Gerrit:336741 for cleaning) (duration: 00m 40s)
  • 01:02 twentyafterfour: starting phabricator deployment #phab-2017-02-08
  • 00:57 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Switch to SiteMatrixInterwikiResolver for AB test (Gerrit:336738) (duration: 00m 41s)
  • 00:47 dereckson@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Configuration change for RelatedArticles (labs only, Gerrit:336740) (duration: 00m 40s)
  • 00:34 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Quiz on Spanish Wikibooks (duration: 00m 41s)
  • 00:24 dereckson@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Configuration changes for RelatedArticles (labs only, Gerrit:336732 and Gerrit:336733) (duration: 00m 40s)
  • 00:20 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Prune wgMinervaUseFooterV2 (T157075) (duration: 00m 41s)
  • 00:11 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Use https:// urls when communicating with PediaPress (T157398) (duration: 00m 41s)

2017-02-08

  • 23:24 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2016.codfw.wmnet
  • 23:04 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to 1.29.0-wmf.11 -- T157621 is not code-change related
  • 22:45 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 back to 1.29.0-wmf.10
  • 22:43 thcipriani: rolling back for wmf.11 from group1 due to T157621
  • 22:33 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2015.codfw.wmnet
  • 22:26 demon@tin: Synchronized php-1.29.0-wmf.11/includes/WebResponse.php: Debugging fun times (duration: 00m 50s)
  • 22:21 gehel: elastic2016 not coming up after reimage - powercycling
  • 22:01 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0efa7b8]: Update service-mobileapp-node to f45bfff (duration: 02m 55s)
  • 21:58 mholloway-shell@tin: Started deploy [mobileapps/deploy@0efa7b8]: Update service-mobileapp-node to f45bfff
  • 21:43 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(13|14).codfw.wmnet
  • 21:42 halfak@tin: Finished deploy [ores/deploy@7c80636]: (no justification provided) (duration: 01m 26s)
  • 21:41 halfak@tin: Started deploy [ores/deploy@7c80636]: (no justification provided)
  • 21:36 halfak@tin: Finished deploy [ores/deploy@7c80636]: (no justification provided) (duration: 03m 45s)
  • 21:32 halfak@tin: Started deploy [ores/deploy@7c80636]: (no justification provided)
  • 20:54 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.11
  • 20:52 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(13|14|15|16).codfw.wmnet
  • 20:47 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/MobileFrontend/includes/api/ApiMobileView.php: Pass revision id to parseSectionsData to avoid warnings T157515 (duration: 00m 42s)
  • 19:54 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable VE on fr.wiktionary Projet: namespace (T156660) (duration: 00m 44s)
  • 19:44 Dereckson: mwscript updateCollation.php --wiki=olowiki --previous-collation=uppercase (T147064, 4238 rows processed)
  • 19:39 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Set category collation for olo.wikipedia (T146612, T147064) (duration: 00m 43s)
  • 19:29 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Namespace configuration for ml. projects (T56951) (duration: 00m 41s)
  • 19:17 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Create autopatrolled and rollbacker permissions for fa.wikiquote (T156163) (duration: 00m 43s)
  • 19:08 dereckson@tin: Synchronized php-1.29.0-wmf.11/extensions/UploadWizard/UploadWizard.config.php: Disable Firefogg support (T157201) (duration: 00m 44s)
  • 19:07 mutante: bastion hosts, people.wm: deluser volkere, let puppet create volker-e, move data, delete old home dir (T157591)
  • 19:05 dereckson@tin: Synchronized php-1.29.0-wmf.10/extensions/UploadWizard/UploadWizard.config.php: Disable Firefogg support (T157201) (duration: 00m 46s)
  • 19:02 mutante: temp. disabling puppet and doing some debugging on bastion hosts, renaming a user
  • 18:33 demon@tin: Synchronized multiversion/: Dropping old MWVersion shim (duration: 00m 57s)
  • 18:05 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(09|10|11|12).codfw.wmnet
  • 18:02 jynus: upgrading and restarting db2057 T111654
  • 17:46 jynus@tin: Synchronized wmf-config/db-codfw.php: depool db2057 (duration: 00m 41s)
  • 17:45 elukey: added some annotations to the aqs analytics ACLs on cr1/cr2
  • 17:30 jynus@tin: Synchronized wmf-config/db-eqiad.php: repool db1030 (duration: 00m 40s)
  • 17:04 jynus: rolling restart of replication thread of 29 mysql hosts T111654
  • 17:02 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(09|10|11|12).codfw.wmnet
  • 16:32 ema: cp3045 stuck rebooting, power-cycled
  • 16:20 ema: cp2017 stuck rebooting, power-cycled
  • 16:19 jynus: upgrading and restarting db1030 T111654
  • 16:15 ema: pybal 1.13.4 built and uploaded to carbon
  • 16:12 jynus@tin: Synchronized wmf-config/db-eqiad.php: depool db1030 (duration: 00m 41s)
  • 16:10 chasemp: maintain-views and maintain-meta_p full runs on labsdb1009/10/11
  • 15:46 marostegui@tin: Synchronized wmf-config/db-codfw.php: db1073 change IP - T156126 (duration: 00m 40s)
  • 15:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: db1073 change IP - T156126 (duration: 00m 40s)
  • 15:41 ema: cp2011 stuck rebooting, power-cycled
  • 15:40 jynus@tin: Synchronized wmf-config/db-eqiad.php: repool db1026 (duration: 00m 41s)
  • 15:28 elukey: Eqiad cr1/cr2 - Updated analytics-in4 for new aqs nodes and removed decommed ones
  • 15:20 hoo@tin: Synchronized php-1.29.0-wmf.10/extensions/Wikidata: Wikibase uses multiple EntityPrefetchers (T157380) (duration: 02m 07s)
  • 15:15 hoo@tin: Synchronized php-1.29.0-wmf.11/extensions/Wikidata: Wikibase uses multiple EntityPrefetchers (T157380) (duration: 02m 11s)
  • 14:54 marostegui: Shutdown db1073 for maintenance - https://phabricator.wikimedia.org/T156126
  • 14:31 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2008.codfw.wmnet
  • 14:30 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2007.codfw.wmnet
  • 14:30 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2006.codfw.wmnet
  • 14:29 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2005.codfw.wmnet
  • 14:26 elukey: restarting nutcracker in all the codfw mw servers to pick up the new shards
  • 14:23 ema: cp2022 stuck rebooting, power-cycled
  • 14:23 gehel: drain shards from elastic20(09|10|11|12) in preparation for reimage - T151326
  • 14:17 jynus: upgrading and restarting db1026 T111654
  • 13:46 elukey: replacing the codfw memcached/redis shards 12->16
  • 13:41 marostegui: Start replication on db1064 - T153743
  • 13:40 marostegui: Enable replication between db1095 and db1064 - T153743
  • 13:39 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1026 for maintenance (duration: 00m 41s)
  • 13:10 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2008.codfw.wmnet
  • 13:10 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2007.codfw.wmnet
  • 13:09 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2006.codfw.wmnet
  • 13:09 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2005.codfw.wmnet
  • 12:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T156126 (duration: 00m 40s)
  • 12:38 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1037 after maintenance (duration: 00m 41s)
  • 12:17 jynus: upgrading and restarting db1037 T111654
  • 12:16 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1037 for maintenance (duration: 00m 40s)
  • 12:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 after maintenance (duration: 00m 42s)
  • 10:39 jynus: upgrading and restarting db1045 T111654
  • 10:38 moritzm: upgrading openssl, libgd, lcms, gnutls, sqlite, libxpm and glibc in codfw mediawiki cluster (so they take effect with the restart during the HHVM upgrade)
  • 10:11 moritzm: upgrading hhvm on codfw mediawiki cluster
  • 09:55 hoo: Updated the Wikidata property suggester with data from last Monday's JSON dump and applied the T132839 workarounds
  • 09:44 elukey: bootstrapping aqs1009-b (last new AQS Cassandra instance)
  • 09:31 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2004.codfw.wmnet
  • 09:31 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2003.codfw.wmnet
  • 08:56 ema: cache_upload: upgrade to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401
  • 08:42 gehel: drain shards from elastic200[5678] in preparation for reimage - T151326
  • 08:18 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2004.codfw.wmnet
  • 08:18 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2003.codfw.wmnet
  • 08:16 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2002.codfw.wmnet
  • 07:59 marostegui: Adding 100G to the lv on dbstore1001
  • 07:23 marostegui: Restart MySQL db1095 and labsdb1009 for maintenance - T153743
  • 03:08 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Feb 8 03:08:05 UTC 2017 (duration 5m 43s)
  • 03:02 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 15m 32s)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 07m 35s)
  • 01:02 mutante: mw1294 - run puppet because it popped up in Icinga as failed - removes a bunch of /var/tmp/core/../rsvg-convert.*, all else normal
  • 00:56 thcipriani@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT Setting $wgPageAssessmentsSubprojects to true on beta cluster (housekeeping sync) (duration: 00m 40s)
  • 00:55 mutante: mw1189 service hhvm restart
  • 00:55 mutante: iridium - apache graceful'ed
  • 00:51 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update footer logos on mobile site for various projects PART II T157476 (duration: 00m 41s)
  • 00:50 thcipriani@tin: Synchronized static/images/mobile/copyright: SWAT: Update footer logos on mobile site for various projects PART I T157476 (duration: 00m 41s)
  • 00:16 thcipriani@tin: Synchronized wmf-config: SWAT: Deploy TextCat Improvements T149324 T142140 (duration: 00m 45s)

2017-02-07

  • 23:42 mutante: carbon - stopping DHCP service (install* should be used)
  • 22:31 otto@tin: Finished deploy [eventstreams/deploy@e86077c]: (no justification provided) (duration: 02m 26s)
  • 22:28 otto@tin: Started deploy [eventstreams/deploy@e86077c]: (no justification provided)
  • 21:17 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.11
  • 21:02 eileen1: Update CiviCRM from e45da6d to 7b36996
  • 21:01 thcipriani@tin: Finished scap: testwiki to php-1.29.0-wmf.11 and rebuild l10n cache (duration: 51m 53s)
  • 20:10 thcipriani@tin: Started scap: testwiki to php-1.29.0-wmf.11 and rebuild l10n cache
  • 19:43 gehel: deploying analysis-stempel plugin on relforge and cluster restart
  • 19:34 gehel: drain shards from elastic200[34] in preparation for reimage - T151326
  • 19:30 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1015 after maintenance (duration: 01m 34s)
  • 18:58 jynus: preparing db2043 for reimage T152188
  • 18:55 jynus: restarting and upgrading db1015 T152188
  • 18:55 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1015 after maintenance (duration: 00m 54s)
  • 18:33 thcipriani: starting branch cut for MediaWiki and extensions 1.29.0-wmf.11
  • 18:24 arlolra: Updated Parsoid to f0732260 (T109897)
  • 18:20 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1022 after maintenance (duration: 00m 40s)
  • 18:18 arlolra@tin: Finished deploy [parsoid/deploy@c3a5c55]: Updating Parsoid to f0732260 (duration: 09m 05s)
  • 18:09 arlolra@tin: Started deploy [parsoid/deploy@c3a5c55]: Updating Parsoid to f0732260
  • 17:50 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2001.codfw.wmnet
  • 17:07 jynus: restarting and upgrading db1022 T152188
  • 17:05 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1022 for maintenance (duration: 00m 40s)
  • 16:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1021 (duration: 00m 57s)
  • 16:08 jynus: restarting and upgrading db2064 T152188
  • 14:56 jynus: preparing db2036 for reimage T152188
  • 14:42 gehel: drain shards from elastic2001 / elastic2002 in preparation for reimage - T151326
  • 14:28 hashar: European swat completed
  • 14:27 elukey: restarting hhvm on mw1304 (load very high, no queue, threads locked - /tmp/hhvm.62070.bt.)
  • 14:22 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Namespace changes for elwikisource - T157187 (duration: 00m 40s)
  • 14:19 elukey: restarting all the Yarn Node Managers on the Hadoop worker nodes to pick up the new config - T156932
  • 14:12 hoo@tin: Synchronized wmf-config/: Search index article placeholders on cywiki up to Q2794 (T144592) (duration: 00m 42s)
  • 14:10 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Introduce $wmgArticlePlaceholderSearchEngineIndexed (duration: 00m 52s)
  • 14:05 marostegui: Importing commonswiki tables on labsdb1009 - T153743
  • 13:54 jynus: restarting and upgrading db1021
  • 13:45 marostegui: Importing commonswiki tables on db1095 - T153743
  • 12:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1021 (duration: 00m 41s)
  • 12:16 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1036 (duration: 00m 40s)
  • 11:56 moritzm: restarting hhvm on appserver canaries to pick up lcms, sqlite, libxpm, gnutls and glibc updates (from jessie 8.7 release and security updates)
  • 11:53 godog: stop puppet on ms-be1012 and change rsyslog to avoid local syslog spam - T157237
  • 11:42 moritzm: installing libxpm security updates
  • 11:39 jynus: restarting and upgrading db1036
  • 11:37 elukey: restarting hhvm on mw1226 (hhvm dump debug in /tmp/hhvm.33183.bt.)
  • 11:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1036 (duration: 00m 40s)
  • 11:27 moritzm: installing libgd security updates
  • 11:13 moritzm: installing libpng security updates
  • 10:34 jynus: preparing db2046 for reimage T152188
  • 10:02 _joe_: uploaded etcd-mirror 0.0.1 to jessie-wikimedia (T156009)
  • 09:48 moritzm: installing cairo security updates
  • 09:46 elukey: stopped and masked cassandra-{a,b} - T157425
  • 08:40 marostegui: Transferring commonswiki tables from db1064 to db1095 - T153743
  • 07:31 elukey: manually added "> /dev/null" to carbon's root crontab (rsync job) to avoid cronspam. The change was already merged in https://gerrit.wikimedia.org/r/#/c/336218 but puppet is disabled on carbon.
  • 07:08 marostegui: Transferring commonswiki tables from db1064 to labsdb1009 - T153743
  • 07:06 marostegui: Importing commonswiki tables on labsdb1010 - T153743
  • 06:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 - T156226 (duration: 00m 50s)
  • 05:41 volans: ms-be1012 running out of space on /, manually compressed /var/log/swift/server.log.1 and cleaned up apt cache T157237
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Feb 7 02:37:35 UTC 2017 (duration 5m 16s)
  • 02:35 cwd: imported triggers into staging civi
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 13m 23s)
  • 01:31 mutante: prometheus1004 - installed OS, signing puppet cert, initial run.. (T152504)
  • 00:49 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Fix $wmgVisualEditorAvailableNamespaces code style (Gerrit:336346) (no-op) (duration: 00m 40s)
  • 00:31 mutante: install1001 - re-enabled puppet, start DHCP service
  • 00:26 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Show again svwiki logo between 1.5x and 2x zoom (T157387) (duration: 00m 40s)

2017-02-06

  • 23:45 mutante: prometheus1003 - installed OS, signing puppet cert, initial run (T152504)
  • 22:57 Krinkle: Purged https://en.wikipedia.org/static/apple-touch/wikipedia.png (mwscript purgeList.php) for T152538
  • 21:39 bsitzmann@tin: Finished deploy [mobileapps/deploy@9b42448]: Update mobileapps to 034a391 (duration: 03m 59s)
  • 21:37 otto@tin: Finished deploy [eventstreams/deploy@c938a57]: (no justification provided) (duration: 01m 47s)
  • 21:35 otto@tin: Started deploy [eventstreams/deploy@c938a57]: (no justification provided)
  • 21:35 bsitzmann@tin: Started deploy [mobileapps/deploy@9b42448]: Update mobileapps to 034a391
  • 20:49 mutante: cp3011 thru cp3022 - shutdown / poweroff (T130883)
  • 20:39 tgr@tin: Synchronized php-1.29.0-wmf.10/extensions/JsonConfig/includes/JCUtils.php: T155532: Update JsonConfig login API call (duration: 01m 00s)
  • 20:27 mutante: cp3011 thru cp3022 - revoke puppet certs, puppet node deactivate (T130883)
  • 19:37 jynus: restarting db2060 after kernel upgrade
  • 19:30 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: Second half of changing project logo for hi.wikibooks.org (duration: 00m 39s)
  • 19:29 ebernhardson@tin: Synchronized static/images/project-logos/hiwikibooks.png: First part of Changing project logo for hi.wikibooks.org (duration: 00m 39s)
  • 19:25 ebernhardson: re-pulled 336242 to mwdebug1002
  • 19:16 ebernhardson: pulled 336242 to mwdebug1002
  • 19:13 jynus: preparing to reimage db2045 T152188
  • 19:10 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: Configure A/B test for CrossProject search results sidebar (duration: 00m 49s)
  • 19:08 ebernhardson: pulled 334673 to mwdebug1002
  • 18:16 jynus: preparing to reimage db2050 T152188
  • 18:06 marostegui: Start to transfer commonswiki ibd and cfg from db1064 to labsdb1010 - https://phabricator.wikimedia.org/T153743
  • 17:41 mobrovac: restbase end deploy of ea980cc5
  • 17:10 mobrovac: restbase start deploy of ea980cc5
  • 17:03 hashar: Nodepool/CI back up
  • 16:51 marostegui: Stop MySQL and shutdown db1072 for raid and BBU replacement - T156226
  • 16:35 hashar: Nodepool Jessie images are back up. Trusty one is being rebuilt.
  • 15:55 elukey: mc2029 shutdown for DC ops
  • 15:46 hashar: Stopping Nodepool for maintenance
  • 15:42 oblivian@tin: Finished deploy [changeprop/deploy@5f932a3]: Revert ORES throttling (duration: 03m 49s)
  • 15:39 oblivian@tin: Started deploy [changeprop/deploy@5f932a3]: Revert ORES throttling
  • 15:38 moritzm: installing lcms security updates on mediawiki canaries
  • 15:17 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1001.eqiad.wmnet
  • 15:16 gehel: starting reimage of wdqs1001 - T144536
  • 15:13 ema: cache_text: upgrade to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401
  • 15:10 addshore@tin: Synchronized php-1.29.0-wmf.10/extensions/ORES/extension.json: T157206 ORES - Remove all (except meta) API functionality hooks (take2) (duration: 00m 54s)
  • 14:51 moritzm: upgrading mw1262-mw1265 to hhvm 3.12.11+dfsg-1+wmf2
  • 14:46 addshore: EU SWAT all done!
  • 14:45 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: PROD-NOOP Enable InterwikiSorting on beta (duration: 00m 39s)
  • 14:42 addshore@tin: Synchronized wmf-config/Wikibase.php: T155995 Rm InterwikiSorting settings from wmgWikibaseClientSettings PT 2/2 (duration: 00m 39s)
  • 14:40 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155995 Rm InterwikiSorting settings from wmgWikibaseClientSettings PT 1/2 (duration: 00m 41s)
  • 14:33 gehel: resetting analytics-wmde/scripts on stat1002 to the correct "production" branch
  • 14:32 ema: cp2018 stuck rebooting, powercycled
  • 14:30 addshore@tin: Synchronized dblists/compact-language-links.dblist: T157108 & T157112 Deploy Compact Language Links out of beta in French/Dutch Wikipedia (duration: 00m 40s)
  • 14:23 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Copy InterwikiSorting settings from wmgWikibaseClientSettings (duration: 00m 40s)
  • 14:19 marostegui: Stop MySQL and shutdown db2060 for maintenance - T156161
  • 14:15 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155717 Enable TwoColConflict on mw.org (duration: 00m 40s)
  • 14:09 addshore@tin: Synchronized php-1.29.0-wmf.10/extensions/ORES/extension.json: T157206 ORES - Remove all (except meta) API functionality hooks (duration: 00m 51s)
  • 14:05 volans: fixed duplicate entries in sources.list on db2040 and es2002 (trusty)
  • 13:44 ema: cache_misc: upgrade to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401
  • 13:30 elukey: applied https://gerrit.wikimedia.org/r/#/c/336203/ manually to analytics1028 (hadoop worker node) as live test - T156932
  • 13:23 gehel: removing stale puppet lock file on elastic10(22|26)
  • 12:03 moritzm: upgrading mwdebug* and mw1261 to hhvm 3.12.11+dfsg-1+wmf2
  • 10:21 oblivian@tin: Finished deploy [changeprop/deploy@ac11ebe]: Deploying ores concurrency/disabling (duration: 00m 38s)
  • 10:20 oblivian@tin: Started deploy [changeprop/deploy@ac11ebe]: Deploying ores concurrency/disabling
  • 10:18 oblivian@tin: Started deploy [changeprop/deploy@ac11ebe]: Deploying ores concurrency/disabling to canary
  • 10:17 gehel: data import complete for wdqs1003, repooling - T152643
  • 10:14 marostegui: Started to transfer commonswiki (ibd and cfg) from db1064 to labsdb1011 - T153743
  • 10:14 ema: cache_maps: upgrade to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401
  • 09:40 gehel: elasticsearch - reindexing from 2017-02-04T20:00:00Z to 2017-02-05T23:59:00Z - T139043
  • 09:36 hoo: Removed 2fa from an account, per T157191
  • 09:30 marostegui: Stop MySQL Replication on db1064 for maintenance - T153743
  • 09:26 marostegui: Deploy ALTER table db1028 metawiki.pagelinks - T153300
  • 09:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1028 - T153300 (duration: 00m 42s)
  • 08:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064 - T153743 (duration: 00m 41s)
  • 08:19 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1008.eqiad.wmnet
  • 07:38 elukey: bootstrapping aqs1009-a (new AQS cassandra instance)
  • 07:30 marostegui: Stop MySQL on db1095 to snapshot it to es1017 - T153743
  • 07:03 marostegui: Upgrade mariadb+packages db1039 - T153300
  • 07:01 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2060 - T156161 (duration: 00m 40s)
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Feb 6 02:25:51 UTC 2017 (duration 5m 20s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 07m 30s)

2017-02-05

  • 23:42 gehel: truncating elasticsearch logs on elastic10(24|26|40) - T139043
  • 23:41 gehel: truncating elasticsearch logs on elastic1022 - T139043
  • 03:28 Amir1: ladsgroup@scb100[1-4]:~$ sudo service celery-ores-worker restart (T157206)
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Feb 5 02:26:26 UTC 2017 (duration 5m 18s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 08m 20s)

2017-02-04

  • 19:24 elukey: Started nodetool-b cleanup on aqs1005 (after 1008-{ab} bootstraps)
  • 11:44 elukey: Started nodetool-a cleanup on aqs1008 (after 1008-{ab} bootstraps)
  • 09:09 elukey: Started nodetool-a cleanup on aqs1005 (after 1008-{ab} bootstraps)
  • 02:28 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Feb 4 02:28:33 UTC 2017 (duration 5m 23s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 07m 53s)

2017-02-03

  • 22:27 mutante: switching webproxy.*.wmnet CNAMEs from carbon to new install servers (T123733) - watching squid access logs
  • 19:02 ostriches: gerrit: flushed all web_sessions, you'll have to login again. Sorry
  • 18:07 godog: stop carbon-cache on graphite1001 to prevent useless write load
  • 17:05 godog: fail over read traffic from graphite1001 to graphite2001 https://gerrit.wikimedia.org/r/335761 - T157022
  • 16:10 godog: rsync coal data graphite1001 -> graphite2001
  • 15:03 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2061 (duration: 00m 40s)
  • 15:01 jynus: preparing to reimage db2039 T111654
  • 14:35 chasemp: restart apache on graphite1001 to see if it helps sqlite lock issue
  • 14:30 jynus: upgrade and restart db2061 T111654
  • 14:26 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2061 (duration: 00m 40s)
  • 13:58 jynus: restarting and upgrading db2041 T111654
  • 12:16 gehel: restarting relforge1001 to pick up new master configuration
  • 11:21 jynus: preparing to reimage db2054 T111654
  • 11:01 marostegui: Alter table metawiki.pagelinks on db1039 (depooled) - T153300
  • 10:54 jynus: preparing to reimage db2053 T111654
  • 10:50 moritzm: mwdebug* and mw1261 have been reverted to previous HHVM package
  • 10:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1064 - T153743 (duration: 00m 42s)
  • 10:04 marostegui: Reboot db1064 to pick up the new kernels T153743
  • 09:59 marostegui: Upgrade db1064 from MariaDB 10.0.23 to 10.0.29 - T153743
  • 09:55 gehel: restarting relforge1002 to pick up new master configuration
  • 09:54 jynus: upgrade & restart of db2063 T111654
  • 09:48 marostegui: Restart mysql on db1064 to get its binary log changed to ROW - T153743
  • 09:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064 - T153743 (duration: 00m 40s)
  • 09:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1064 - T153743 (duration: 00m 40s)
  • 09:39 moritzm: upgraded mwdebug* and mw1261 to the new HHVM package
  • 09:22 moritzm: uploaded hhvm_3.12.11+dfsg-1+wmf2 to apt.wikimedia.org
  • 09:10 elukey: Replace Redis/Memcached shards mc2008->2011 with mc2026->mc2029
  • 08:50 moritzm: installing tomcat regression updates on trusty hosts (jessie update was fine)
  • 08:15 moritzm: restarting prometheus servers to pick up openssl update
  • 08:05 elukey: bootstrapping aqs1008-b (AQS Cassandra instance)
  • 07:41 moritzm: upgrading firejail on remaining wtp/Parsoid hosts
  • 07:15 marostegui: Stop MySQL db1095 to snapshot it to es1013:/srv/tmp - T153743
  • 02:38 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Feb 3 02:38:10 UTC 2017 (duration 5m 3s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 12m 06s)
  • 00:51 dereckson@tin: Synchronized dblists/related-articles-footer-blacklisted-skins.dblist: Adjust RelatedArticles deployment scale for Mobile English Wikipedia (T154681) (duration: 00m 39s)
  • 00:48 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Set site name for ku.wiktionary (T29878) (duration: 00m 39s)
  • 00:45 dereckson@tin: Synchronized wmf-config/: Adjust RelatedArticles deployment scale for Mobile English Wikipedia (T154681) (duration: 00m 42s)
  • 00:42 dereckson@tin: Synchronized dblists/related-articles-footer-blacklisted-skins.dblist: Enable RelatedArticles on Mobile French Wikipedia (T156362) (duration: 00m 44s)
  • 00:33 dereckson@tin: Synchronized static/apple-touch/wikipedia.png: Update apple touch icon for Wikipedia (T152538) (duration: 00m 39s)
  • 00:24 dereckson@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Limit page images on beta cluster to images in the lead section (no-op in prod) (duration: 00m 41s)
  • 00:17 dereckson@tin: Synchronized wmf-config/interwiki.php: Interwiki map update (duration: 00m 40s)

2017-02-02

  • 23:51 twentyafterfour: 1.29.0-wmf.10 appears to be stable. Train deployment complete.
  • 23:38 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.10
  • 23:33 twentyafterfour@tin: Synchronized php-1.29.0-wmf.10/includes/cache/MessageCache.php: deploy I5b84b1 refs T156996 (duration: 00m 45s)
  • 23:11 greg-g: Gerrit: we'll be flushing session caches momentarily, sorry for the inconvenience
  • 21:50 gehel: reimaging relforge1002.eqiad.wmnet
  • 20:35 mutante: carbon - disabling puppet (to stop it from re-adding second IPv6 address causing issues with ferm rules)
  • 20:17 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 wikis to 1.29.0-wmf.9
  • 20:14 twentyafterfour: rolling back to wmf.9 due to T156996
  • 20:10 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.10
  • 20:04 twentyafterfour: deploying MediaWiki 1.29.0-wmf.10 to all wikis
  • 19:29 tgr: reset wikimedia 2FA for jdlrobson
  • 19:24 tgr: reset wikitech 2FA for jdlrobson
  • 19:13 hashar: Gracefully restarting Jenkins
  • 19:09 ejegg: updated fundraising tools from 931c8cf to 8a65e38
  • 19:03 ejegg: re-enabled thank you mail job
  • 18:59 ejegg: disabled thank you mail job
  • 18:04 bsitzmann@tin: Finished deploy [mobileapps/deploy@09101f7]: try previous deploy again (at least on canary) (duration: 00m 51s)
  • 18:03 bsitzmann@tin: Started deploy [mobileapps/deploy@09101f7]: try previous deploy again (at least on canary)
  • 18:01 bearND: ^ reverted previous deploy due to incorrect links in the news endpoint
  • 18:00 bsitzmann@tin: Finished deploy [mobileapps/deploy@09101f7]: (no justification provided) (duration: 01m 56s)
  • 17:58 bsitzmann@tin: Started deploy [mobileapps/deploy@09101f7]: (no justification provided)
  • 17:56 jynus: upgrade & restart of db2059 T111654
  • 17:43 jynus: upgrade & restart of db2052 T111654
  • 17:32 mobrovac: restbase deploy end of 634faea2
  • 17:08 mobrovac: restbase deploying 634faea2
  • 16:56 reedy@tin: Synchronized php-1.29.0-wmf.10/extensions/ConfirmEdit/maintenance/GenerateFancyCaptchas.php: Fix inclusion path (duration: 00m 41s)
  • 16:24 elukey: reboot mc2019->mc2025 to see if they come up cleanly (currently codfw replicas of eqiad redis shards)
  • 16:13 elukey: rebooting mc202[6789] (not serving any traffic) to see if they come up cleanly
  • 16:00 elukey: rebooting mc203[01234] (not serving any traffic) to see if they come up cleanly
  • 15:43 moritzm: upgrading firejail on remaining app servers
  • 15:35 moritzm: upgrading firejail on wtp1001
  • 15:15 moritzm: upgrading firejail on eqiad imagescalers
  • 15:11 elukey: rebooting mc203[56] (not taking any traffic) to test if they come up cleanly
  • 15:01 elukey: Replace Redis/Memcached shards mc200[4567] with mc202[2345]
  • 14:51 godog: manually fail sdc on graphite1001 - T157022
  • 14:23 addshore: EU SWAT really finished
  • 14:23 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT Create Wikiprojekti namespace on Finnish Wikipedia T156621 (duration: 00m 41s)
  • 14:14 addshore: EU SWAT finished
  • 14:14 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT Add Wikinews languages (en, pt, ca, fr, de, it) as import sources on eswikinews T156737 (duration: 00m 40s)
  • 14:10 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT Enable ElectronPdfService extension on dewiki T150942 (duration: 00m 41s)
  • 12:53 gehel: starting reimage to jessie of elasticsearch relforge - T151326
  • 11:59 joal@tin: Finished deploy [analytics/refinery@bc4b4ed]: (no justification provided) (duration: 01m 14s)
  • 11:58 joal@tin: Started deploy [analytics/refinery@bc4b4ed]: (no justification provided)
  • 11:40 elukey: Swap mc2002 with mc2020, mc2003 with mc2021 (Redis codfw replicas) - T155755
  • 11:35 joal@tin: Finished deploy [analytics/refinery@bc4b4ed]: (no justification provided) (duration: 03m 03s)
  • 11:32 joal@tin: Started deploy [analytics/refinery@bc4b4ed]: (no justification provided)
  • 10:53 elukey: Swap mc2001 with mc2019 (Redis codfw replicas) - T155755
  • 10:34 moritzm: restarted hadoop-mapreduce-historyserver on analytics1001
  • 10:23 moritzm: rolling restart of nginx tls terminators running on mw* application servers in eqiad to pick up openssl 1.1 update
  • 09:54 moritzm: rolling restart of logstash cluster to pick up openjdk/NSS security updates
  • 09:19 jynus: deploying schema change to page_assessments_projects on enwikivoyage T156305
  • 09:18 moritzm: upgrading remaining canary servers to new HHVM packages
  • 09:17 marostegui: Remove dbstore1001:/srv/tmp/db1063.tar.gz after it has been transferred to db1095:/srv/tmp/db1063.tar.gz to get more disk space
  • 08:57 jynus: deploying schema change to page_assessments_projects on enwiki T156305
  • 08:42 filippo@tin: Finished deploy [prometheus/jmx_exporter@23a8f0b]: jmx_exporter deploy (duration: 00m 04s)
  • 08:42 filippo@tin: Started deploy [prometheus/jmx_exporter@23a8f0b]: jmx_exporter deploy
  • 08:30 moritzm: installing ntfs-3g security update on labnodepool (other servers had it deinstalled)
  • 08:26 filippo@tin: Finished deploy [prometheus/jmx_exporter@23a8f0b]: (no justification provided) (duration: 00m 07s)
  • 08:26 filippo@tin: Started deploy [prometheus/jmx_exporter@23a8f0b]: (no justification provided)
  • 08:24 marostegui: Transfer /srv/tmp/db1063.tar.gz from dbstore1001 to db1095:/srv/tmp to gain disk space
  • 08:24 marostegui: Remove /srv/tmp/db1067.tar.gz from dbstore1001 to gain disk space
  • 08:23 filippo@tin: Finished deploy [prometheus/jmx_exporter@23a8f0b]: (no justification provided) (duration: 00m 12s)
  • 08:23 filippo@tin: Started deploy [prometheus/jmx_exporter@23a8f0b]: (no justification provided)
  • 08:10 legoktm@tin: Synchronized php-1.29.0-wmf.10/includes/specials/pagers/ActiveUsersPager.php: Make last remaining user_groups queries honor $wgDisableUserGroupExpiry https://gerrit.wikimedia.org/r/#/c/335587/ (T156995) (duration: 00m 51s)
  • 08:08 legoktm@tin: Synchronized php-1.29.0-wmf.10/includes/api/ApiQueryAllUsers.php: Make last remaining user_groups queries honor $wgDisableUserGroupExpiry https://gerrit.wikimedia.org/r/#/c/335587/ (T156995) (duration: 00m 58s)
  • 07:41 marostegui: Restart MySQL on db2012 to tune some innodb_ft flags - T156905
  • 03:24 eileen1: re-enable all other jenkins jobs - only some dedupe & one-off jobs disabled
  • 03:17 eileen1: re-enable dedupe jobs
  • 03:14 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Feb 2 03:14:06 UTC 2017 (duration 5m 44s)
  • 03:08 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 13m 49s)
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 14m 39s)
  • 02:32 eileen1: update civicrm from d06db92 to e45da6d
  • 02:23 eileen1: drush dis -y module_missing_message_fixer
  • 02:22 eileen1: drush mmmff --all
  • 02:22 eileen1: run drush en -y module_missing_message_fixer
  • 02:20 eileen1: Update civicrm from 7a86121 to d06db92
  • 02:02 mutante: carbon - remove unmapped IPv6 address making ferm rules fail, use only the _mapped_ IP (ip addr del 2620:0:861:1:7a2b:cbff:fe09:ea0/64 dev eth0) (T84380 T132757)
  • 01:50 eileen1: update CiviCRM from e17622b to 7a86121
  • 01:25 eileen1: jenkins disable Dedupe CiviCRM contacts & Dedupe major gifts 500_
  • 01:04 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/335578/ (duration: 00m 42s)
  • 00:30 maxsem@tin: Synchronized wmf-config/CirrusSearch-common.php: https://gerrit.wikimedia.org/r/#/c/335265/2 (duration: 00m 40s)
  • 00:21 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/335561/ (duration: 01m 00s)

2017-02-01

  • 22:54 mutante: carbon - rsyncing /srv/ data to install1002 (T132757)
  • 22:46 dereckson@tin: Synchronized wmf-config/: Folder sync to get around caching issue in previous deployments (T156942) (duration: 00m 45s)
  • 22:08 jynus: deploying schema change to page_assessments_projects on testwiki T156305
  • 21:49 bsitzmann@tin: Finished deploy [mobileapps/deploy@09101f7]: Update mobileapps to e48a88c (duration: 03m 04s)
  • 21:46 bsitzmann@tin: Started deploy [mobileapps/deploy@09101f7]: Update mobileapps to e48a88c
  • 21:39 twentyafterfour@tin: Synchronized wmf-config/InitialiseSettings.php: sync InitializeSettings to activate change from previous patches refs T156942 (duration: 00m 41s)
  • 21:21 reedy@tin: Synchronized php-1.29.0-wmf.10/includes/api/: Guard more ug_expiry queries (duration: 00m 48s)
  • 20:36 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.10
  • 20:29 twentyafterfour@tin: Synchronized wmf-config/abusefilter.php: deploy I0b4e02 refs T156942 (duration: 00m 39s)
  • 20:15 twentyafterfour@tin: Synchronized wmf-config/flaggedrevs.php: deploy I1683b1 refs T156942 (duration: 00m 40s)
  • 20:10 jynus: continuing mariadb rolling restart of db2044, db2051, db2058, db2065
  • 20:01 ejegg: updated payments-wiki from dd8a16d to 4466b9d
  • 19:59 twentyafterfour: scheduled downtime in icinga for phab2001's phd service
  • 19:59 twentyafterfour: Freshening phabricator's elasticsearch index, currently 50% complete
  • 19:27 twentyafterfour: disabled read-only in phabricator
  • 19:25 twentyafterfour: running puppet on iridium to activate the config change
  • 19:20 jynus: reloading haproxy on dbproxy1003
  • 19:11 jynus: remaining 7 minutes with phabricator up, but read-only
  • 19:10 ostriches: phabricator: now in read-only mode
  • 19:08 jynus: scheduling 10 minutes of emergency downtime on phabricator
  • 19:06 mobrovac: restbase deploy end of 96a641aa
  • 18:49 joal@tin: Finished deploy [analytics/refinery@2b9a70a]: (no justification provided) (duration: 02m 33s)
  • 18:46 joal@tin: Started deploy [analytics/refinery@2b9a70a]: (no justification provided)
  • 18:34 mobrovac: restbase deploy start of 96a641aa
  • 16:54 marostegui: Optimize table phabricator_search.search_documentfield on db2012 - T156905
  • 16:41 jynus: mariadb rolling restart of db2037, db2044, db2051, db2058, db2065
  • 16:20 elukey: restarting Yarn Node Manager daemons on all the Hadoop nodes to bandaid a memory leak causing OOMs
  • 16:18 marostegui: Optimizing table search_documentfield on db1048 - T156905
  • 15:50 akosiaris: stop ircecho for a while to weather out most of the puppet alert storm
  • 15:46 akosiaris: restart puppetdb on nihal (openjdk upgrade)
  • 15:43 akosiaris: restart puppetdb on nitrogen
  • 15:40 jynus: preparing db1067 for reimage to jessie
  • 15:37 moritzm: upgrading canary app servers to new HHVM package (initially mwdebug and mw1261)
  • 15:17 Dereckson: `mwscript populateCategory.php plwikisource --force` to refresh categories stats (T156670)
  • 15:17 dereckson@tin: Finished scap: Full scap to propagate a core namespace l10n change (duration: 40m 10s)
  • 14:41 godog: upgrade thumbor to 0.1.34
  • 14:37 dereckson@tin: Started scap: Full scap to propagate a core namespace l10n change
  • 14:25 jynus: dropping and replacing events on db1057 - db1052 T156008
  • 14:24 dereckson@tin: Synchronized php-1.29.0-wmf.9/languages/messages/MessagesJv.php: Update namespace localisation in Javanese (T155957) (duration: 00m 40s)
  • 14:21 dereckson@tin: Synchronized php-1.29.0-wmf.10/languages/messages/MessagesJv.php: Update namespace localisation in Javanese (T155957) (duration: 00m 45s)
  • 14:12 moritzm: uploaded hhvm 3.12.12 to carbon
  • 14:10 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ElectronPdfService on meta (T150943) (duration: 00m 48s)
  • 13:39 marostegui: Deploy alter table dbstore1002 metawiki.pagelinks - T153300
  • 13:38 akosiaris: issue sudo hdparm -Y /dev/sdb on bast3001 to force a problematic drive to sleep
  • 13:21 marostegui: Clean up db1043 replication thread (it was replicating from db1048 which looks like an old thing) - T156905
  • 12:09 elukey@tin: Finished deploy [analytics/refinery@e6254a4]: (no justification provided) (duration: 04m 41s)
  • 12:04 elukey@tin: Started deploy [analytics/refinery@e6254a4]: (no justification provided)
  • 11:52 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2061 - T153300 (duration: 00m 40s)
  • 11:25 moritzm: removing ntfs-3g from various trusty servers
  • 11:14 godog: bounce leaking thumbor@8813 on thumbor1001
  • 11:08 kartik@tin: Finished deploy [cxserver/deploy@0e4ae4f]: (no justification provided) (duration: 02m 04s)
  • 11:06 kartik@tin: Started deploy [cxserver/deploy@0e4ae4f]: (no justification provided)
  • 07:53 marostegui: Deploy alter table metawiki.pagelinks db2061 - T153300
  • 07:52 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2061 - T153300 (duration: 00m 53s)
  • 07:43 moritzm: rolling restart of cassandra in eqiad to pick up openjdk and NSS security updates
  • 07:41 elukey: bootstrapping aqs1008-a on aqs1008 (new AQS cassandra node)
  • 07:31 marostegui: Force WB policy on the raid controller db1072 - T156226
  • 07:13 akosiaris: restart thumbor process on thumbor1001, thumbor1002, apply a different LimitNOFILE on thumbor1002
  • 04:17 mutante: carbon - rsyncing entire /srv over to install2002 (T156440)
  • 03:00 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Feb 1 03:00:32 UTC 2017 (duration 5m 35s)
  • 02:54 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 03m 42s)
  • 02:48 mutante: install1002, install2002 - install jessie, sign puppet certs, initial puppet run (T132757, T156440)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 11m 52s)
  • 02:20 demon@tin: Synchronized scap/plugins/clean.py: no-op (duration: 00m 39s)
  • 01:19 mutante: ganeti: create instance install2002 with 80G disk, 2G RAM (T156440)
  • 01:15 mutante: ganeti: install1001 - remove virtual disk 1 from instance | create instance install1002 instead (T132757)
  • 00:57 mutante: Ganglia is now deprecated in favor of Grafana (https://phabricator.wikimedia.org/T145659#2925104)
  • 00:33 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/334025/ (duration: 00m 40s)
  • 00:32 maxsem@tin: Synchronized php-1.29.0-wmf.9/extensions/CentralNotice/: https://gerrit.wikimedia.org/r/#/c/335263/ (duration: 00m 58s)

2017-01-31

  • 23:52 ppchelko@tin: Finished deploy [changeprop/deploy@e27c3a0]: Update change-prop to fix wikidata rollback rule (duration: 01m 32s)
  • 23:51 ppchelko@tin: Started deploy [changeprop/deploy@e27c3a0]: Update change-prop to fix wikidata rollback rule
  • 22:27 twentyafterfour: cleaned up old branches: wmf.3 and wmf.4
  • 21:58 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 wikis to 1.29.0-wmf.10
  • 21:50 twentyafterfour@tin: Synchronized wmf-config/: sync ExtensionMessages-1.29.0-wmf.10.php (duration: 00m 47s)
  • 21:41 twentyafterfour@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 19m 49s)
  • 21:11 twentyafterfour: wmf-config/ExtensionMessages-1.29.0-wmf.10.php is missing refs T155525
  • 21:05 twentyafterfour@tin: Finished scap: (no justification provided) (duration: 27m 16s)
  • 20:37 twentyafterfour@tin: Started scap: (no justification provided)
  • 20:37 twentyafterfour: syncing 1.29.0-wmf.10 to test wikis
  • 20:32 jynus: stopping db1063 mariadb before full host reimage
  • 18:58 arlolra: Updated Parsoid to version 734dc996 (T98960)
  • 18:51 arlolra@tin: Finished deploy [parsoid/deploy@dc2323d]: Updating Parsoid to 734dc996 (duration: 12m 58s)
  • 18:46 ottomata: recentchange events now flowing into Kafka via EventBus T152030
  • 18:45 otto@tin: Synchronized wmf-config/CommonSettings.php: Enabling RCFeed -> EventBus (duration: 00m 42s)
  • 18:44 otto@tin: Synchronized wmf-config/CommonSettings-labs.php: Enabling RCFeed -> EventBus (duration: 00m 43s)
  • 18:38 arlolra@tin: Started deploy [parsoid/deploy@dc2323d]: Updating Parsoid to 734dc996
  • 18:06 jynus: end of tendril and dbtree maintenance, things should be back up, report if you see degradations of service
  • 17:37 jynus: stopping mysql, upgrading and restarting db1011- temporary outage of tendril & dbtree T111654
  • 16:59 robh: disabled puppet on einsteinium while i try to figure out what i broke in my config for icinga
  • 16:11 elukey: started Cassandra nodetool cleanup for aqs1007-a
  • 16:03 elukey: started Cassandra nodetool cleanup for aqs1004-b
  • 14:57 jynus: upgrading and restarting db1095 (sanitarium2)
  • 14:12 elukey: restarting hhvm on mw1204 (dump debug in /tmp/hhvm.29120.bt)
  • 14:07 aude@tin: Synchronized wmf-config/Wikibase.php: Update property suggester config (duration: 00m 42s)
  • 13:59 elukey: rebooted analytics1039 to pick up uuids in fstab - T147879
  • 12:26 addshore: TwoColConflict deploy slot done!
  • 12:26 addshore@tin: Synchronized wmf-config/CommonSettings-labs.php: Enable TwoColConflict on test wikis (T155716) 5/5 (duration: 00m 40s)
  • 12:25 addshore@tin: Synchronized wmf-config/CommonSettings.php: Enable TwoColConflict on test wikis (T155716) 4/5 (duration: 00m 40s)
  • 12:24 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TwoColConflict on test wikis (T155716) 3/5 (duration: 00m 42s)
  • 12:23 addshore@tin: Synchronized wmf-config/extension-list-labs: Enable TwoColConflict on test wikis (T155716) 2/5 (duration: 00m 40s)
  • 12:22 addshore@tin: Synchronized wmf-config/extension-list: Enable TwoColConflict on test wikis (T155716) 1/5 (duration: 00m 40s)
  • 12:16 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Add twocolconflict to wgBetaFeaturesWhitelist (T150184) (duration: 00m 41s)
  • 11:14 elukey: updating the puppet compiler's facts
  • 10:42 gehel: starting reimage of wdqs1003
  • 09:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool hosts in C2 - T155999 (duration: 00m 40s)
  • 09:38 moritzm: rolling restart of cassandra in codfw to pick up openjdk and NSS security updates
  • 09:00 gehel: aligning elasticsearch low watermark to 75% disk space on all clusters (eqiad was at 70%)
  • 08:44 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1007.eqiad.wmnet
  • 08:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add a warning about a possible bad BBU on db1072 - T156226 (duration: 00m 46s)
  • 08:26 elukey: started Cassandra nodetool cleanup for aqs1004-a
  • 07:54 moritzm: installing chromium security update on osmium
  • 07:49 marostegui: Reboot db1072 to force BBU recharge - T156226
  • 07:10 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2034 - T156478 (duration: 00m 57s)
  • 03:49 andrewbogott: restarted nova-api on labnet1001 which actually fixed some things
  • 03:28 chasemp: (slightly belated) set logging level on serpens higher to see if ldap binding is an issue
  • 02:45 bd808: Setup temporary cron on silver as user bd808 until T156733 is fixed properly
  • 02:33 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jan 31 02:33:22 UTC 2017 (duration 5m 23s)
  • 02:31 bd808: Manually ran extensions/TorBlock/loadExitNodes.php on silver
  • 02:28 bd808@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TorBlock for Wikitech (duration: 00m 41s)
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 07m 09s)
  • 02:16 chasemp: restart uwsgi-keystone-admin and uwsgi-keystone-public on labcontrol1001
  • 00:25 ebernhardson: restarting elasticsearch on elastic1029, got stuck in RemoteTransportException loop again
  • 00:23 mobrovac@tin: Started restart [electron-render/deploy@f1df2d3]: Service restart for firejail upgrade
  • 00:22 mobrovac@tin: Started restart [mobileapps/deploy@7615bf9]: Service restart for firejail upgrade
  • 00:21 mobrovac@tin: Started restart [mathoid/deploy@ba3217e]: Service restart for firejail upgrade
  • 00:20 mobrovac@tin: Started restart [graphoid/deploy@da37386]: Service restart for firejail upgrade
  • 00:17 mobrovac@tin: Started restart [cxserver/deploy@5ae4f8b]: Service restart for firejail upgrade
  • 00:13 mobrovac@tin: Started restart [citoid/deploy@95df861]: Service restart for firejail upgrade
  • 00:10 mobrovac@tin: Started restart [changeprop/deploy@2b980fa]: Service restart for firejail upgrade

2017-01-30

  • 21:49 eileen1: killed long running user-initiated dedupe query
  • 21:24 eileen1: updated civicrm from 6b6f5d6 to e17622b
  • 21:13 mobrovac@tin: Finished deploy [trending-edits/deploy@5735f00]: (no justification provided) (duration: 03m 13s)
  • 21:10 mobrovac@tin: Started deploy [trending-edits/deploy@5735f00]: (no justification provided)
  • 21:09 mobrovac@tin: Finished deploy [trending-edits/deploy@5735f00]: (no justification provided) (duration: 03m 07s)
  • 21:06 mobrovac@tin: Started deploy [trending-edits/deploy@5735f00]: (no justification provided)
  • 20:57 mobrovac@tin: Finished deploy [trending-edits/deploy@5735f00]: Bump memory limit and heartbeat timeout (duration: 01m 48s)
  • 20:55 mobrovac@tin: Started deploy [trending-edits/deploy@5735f00]: Bump memory limit and heartbeat timeout
  • 20:50 godog: uploaded scap 3.5.1-1
  • 20:50 thcipriani@tin: Synchronized README: test scap (duration: 00m 43s)
  • 20:47 eileen1: updated localsettings to a346207
  • 20:45 mobrovac@tin: Finished deploy [trending-edits/deploy@9addcd0]: Bump max_age to 18h for T156411 (duration: 02m 39s)
  • 20:43 mobrovac@tin: Started deploy [trending-edits/deploy@9addcd0]: Bump max_age to 18h for T156411
  • 20:25 eileen1: disable drupal update module on prod. T155084, this should still be on on dev sites so not using update script
  • 20:12 Pchelolo: update RESTBase to 501ea47edc in staging
  • 20:09 ejegg: updated payments-wiki config to d98b30b
  • 19:47 gehel@tin: Finished deploy [wdqs/wdqs@81442a0]: (no justification provided) (duration: 01m 23s)
  • 19:46 gehel@tin: Started deploy [wdqs/wdqs@81442a0]: (no justification provided)
  • 19:44 gehel: deploying latest wdqs gui
  • 18:56 thcipriani: unlocking mediawiki deployments for test
  • 18:53 nuria@tin: Finished deploy [eventlogging/analytics@4b28b14]: (no justification provided) (duration: 00m 11s)
  • 18:53 nuria@tin: Started deploy [eventlogging/analytics@4b28b14]: (no justification provided)
  • 18:50 nuria: rollback deployment to eventlogging
  • 18:48 thcipriani: mediawiki deployments momentarily
  • 18:46 nuria@tin: Finished deploy [eventlogging/analytics@4b28b14]: (no justification provided) (duration: 00m 04s)
  • 18:46 nuria@tin: Started deploy [eventlogging/analytics@4b28b14]: (no justification provided)
  • 18:42 Pchelolo: update RESTBase to cd2b5e019
  • 18:38 Pchelolo: update RESTBase to cd2b5e019: canary on restbase2001
  • 18:36 godog: upload scap 3.5.0-1 - T127762
  • 18:18 gehel: nginx upgrade and wdqs restart complete - sorry for the noise
  • 18:13 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1002.codfw.wmnet
  • 18:12 Niharika: updated scholarships Fixed some bugs with the login form
  • 18:11 moritzm: upgrading firejail on scb cluster
  • 18:11 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1001.codfw.wmnet
  • 18:07 Pchelolo: update RESTBase to cd2b5e019: canary on restbase1007
  • 18:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2034 IP - T156478 (duration: 00m 40s)
  • 18:04 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2034 IP - T156478 (duration: 00m 40s)
  • 18:04 gehel: rolling restart of nginx and wdqs for updates
  • 17:41 Pchelolo: update RESTBase to cd2b5e019: staging
  • 17:09 legoktm@tin: Finished scap: Build l10n cache for linter (duration: 22m 43s)
  • 17:05 marostegui: Shutdown mysql and poweroff db2034 for maintenance - T156478
  • 16:46 legoktm@tin: Started scap: Build l10n cache for linter
  • 16:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2034 for maintenance - T156478 (duration: 00m 40s)
  • 16:34 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs2003.codfw.wmnet
  • 15:42 hashar@tin: Synchronized php-1.29.0-wmf.9/extensions/timeline/Timeline.body.php: debug log EasyTimeline error - T138036 (duration: 00m 46s)
  • 15:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 with its original weight - T156226 (duration: 00m 52s)
  • 14:45 hashar@tin: Synchronized php-1.29.0-wmf.9/languages/Language.php: translateBlockExpiry: Duration is block expiry minus current time - T156453 (duration: 00m 42s)
  • 14:29 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RSS extension at metawiki, enable one feed - T155830 (duration: 00m 42s)
  • 14:15 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Create namespace alias وگ for NS_PROJECT in fawikiquote - T156451 (duration: 00m 40s)
  • 14:10 hashar@tin: Synchronized wmf-config/flaggedrevs.php: Remove flaggedrevs-protect-review page protection from enwiki - T156448 (duration: 00m 41s)
  • 14:08 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable SandboxLink on tgwiki - T156473 (duration: 00m 40s)
  • 14:06 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable SandboxLink on gdwiki - T156281 (duration: 00m 48s)
  • 11:23 gehel: upgrade and restart nginx on elasticsearch eqiad cluster
  • 10:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 with less weight - T156226 (duration: 00m 49s)
  • 10:00 gehel: upgrade and restart nginx on elasticsearch codfw cluster
  • 09:58 gehel: upgrade and restart nginx on relforge cluster
  • 09:44 godog: upgrade to thumbor 0.1.33 - T151066
  • 09:37 ariel@tin: Finished deploy [dumps/dumps@4a9e952]: proper md5sum format for adds/changes dumps (duration: 00m 02s)
  • 09:37 ariel@tin: Starting deploy [dumps/dumps@4a9e952]: proper md5sum format for adds/changes dumps
  • 09:25 elukey: bootstrapping new cassandra instance (aqs1007-b) on AQS - https://gerrit.wikimedia.org/r/#/c/334753/
  • 09:19 moritzm: installing tcpdump security updates
  • 09:06 marostegui: Upgrade db2012 to 10.0.29-2 (this was done a couple of hours ago, but for the record) - T156373
  • 09:05 marostegui: Start slaves from s1 to s7 on dbstore2001 - T156373
  • 08:54 moritzm: installing NSS security updates on kafka and Hadoop clusters
  • 08:45 elukey: restarting aqs on aqs100[4567] to pick up NSS updates
  • 08:19 elukey: set mw1236.eqiad.wmnet pooled=inactive because powered off (no mentions on the SAL, still trying to find why)
  • 08:05 moritzm: switched application servers in codfw to systemd-timesyncd
  • 08:04 marostegui: Stop mysql db1073 to use it to clone db1072 - T156226
  • 08:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T156226 (duration: 02m 45s)
  • 02:21 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jan 30 02:21:21 UTC 2017 (duration 4m 22s)
  • 02:16 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 06m 11s)

2017-01-29

  • 02:21 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Jan 29 02:21:37 UTC 2017 (duration 4m 23s)
  • 02:17 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 06m 06s)

2017-01-28

  • 02:22 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Jan 28 02:22:49 UTC 2017 (duration 4m 47s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 05m 39s)

2017-01-27

  • 22:45 mutante: install1001 - adding a second virtual hard disk, 80G
  • 22:31 mutante: carbon: rsync entire /srv/ to install2001 (this is APT data but also misc things like junos, megacli, firmware, ipmi)
  • 21:41 volans: restored watchmouse checks for s5 (de wiki), Main_Page redirect was restored
  • 21:36 mobrovac@tin: Finished deploy [trending-edits/deploy@0e79bec]: Bump max_age to 12h T156411 (duration: 01m 58s)
  • 21:34 mobrovac@tin: Starting deploy [trending-edits/deploy@0e79bec]: Bump max_age to 12h T156411
  • 21:28 mobrovac@tin: Finished deploy [trending-edits/deploy@e0e32bb]: Restart the service to assess the load of replaying the last 6h T156411 (duration: 01m 03s)
  • 21:27 mobrovac@tin: Starting deploy [trending-edits/deploy@e0e32bb]: Restart the service to assess the load of replaying the last 6h T156411
  • 20:24 jynus: restart and upgrade mariadb on db1048
  • 20:14 mutante: db1019, db1042, analytics1015, analytics1026 - puppet node deactivate, remove from icinga, finish decom (T147313, T149793, T146265)
  • 20:07 volans: updated watchmouse checks for s5 (de wiki) because Main_Page was deleted, used the localized page instead
  • 19:52 mutante: db1019 - shutdown -h now (T146265)
  • 19:51 mutante: db1042 - i came to shut it down .. and noticed it had died (or somebody did it) about 3 hours ago .. there it goes (T149793)
  • 18:52 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 wikis to 1.29.0-wmf.9
  • 18:52 twentyafterfour: Rolling forward with group2 to 1.29.0-wmf.9 refs T156364 T154683
  • 17:55 papaul: OS installation on mc2019-mc2036
  • 16:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: db1072 change IP - T156226 (duration: 00m 40s)
  • 16:07 marostegui@tin: Synchronized wmf-config/db-codfw.php: db1072 change IP - T156226 (duration: 00m 40s)
  • 16:01 jynus: submitted wmf-mariadb10_10.0.29-2 for T156373 fix
  • 15:48 marostegui: Stop mysql and shutdown db1072 for maintenance - T156226
  • 15:45 ema: cache_text: ban req.url == "/apple-app-site-association" && obj.status == 404 (T155504)
  • 14:13 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add rack positions for s7 in codfw (duration: 00m 40s)
  • 14:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions for s7 in eqiad (duration: 00m 40s)
  • 13:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions for s6 in eqiad (duration: 00m 40s)
  • 13:44 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add rack positions for s6 in codfw (duration: 00m 40s)
  • 13:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions for s5 in eqiad (duration: 00m 40s)
  • 13:30 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add rack positions for s5 in codfw (duration: 00m 40s)
  • 13:14 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add rack positions for s4 in codfw (duration: 00m 41s)
  • 13:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions for s4 in eqiad (duration: 00m 43s)
  • 13:11 jynus: starting db1048 until db1043-bin.001457:753455353, expect it to stop soon
  • 13:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions for s3 in eqiad (duration: 00m 40s)
  • 13:00 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add rack positions for s3 in codfw (duration: 00m 40s)
  • 12:24 moritzm: upgrading mediawiki canaries to new openssl 1.1 package
  • 12:13 moritzm: upgrading openjdk-7 packages (security updates) on wdqs cluster
  • 11:56 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add rack positions for s2 in codfw (duration: 00m 47s)
  • 11:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions for s2 in eqiad (duration: 00m 59s)
  • 11:11 moritzm: initial installation of openssl bugfix/security updates
  • 10:54 moritzm: uploaded openssl 1.0.2k for jessie-wikimedia to carbon
  • 10:35 paravoid: manually running certspotter -all_time as my user on einsteinium (will take a few days to complete)
  • 10:21 legoktm: added addshore to labs-tools-wikibugs2 gerrit group
  • 08:13 moritzm: uploaded openssl 1.1.0d packages for jessie-wikimedia to carbon
  • 02:58 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Jan 27 02:58:52 UTC 2017 (duration 5m 46s)
  • 02:53 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 14m 14s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.8) (duration: 07m 57s)
  • 01:20 twentyafterfour: deploying hotfix for phabricator refs T154479
  • 00:48 mutante: carbon - moved the 1.5TB /srv/"mirrors.off", which used to be mirrors but is now on sodium, into / so that /srv/ can be synced without this
  • 00:32 dereckson@tin: Synchronized wmf-config/throttle.php: Fix throttle rule for Her Girl Friday + Lenny Unconference (T156278) (duration: 00m 53s)
  • 00:12 volans: re-enabled puppet (with a temporary fix to keep parsoid-vd and parsoid-vd-client stopped) on ruthenium T156177

2017-01-26

  • 23:49 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 wikis to 1.29.0-wmf.8
  • 23:37 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.9
  • 23:33 robh: archiva.w.o maint done, uses new LE cert.
  • 23:14 robh: going to try to convert archiva.wikimedia.org from GS to LE cert. will require rehup of nginx
  • 22:58 mutante: db1019, db1042 - revoke puppet certs, delete salt keys, schedule icinga downtime, stop services (T149793, T146265)
  • 22:56 mutante: analytics1015, analytics1026 - puppet node clean (again?) - again having problems to remove decom'ed nodes from Icinga (T147313)
  • 22:53 mutante: analytics1015, analytics1026 - puppet node clean (again?) - again having problems to remove decom'ed nodes from Icinga
  • 22:46 mutante: mw1181, mw1272, mw1212, mw1174 - service hhvm restart
  • 22:06 mobrovac@tin: Finished deploy [trending-edits/deploy@e0e32bb]: Bump replay time to 6h for T156411 (duration: 01m 42s)
  • 22:04 mobrovac@tin: Starting deploy [trending-edits/deploy@e0e32bb]: Bump replay time to 6h for T156411
  • 20:30 andrewbogott: refreshing logins on wikitech
  • 19:13 elukey: restore analytics1001 as RM and HDFS masters
  • 18:56 otto@tin: Finished deploy [eventstreams/deploy@f1a1866]: (no message) (duration: 03m 16s)
  • 18:52 otto@tin: Starting deploy [eventstreams/deploy@f1a1866]: (no message)
  • 18:36 elukey: restarting Yarn node managers on an102[89] and an103[01], impacted by the switch restart
  • 18:32 paravoid: starting pybal on lvs1001/lvs1002/lvs1003
  • 18:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055, 56, 57, 59 (duration: 00m 54s)
  • 18:14 paravoid: rebooting newly provisioned asw-c2-eqiad to enable mixed mode
  • 17:57 elukey: bootstrapping aqs1007-a cassandra instance
  • 17:51 paravoid: replacing asw-c2-eqiad
  • 17:46 paravoid: stopping pybal on lvs1001/lvs1002/lvs1003
  • 17:34 elukey@tin: Finished deploy [analytics/aqs/deploy@5917fd4]: (no message) (duration: 02m 25s)
  • 17:31 elukey@tin: Starting deploy [analytics/aqs/deploy@5917fd4]: (no message)
  • 15:43 bblack: cache_misc puppet re-enabled and up to date
  • 15:37 godog: bounce uwsgi on graphite1003 with fewer workers - T155872
  • 15:35 moritzm: installing gnupg2 updates from jessie point update
  • 15:15 akosiaris: T156242 add /dev/sdb partitions to mdadm devices
  • 15:15 bblack: puppet disabled on cache_misc for merging complicated stuff
  • 15:06 moritzm: installing gnupg updates from jessie point update
  • 14:52 jynus: stopping mysql on db1048 T156373
  • 14:29 zeljkof: finished with eu swat
  • 14:28 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: IP Cap Lift for Edit-a-Thon (T156258) [throttle] Her Girl Friday + Lenny Unconference / Editathon in NYC, 2017-01-28 (T156278) (duration: 00m 41s)
  • 13:54 godog: delete labs 'instances' graphite tree for data >30d, graphite low on disk space
  • 13:53 elukey: restarting cassandra on aqs100[56] to complete the openjdk update
  • 13:32 moritzm: rolling restart of maps cluster in eqiad
  • 13:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1054 - T156225 (duration: 00m 40s)
  • 13:05 hashar@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 back to 1.29.0-wmf.9 T156310
  • 13:02 godog: reboot ms-be1013
  • 12:54 elukey: restarting the aqs1004-b cassandra instance to pick up the new openjdk (last test before complete rollout)
  • 12:28 elukey: restarting the aqs1004-a cassandra instance to pick up the new openjdk
  • 12:28 moritzm: upgrading java on maps cluster, rolling restart of maps cluster in codfw
  • 12:17 hashar@tin: Synchronized php-1.29.0-wmf.9/extensions/FlaggedRevs/backend/FlaggedRevision.php: Fix fatal in prod caused by deprecated function removal T156310 (duration: 00m 41s)
  • 12:04 moritzm: installing java security updates on aqs cluster
  • 11:12 hashar@tin: rebuilt wikiversions.php and synchronized wikiversions files: FlaggedRevs is broken in wmf.9 causing blank pages. T156356 T156310
  • 09:55 marostegui: Disable semi-sync on db1057 old s1 master - https://phabricator.wikimedia.org/T156008
  • 09:39 marostegui: Enable semi-sync replication on db1052 (s1 master) - T156008
  • 09:04 marostegui: Change dbstore1002 to replicate from the new s1 master db1052 - T156008
  • 08:57 marostegui: Change db1047 to replicate from the new s1 master db1052 - T156008
  • 08:48 marostegui: Change db1069 to replicate from the new s1 master db1052 - T156008
  • 08:48 jynus: deploying dns CNAME updates due to master switchover
  • 07:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change s1 master to db1057 - T156008 (duration: 00m 20s)
  • 07:18 jynus: last message was master of db1057
  • 07:18 jynus: change master of db1052 from db2016 to db1052
  • 06:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1052 - T156008 (duration: 00m 31s)
  • 03:50 mutante: rsyncing apt.wikimedia.org data from carbon to install2001 (T84380)
  • 02:07 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Jan 26 02:07:05 UTC 2017 (duration 4m 42s)
  • 02:02 l10nupdate@tin: LocalisationUpdate failed (1.29.0-wmf.9) at 2017-01-26 02:02:23+00:00
  • 02:02 l10nupdate@tin: LocalisationUpdate failed (1.29.0-wmf.8) at 2017-01-26 02:02:23+00:00
  • 02:02 twentyafterfour: phabricator update complete
  • 01:47 twentyafterfour: upgrading phabricator, downtime should be minimal but expect the service to be offline for up to a few minutes
  • 01:11 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings.php: No-op documentation change to InitialiseSettings.php (duration: 00m 46s)
  • 00:31 krenair@tin: Synchronized wmf-config: https://gerrit.wikimedia.org/r/298397 part 2 (duration: 00m 42s)
  • 00:29 krenair@tin: Synchronized wmf-config: https://gerrit.wikimedia.org/r/298397 (duration: 00m 43s)
  • 00:26 Krenair: sync 298397 to mwdebug1001
  • 00:11 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable deprecation logging with gerrit:334206 (duration: 00m 53s)
  • 00:09 ebernhardson: sync 334206 to mwdebug1002

2017-01-25

  • 23:39 mutante: mwdebug1002 - service hhvm restart
  • 23:38 mutante: mw1185, mw1268 - service hhvm restart
  • 23:37 ejegg: updated civicrm from cd058a0 to 6b6f5d6
  • 23:21 ejegg: restarted civicrm donation queue consumer and dedupe jobs
  • 22:47 demon@tin: Synchronized w/extract2.php: (no message) (duration: 00m 40s)
  • 22:35 ejegg: paused civicrm dedupe and donation import jobs
  • 22:33 ejegg: updated civicrm from af8d735 to cd058a0
  • 21:39 akosiaris: reload pfw1-codfw node 0 in an effort to debug high RTTs
  • 20:50 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.9
  • 20:44 twentyafterfour: deploying mediawiki 1.29.0-wmf.9 to group1 wikis
  • 20:36 volans: upgrading nodejs-legacy (it is just the symlink) to v6 on parsoid hosts T149331
  • 20:11 eileen: re-enabled dedupe job, disabled major gifts (one should be enough). Will investigate next error
  • 19:26 mutante: analytics1015,analytics1026 - decom: remove DNS names, delete salt keys, revoke puppet certs, puppet node clean (to remove from icinga) (T147313)
  • 18:22 bd808@tin: Finished deploy [striker/deploy@5aa3aa8]: Update Striker to 5aa3aa8 (T144710, T147024, T144712, T144711, T153935) (duration: 00m 24s)
  • 18:22 bd808@tin: Starting deploy [striker/deploy@5aa3aa8]: Update Striker to 5aa3aa8 (T144710, T147024, T144712, T144711, T153935)
  • 18:09 demon@tin: Synchronized docroot/foundation/logos: rm a junk logo (duration: 00m 50s)
  • 18:02 elukey: running authdns-update on ns0.w.o to pick up changes made in https://gerrit.wikimedia.org/r/334040
  • 17:38 jynus: restarting and upgrading db2060
  • 16:57 ostriches: gerrit: everything back up!
  • 16:56 ostriches: gerrit: quick service reboot to pick up new java version
  • 16:31 Jeff_Green: renamed fdb2001 to frdb2001
  • 16:25 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db1054 IP - T156225 (duration: 00m 40s)
  • 16:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db1054 IP - T156225 (duration: 00m 41s)
  • 16:07 marostegui: Stop mysql and power off db1054 for maintenance - T156225
  • 15:28 gehel: removing jieba / ltr / swift plugins from elasticsearch relforge - T156150
  • 15:27 gehel: deleting indices using jieba plugin from relforge - T156150
  • 15:19 chasemp: (slightly late) of 'maintain-views --all-databases --table watchlist_count --replace-all' across labsdbs
  • 15:11 godog: graphite1003 / graphite2002 at 94% utilization, increase lv size by 300G
  • 14:42 moritzm: installing ruby2.1 updates from jessie point release
  • 14:23 moritzm: installing wget updates from jessie point release
  • 14:13 dcausse: EU SWAT done
  • 14:11 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T156234 Revert [cirrus] properly set wgCirrusSearchUseIcuFolding (duration: 00m 41s)
  • 14:09 moritzm: removed totally outdated openjdk-8 packages from trusty-wikimedia (from 2014) on carbon
  • 13:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1054 - T156225 (duration: 00m 50s)
  • 13:52 moritzm: upgrading openjdk-8 on maps-test*
  • 13:11 gehel: pooling new elasticsearch nodes on codfw - T154251
  • 12:38 moritzm: installing libxml security updates
  • 12:04 Dereckson: Refresh site statistics on simple. (T156247)
  • 11:01 moritzm: upgrading restbase staging cluster to new openjdk (also piggyback reboot to latest 4.4 kernel)
  • 10:49 moritzm: uploaded openjdk-8 u121 to apt.wikimedia.org
  • 10:28 moritzm: uploaded ca-certificates-java 20161107~bpo8+1 to apt.wikimedia.org
  • 10:11 ema: repooled codfw
  • 09:25 elukey: updating puppet-compiler facts
  • 08:53 ema: upgrade cp3040 to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401
  • 08:15 ema: upgrade cp3034 to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401
  • 08:15 ema: upgrade cp3034 to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8
  • 08:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2054 - T153300 (duration: 00m 51s)
  • 07:48 _joe_: restarting pybal on lvs1003
  • 07:28 elukey: upgrading aqs100[56] to node6
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T156005 (duration: 00m 42s)
  • 05:37 mobrovac: zotero restarting zotero, taking 95% of mem ...
  • 02:55 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jan 25 02:55:57 UTC 2017 (duration 5m 37s)
  • 02:50 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 12m 53s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.8) (duration: 06m 25s)
  • 00:59 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1052 after maintenance (duration: 00m 40s)
  • 00:37 mutante: planet2001 - re-add new salt key, fix minion
  • 00:11 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Amend import sources for en.wikisource (T155922) (duration: 00m 47s)
  • 00:09 mutante: analytics1015,analytics1026 - revoked puppet cert, removing from puppet, shutting down (T147313)

2017-01-24

  • 23:50 mutante: carbon - stopping puppet, stopping atftpd
  • 23:49 mutante: carbon stopping DHCP
  • 23:49 mutante: analytics1015 (unused spare system) - use for test OS install
  • 23:26 jynus: restarting db1052 for kernel upgrade
  • 23:09 ebernhardson@tin: Synchronized php-1.29.0-wmf.9/includes/specials/SpecialSearch.php: Update special:search security patch to not fatal (duration: 00m 44s)
  • 23:04 jynus: reimage db1066
  • 22:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 for reimage (duration: 00m 55s)
  • 22:22 Pchelolo: update RESTBase to 69065e2
  • 22:19 Pchelolo: update RESTBase to 69065e2: canary on restbase1007
  • 22:13 Pchelolo: update RESTBase to 69065e2: staging
  • 21:55 jynus@tin: Synchronized wmf-config/db-eqiad.php: repool db1065 as dump/vslow & clean up s1 comments (duration: 00m 43s)
  • 21:43 twentyafterfour: Finished group0 to wmf/1.29.0-wmf.9 (refs T155525) Changelog: https://www.mediawiki.org/wiki/MediaWiki_1.29/wmf.9/Changelog
  • 21:34 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 wikis to 1.29.0-wmf.9 refs T155525
  • 21:31 demon@tin: Synchronized docroot: Drop labs docroot, unused in prod (duration: 00m 44s)
  • 21:22 twentyafterfour@tin: Finished scap: test wikis to 1.29.0-wmf.9 refs T155525 (duration: 32m 37s)
  • 20:49 twentyafterfour@tin: Started scap: test wikis to 1.29.0-wmf.9 refs T155525
  • 20:49 volans: disabled puppet on ruthenium to avoid the restart of parsoid-vd and parsoid-vd-client processes T156177
  • 20:09 demon@tin: Synchronized docroot: Adding new wikimediafoundation.org docroot (duration: 01m 05s)
  • 19:51 volans: ruthenium: stopped parsoid-vd and parsoid-vd-client to avoid uncontrolled spawning of phantomjs children
  • 19:39 volans: sudo service parsoid-vd stop on ruthenium
  • 19:37 twentyafterfour: branching 1.29.0-wmf.9 refs T154683
  • 19:35 volans: killed 822 "/srv/visualdiff/node_modules/phantomjs/lib/phantom/bin/phantomjs" processes on ruthenium. RAM and swap full, host unresponsive
  • 19:29 jynus: change replication master of db1095 to db1065
  • 19:07 jynus: change replication master of db1095 to db1052
  • 19:07 demon@tin: Synchronized docroot/foundation/logos: rm some old junk logos (duration: 00m 42s)
  • 18:58 arlolra: Updated Parsoid to version d000fdb4 (T58846, T154804, T152633)
  • 18:45 arlolra@tin: Finished deploy [parsoid/deploy@c1a14c0]: Retry updating Parsoid to d000fdb4 (duration: 04m 14s)
  • 18:41 arlolra@tin: Starting deploy [parsoid/deploy@c1a14c0]: Retry updating Parsoid to d000fdb4
  • 18:40 arlolra@tin: Finished deploy [parsoid/deploy@c1a14c0]: Updating Parsoid to d000fdb4 (duration: 21m 28s)
  • 18:37 demon@tin: Synchronized docroot: tidying up mobileportal docroot stuff (duration: 00m 41s)
  • 18:24 demon@tin: Synchronized docroot: Removing old wikidata docroot (duration: 00m 46s)
  • 18:19 arlolra@tin: Starting deploy [parsoid/deploy@c1a14c0]: Updating Parsoid to d000fdb4
  • 18:10 marostegui: restart mysql db1065 maintenance - https://phabricator.wikimedia.org/T155999
  • 18:07 mutante: planet2001 - re-adding to puppet, revoke old cert, sign new cert, initial run
  • 18:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T156006 (duration: 00m 49s)
  • 16:54 andrewbogott: tools deleting tools-mail-01
  • 16:52 mutante: planet2001 - reinstalling to test DHCP/TFTP from install2001
  • 16:37 elukey: upgrading aqs1004 to node6
  • 16:26 marostegui: Restart mysql db1072
  • 16:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 - T156006 (duration: 00m 41s)
  • 16:18 paravoid: removing lvs4002_T151273 policy from cr1/2-ulsfo
  • 16:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 - T156006 (duration: 00m 47s)
  • 16:12 godog: kill stray swift-proxy processes from ms-fe1* T156143
  • 16:07 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1001.eqiad.wmnet
  • 16:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 - T155999 (duration: 00m 48s)
  • 15:58 moritzm: upgraded nodejs on thorium to 6.9 / restarted pivot
  • 15:57 papaul: shutting down ms-be2002 for maintenance
  • 15:54 chasemp: drbdadm adjust tools for 1004/1005 w/ 192.168.0.0/30
  • 15:49 moritzm: installing tomcat7 security updates on trusty hosts (jessie already fixed a while ago)
  • 15:14 godog: bounce pybal on lvs1003 - T134893
  • 15:10 chasemp: drbdadm adjust misc for 1004/1005 w/ 192.168.0.0/30
  • 15:09 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1001.eqiad.wmnet
  • 15:08 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1001.eqiad.wmnet
  • 15:07 chasemp: drbdadm adjust test for 1004/1005 w/ 192.168.0.0/30
  • 15:04 chasemp: recabling labstore1004/1005 eth1
  • 14:55 marostegui: Stop replication on db1052 and db1073 for maintenance - T156006
  • 14:54 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1001.eqiad.wmnet
  • 14:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T155999 (duration: 00m 39s)
  • 14:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 with less weight - T156004 (duration: 00m 41s)
  • 14:26 dcausse: EU SWAT Done
  • 14:23 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T155515 [cirrus] properly set wgCirrusSearchUseIcuFolding (duration: 00m 39s)
  • 14:13 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T155142 [cirrus] Increase weights for content namespaces on mw.org (duration: 00m 39s)
  • 13:49 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db1052 IP - T156006 (duration: 00m 39s)
  • 13:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db1052 IP - T156006 (duration: 00m 39s)
  • 13:41 marostegui: Shutdown db1052 for maintenance - T156006
  • 13:37 marostegui: Shutdown mysql on db1052 for maintenance - T156006
  • 13:16 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db1051 IP - T156004 (duration: 00m 39s)
  • 13:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: wmf-config/db-codfw.php Change db1051 IP - T156004 (duration: 00m 39s)
  • 13:00 marostegui: Shutdown db1051 for maintenance - T156004
  • 12:56 marostegui: Shutdown mysql on db1051 for maintenance - T156004
  • 12:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T156004 (duration: 00m 39s)
  • 12:51 moritzm: installing pcsc-lite security updates on trusty hosts (jessie already fixed a while ago)
  • 12:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T156004 (duration: 00m 39s)
  • 12:23 akosiaris: switch all networks to use install1001, install2001 as DHCP relay endpoint. T156109
  • 12:17 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Revert last (duration: 00m 39s)
  • 12:14 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155995 Copy InterwikiSorting settings from wmgWikibaseClientSettings noop (duration: 00m 39s)
  • 12:07 addshore@tin: Synchronized wmf-config/CommonSettings.php: T155995 Prepare to enable InterwikiSorting on beta cluster & Populate InterwikiSortingInterwikiSortOrders with WB Client 4/4 noop (duration: 00m 39s)
  • 12:06 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T155995 Prepare to enable InterwikiSorting on beta cluster 3/4 noop (duration: 00m 39s)
  • 12:05 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155995 Prepare to enable InterwikiSorting on beta cluster 2/4 noop (duration: 00m 39s)
  • 12:05 addshore@tin: Synchronized wmf-config/extension-list-labs: T155995 Prepare to enable InterwikiSorting on beta cluster 1/4 noop (duration: 00m 39s)
  • 09:53 akosiaris: mark /dev/sdb as faulty on md devices on bast3001 T154603
  • 09:36 akosiaris: add /dev/sdb partitions to md RAID device on mw2251
  • 09:33 addshore@tin: Synchronized wmf-config/CommonSettings.php: T155995 Prepare to enable InterwikiSorting on beta cluster 4/4 noop (duration: 00m 38s)
  • 09:33 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T155995 Prepare to enable InterwikiSorting on beta cluster 3/4 noop (duration: 00m 40s)
  • 09:32 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155995 Prepare to enable InterwikiSorting on beta cluster 2/4 noop (duration: 00m 41s)
  • 09:30 addshore@tin: Synchronized wmf-config/extension-list-labs: T155995 Prepare to enable InterwikiSorting on beta cluster 1/4 noop (duration: 00m 53s)
  • 09:21 marostegui: Alter table db2054 metawiki.pagelinks - T153300
  • 09:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2054 - T153300 (duration: 00m 39s)
  • 08:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1065 original weight - T156005 (duration: 00m 39s)
  • 08:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions - T155999 (duration: 00m 41s)
  • 08:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: wmf-config/db-eqiad.php Add rack positions - T155999 (duration: 00m 50s)
  • 06:20 _joe_: repooling mw2098 after scap pull
  • 02:23 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jan 24 02:23:01 UTC 2017 (duration 4m 23s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.8) (duration: 06m 40s)
  • 01:37 ejegg: updated SmashPig from 03880ce to ab52dbe
  • 01:18 Krinkle: mwscript deleteEqualMessages.php --wiki gotwiki (T45917)
  • 01:16 ejegg: updated payments-wiki from c22353b to dd8a16d
  • 00:26 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 with low load after reimage (duration: 00m 45s)

2017-01-23

  • 22:38 Pchelolo: update RESTBase to 598fa56f
  • 22:37 Pchelolo: update RESTBase to 598fa56f: canary on restbase1007
  • 22:22 Pchelolo: update RESTBase to d1663345c
  • 22:21 Pchelolo: update RESTBase to d1663345c: canary on restbase1007
  • 22:18 Pchelolo: update RESTBase to d1663345c: staging
  • 21:54 demon@tin: Synchronized wmf-config: interwiki update, dropping some old ExtensionMessages files (duration: 00m 41s)
  • 21:43 mutante: sca2004 was out of memory but also fixed itself and i could run puppet again a few minutes later
  • 21:41 demon@tin: Synchronized w: Removing wiki.phtml, apache does the rewrites (duration: 00m 48s)
  • 21:39 bsitzmann@tin: Finished deploy [mobileapps/deploy@7615bf9]: Update mobileapps to 66ef3c2 (duration: 03m 16s)
  • 21:36 bsitzmann@tin: Starting deploy [mobileapps/deploy@7615bf9]: Update mobileapps to 66ef3c2
  • 20:29 nuria@tin: Finished deploy [analytics/aqs/deploy@025ef23]: (no message) (duration: 00m 08s)
  • 20:29 nuria@tin: Starting deploy [analytics/aqs/deploy@025ef23]: (no message)
  • 20:28 nuria@tin: Finished deploy [analytics/aqs/deploy@025ef23]: (no message) (duration: 00m 07s)
  • 20:28 nuria@tin: Starting deploy [analytics/aqs/deploy@025ef23]: (no message)
  • 20:18 otto@tin: Finished deploy [analytics/aqs/deploy@025ef23]: (no message) (duration: 00m 01s)
  • 20:18 otto@tin: Starting deploy [analytics/aqs/deploy@025ef23]: (no message)
  • 20:11 gehel@tin: Finished deploy [wdqs/wdqs@fd88fda]: (no message) (duration: 01m 56s)
  • 20:09 otto@tin: Finished deploy [analytics/aqs/deploy@025ef23]: (no message) (duration: 00m 01s)
  • 20:09 otto@tin: Starting deploy [analytics/aqs/deploy@025ef23]: (no message)
  • 20:09 gehel@tin: Starting deploy [wdqs/wdqs@fd88fda]: (no message)
  • 20:06 gehel: deploying latest wdqs version (2h behind planned schedule)
  • 20:04 otto@tin: Finished deploy [analytics/aqs/deploy@025ef23]: (no message) (duration: 02m 16s)
  • 20:02 otto@tin: Starting deploy [analytics/aqs/deploy@025ef23]: (no message)
  • 19:26 reedy@tin: Synchronized wmf-config/CommonSettings.php: Make sure CommonSettings-labs is one of the last things loaded so we don't get problems from things being included after (duration: 00m 40s)
  • 18:57 nuria@tin: Finished deploy [analytics/aqs/deploy@025ef23]: (no message) (duration: 00m 35s)
  • 18:56 nuria@tin: Starting deploy [analytics/aqs/deploy@025ef23]: (no message)
  • 18:47 nuria@tin: Finished deploy [analytics/aqs/deploy@56ab863]: (no message) (duration: 00m 01s)
  • 18:47 nuria@tin: Starting deploy [analytics/aqs/deploy@56ab863]: (no message)
  • 18:47 nuria@tin: Finished deploy [analytics/aqs/deploy@56ab863]: (no message) (duration: 00m 01s)
  • 18:47 nuria@tin: Starting deploy [analytics/aqs/deploy@56ab863]: (no message)
  • 18:45 nuria@tin: Finished deploy [analytics/aqs/deploy@56ab863]: (no message) (duration: 01m 08s)
  • 18:44 nuria@tin: Starting deploy [analytics/aqs/deploy@56ab863]: (no message)
  • 18:43 nuria@tin: Finished deploy [analytics/aqs/deploy@56ab863]: (no message) (duration: 00m 01s)
  • 18:43 nuria@tin: Starting deploy [analytics/aqs/deploy@56ab863]: (no message)
  • 18:33 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: no-op, completeness (duration: 00m 40s)
  • 18:32 demon@tin: Synchronized wmf-config/extension-list-labs: no-op, completeness (duration: 00m 40s)
  • 18:30 jynus: reimaging db1065 to jessie
  • 18:18 papaul: shutting down ms-be2010 for maintenance
  • 18:12 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 (duration: 00m 39s)
  • 16:31 papaul: shutting down mw2098 for maintenance
  • 15:38 marostegui: Alter tables: flow_topic_list and flow_tree_node on db1031 (x1 master) - T149819
  • 15:19 elukey: whitelisted dbproxy1011 on cr1/cr2 for analytics-in4 input filter
  • 15:18 moritzm: installing mysql 5.5 security updates (as packaged by jessie/trusty, not the internal mariadb packages)
  • 15:17 moritzm: installing pdns-recursor security update on labservices1002
  • 15:17 Dereckson: Fixed namespaces dupes following NS_PROJECT update on sa.wikisource (T101634)
  • 15:15 gehel: reimage elastic2025 - T154251
  • 15:02 Dereckson: EU SWAT done (handled by addshore)
  • 15:00 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Set site name and meta namespace for Sanskrit wikis (T101634) (duration: 00m 40s)
  • 14:34 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155844 [fix] Add finds.org.uk without wildcard too (duration: 00m 39s)
  • 14:24 addshore@tin: Synchronized wmf-config/Wikibase.php: T150183 Move InterwikiSortOrders to own file PT 2/2 (duration: 00m 39s)
  • 14:23 addshore@tin: Synchronized wmf-config/InterwikiSortOrders.php: T150183 Move InterwikiSortOrders to own file PT 1/2 (duration: 00m 40s)
  • 14:23 gehel: disabling puppet on elastic20(2[5-9]|3[0-6]) prior to reimage - T154251
  • 14:20 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155916 Amend category collation for de.wikisource to uca-de-u-kn (duration: 00m 39s)
  • 14:20 godog: depool ms-fe200[1234] T152612
  • 14:15 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155906 Add n, n:es and n:fr as import sources in test2wiki (duration: 00m 39s)
  • 14:14 Dereckson: Fix namespaces dupes on sa.wikisource to prepare T101634 / Gerrit:333640
  • 14:09 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: gerrit:333476 (NOOP) Temporarily set $wgDisableUserGroupExpiry to true on labs (duration: 00m 40s)
  • 14:07 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: gerrit:333294 Add *.finds.org.uk to wgCopyUploadsDomains (duration: 00m 41s)
  • 11:54 elukey: whitelisted dbproxy1010 on cr1/cr2 for analytics-in4 input filter
  • 10:50 moritzm: installing pdns-recursor security updates on trusty systems
  • 10:38 moritzm: installing openjpeg security updates
  • 09:06 marostegui: Compress s2 on dbstore2001 - T151552
  • 07:47 marostegui: Enabling gtid_domain_id on db1047 (eventlogging host) - T149418
  • 07:43 marostegui: Enabling gtid_domain_id on db1046 (eventlogging master) - T149418
  • 07:32 marostegui: Deploy gtid_domain_id db1043 (passive master) - last host pending in m3 - T149418
  • 07:28 marostegui: Compressing cebwiki.templatelinks on db1015 (224G table) - T153739
  • 03:02 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jan 23 03:02:32 UTC 2017 (duration 4m 44s)
  • 02:57 reedy@tin: scap sync-l10n completed (1.29.0-wmf.8) (duration: 13m 14s)
  • 02:26 Reedy: running l10nupdate manually
  • 02:23 Reedy: cleaned up reCaptcha extension in l10ncache dirs
  • 02:02 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed

2017-01-22

  • 23:26 mobrovac: restbase deploying d1663345 - blacklist of a bot log page on enwiki
  • 02:01 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed
  • 00:08 krenair@tin: Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/333468 - needed to be handled before next window (duration: 00m 42s)

2017-01-21

  • 21:57 mobrovac@tin: Finished deploy [changeprop/deploy@2b980fa]: (no message) (duration: 00m 54s)
  • 21:56 mobrovac@tin: Starting deploy [changeprop/deploy@2b980fa]: (no message)
  • 20:02 legoktm@tin: Synchronized php-1.29.0-wmf.8/RELEASE-NOTES-1.29: for completeness (duration: 00m 39s)
  • 20:01 legoktm@tin: Synchronized php-1.29.0-wmf.8/resources: Revert "Added reason suggestion in block/delete/protect forms" (1/2) - T34950 (duration: 00m 39s)
  • 20:00 legoktm@tin: Synchronized php-1.29.0-wmf.8/includes: Revert "Added reason suggestion in block/delete/protect forms" (1/2) - T34950 (duration: 01m 31s)
  • 04:03 ema: graphite1003: carbon-cache@c restarted, it's been killed by OOM killer again
  • afk: disabled civicrm dedupe high numbers
  • afk: disabled civicrm dedupe
  • 02:02 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed
  • 01:21 mobrovac@tin: Finished deploy [changeprop/deploy@eb27062]: (no message) (duration: 01m 03s)
  • 01:20 mobrovac@tin: Starting deploy [changeprop/deploy@eb27062]: (no message)
  • 00:36 volans: restarted carbon-cache@c on graphite1003 (was killed by oom-killer)
  • 00:20 mobrovac: restbase deploying 7c753fe6

2017-01-20

  • 23:37 mattflaschen@tin: Synchronized docroot: No-op file rename (duration: 00m 46s)
  • 23:36 mattflaschen@tin: Synchronized dblists: No-op file rename (duration: 00m 54s)
  • 19:55 robh: done fixing ulsfo serial in ulsfo
  • 19:45 robh: messing with ulsfo serial connections
  • 19:29 robh: cp4012 donating its redundant power supply to lvs4002 with redundant supplies
  • 19:12 ejegg: re-enabled fundraising Jenkins jobs
  • 19:01 ejegg: disabled fundraising jenkins jobs
  • 17:22 chasemp: shutdown eth1 on labstore1004 for testing
  • 17:17 andrewbogott: graceful'd apache on silver, in hopes that the wikitech instance api will update
  • 16:18 cmjohnson1: swapping cable eth0 labstore1004 (chasemp)
  • 16:03 jynus: restart and upgrade of db2066
  • 14:42 jynus: restart and upgrade of db2067
  • 13:30 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2047 - T153300 (duration: 00m 39s)
  • 12:32 ladsgroup@tin: Synchronized php-1.29.0-wmf.8/extensions/ORES/includes/Hooks.php: ORES database query fix (T155500) (duration: 00m 40s)
  • 12:13 Amir1: deploy wmf.8 in mwdebug1002 (T155500)
  • 10:48 godog: reload swift-proxy on ms-fe100* to pick up https://gerrit.wikimedia.org/r/333222
  • 10:39 elukey: manually forcing a /etc/init.d/apache2 reload on mw1259 (videoscaler) to replicate the effects of a logrotate run and test why alarms go off.
  • 10:15 moritzm: installing exim bugfix updates from latest jessie point release
  • 10:02 godog: reload swift-proxy on ms-fe1001 to pick up https://gerrit.wikimedia.org/r/333222
  • 09:19 jynus: rolling restart and upgrade of labsdb1009/10/11 to mariadb 10.1.21-2
  • 08:41 marostegui: Remove partitions on metawiki.pagelinks db2047 - T153300
  • 08:25 _joe_: restarting pybal on lvs1003/1006 to pick up config changes
  • 07:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2047 - T153300 (duration: 00m 48s)
  • 07:09 marostegui: Compress pagelinks tables on db1015 - T153739
  • 02:36 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Jan 20 02:36:44 UTC 2017 (duration 5m 34s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.8) (duration: 11m 23s)
  • 00:52 mutante: tin - keyholder disarm and arm again using new passphrase
  • 00:49 mutante: mira - arming keyholder after setting service/dumps/eventlogging/phabricator key passphrases to the same one (T154943)
  • 00:46 mutante: setting all deployment key passphrases to the one used for mw deploy - update key files in private repo (T154943)
  • 00:40 mobrovac@tin: Finished deploy [parsoid/deploy@465f9c4]: Restarting Parsoid everywhere for Node v6 switch T149331 (duration: 04m 21s)
  • 00:39 thcipriani@tin: Synchronized php-1.29.0-wmf.8/includes/specials/SpecialContributions.php: SWAT: SpecialContributions: Username input is not really required T155780 (duration: 00m 39s)
  • 00:35 mobrovac@tin: Starting deploy [parsoid/deploy@465f9c4]: Restarting Parsoid everywhere for Node v6 switch T149331
  • 00:34 thcipriani@tin: Synchronized php-1.29.0-wmf.8/resources/lib/oojs-ui: SWAT: resources: Update OOjs UI with fixes on top of v0.18.3 T155728 (duration: 00m 41s)
  • 00:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Wikidata description taglines shown on English Wikipedia T152743 (duration: 00m 39s)
  • 00:22 volans: apt-upgrading nodejs to v6 on the rest of parsoid hosts (a deploy with restart will follow) T149331

2017-01-19

  • 23:34 mutante: force puppet run on restbase1*
  • 23:32 volans: upgrading node to v6 on wtp1003 T149331
  • 23:31 mutante: force puppet run on restbase2*
  • 23:31 mutante: icinga - replace check command names in puppet_services.cfg for change 333010
  • 23:26 mobrovac@tin: Finished deploy [eventstreams/deploy@0d1d9c6]: Bump preq to 0.5.2 for Node v6 (duration: 02m 11s)
  • 23:24 mobrovac@tin: Starting deploy [eventstreams/deploy@0d1d9c6]: Bump preq to 0.5.2 for Node v6
  • 23:16 mobrovac@tin: Finished deploy [cxserver/deploy@5ae4f8b]: Bump preq to 0.5.2 for Node v6 (duration: 01m 56s)
  • 23:14 mobrovac@tin: Starting deploy [cxserver/deploy@5ae4f8b]: Bump preq to 0.5.2 for Node v6
  • 23:11 mobrovac@tin: Finished deploy [graphoid/deploy@da37386]: Bump preq to 0.5.2 for Node v6 (duration: 02m 21s)
  • 23:10 ejegg: updated SmashPig from f05c9a3 to 03880ce
  • 23:08 mobrovac@tin: Starting deploy [graphoid/deploy@da37386]: Bump preq to 0.5.2 for Node v6
  • 22:59 ppchelko@tin: Finished deploy [eventstreams/deploy@fe77f19]: Deploy for switching to node 6 T149331 (duration: 01m 30s)
  • 22:58 mobrovac@tin: Finished deploy [trending-edits/deploy@0abcf25]: Switching to node 6 T149331 (duration: 01m 59s)
  • 22:58 ppchelko@tin: Starting deploy [eventstreams/deploy@fe77f19]: Deploy for switching to node 6 T149331
  • 22:58 ppchelko@tin: Finished deploy [cxserver/deploy@ff0225e]: Deploy for switching to node 6 T149331 (duration: 02m 12s)
  • 22:56 mobrovac@tin: Finished deploy [electron-render/deploy@f1df2d3]: Switching to node 6 T149331 (duration: 01m 58s)
  • 22:56 mobrovac@tin: Starting deploy [trending-edits/deploy@0abcf25]: Switching to node 6 T149331
  • 22:56 mobrovac@tin: Finished deploy [mobileapps/deploy@cacb3c9]: Switching to node 6 T149331 (duration: 02m 41s)
  • 22:55 ppchelko@tin: Starting deploy [cxserver/deploy@ff0225e]: Deploy for switching to node 6 T149331
  • 22:55 ppchelko@tin: Finished deploy [citoid/deploy@95df861]: Deploy for switching to node 6 T149331 (duration: 02m 18s)
  • 22:54 mobrovac@tin: Starting deploy [electron-render/deploy@f1df2d3]: Switching to node 6 T149331
  • 22:54 mobrovac@tin: Finished deploy [mathoid/deploy@ba3217e]: (no message) (duration: 02m 02s)
  • 22:53 mobrovac@tin: Starting deploy [mobileapps/deploy@cacb3c9]: Switching to node 6 T149331
  • 22:53 mobrovac@tin: Finished deploy [graphoid/deploy@f872f94]: Switching to node 6 T149331 (duration: 01m 39s)
  • 22:53 ppchelko@tin: Starting deploy [citoid/deploy@95df861]: Deploy for switching to node 6 T149331
  • 22:52 ppchelko@tin: Finished deploy [changeprop/deploy@ffd0b8b]: Deploy for switching to node 6 T149331 (duration: 00m 58s)
  • 22:52 mobrovac@tin: Starting deploy [mathoid/deploy@ba3217e]: (no message)
  • 22:51 mobrovac@tin: Starting deploy [graphoid/deploy@f872f94]: Switching to node 6 T149331
  • 22:51 ppchelko@tin: Starting deploy [changeprop/deploy@ffd0b8b]: Deploy for switching to node 6 T149331
  • 22:51 mutante: scb1003,scb1004 - upgrade nodejs
  • 22:51 ppchelko@tin: Finished deploy [changeprop/deploy@ffd0b8b]: Canary deploy for switching to node 6 T149331 (duration: 07m 36s)
  • 22:48 mobrovac@tin: Finished deploy [mathoid/deploy@ba3217e]: (no message) (duration: 03m 24s)
  • 22:47 mobrovac@tin: Finished deploy [graphoid/deploy@f872f94]: Switching to node 6 T149331 (duration: 01m 43s)
  • 22:45 mobrovac@tin: Starting deploy [graphoid/deploy@f872f94]: Switching to node 6 T149331
  • 22:45 mobrovac@tin: Finished deploy [graphoid/deploy@f872f94]: Switching to node 6 T149331 (duration: 01m 10s)
  • 22:45 mobrovac@tin: Starting deploy [mathoid/deploy@ba3217e]: (no message)
  • 22:44 mobrovac@tin: Starting deploy [graphoid/deploy@f872f94]: Switching to node 6 T149331
  • 22:44 ppchelko@tin: Starting deploy [changeprop/deploy@ffd0b8b]: Canary deploy for switching to node 6 T149331
  • 22:41 mutante: scb1001-1004 - upgraded nodejs version
  • 22:38 mutante: scb2003 - repool, scb2001,scb2002 - upgrade nodejs, libuv1 packages
  • 22:35 mutante: scb2003 - depool, upgrade nodejs, libuv1 packages
  • 22:33 mutante: scb2004 - re-pooled
  • 21:35 chasemp: rebooting labstore1004
  • 21:33 chasemp: failover secondary labstore cluster from 1004 to 1004
  • 21:17 chasemp: force non tools on NFS to go ro
  • 21:03 volans: upgrading node to v6 on wtp1002 T149331
  • 20:46 mutante: scb2004 - upgrading nodejs, libuv1
  • 20:45 volans: upgrading node to v6 on wtp2003 T149331
  • 20:44 mutante: depooling scb2004 for nodejs install
  • 20:26 volans: upgrading node to v6 on wtp2002 T149331
  • 20:09 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.8
  • 20:05 dereckson@tin: Synchronized wmf-config/throttle.php: Add throttle rule for BAMU event (T154312) (duration: 00m 39s)
  • 19:44 ejegg: updated SmashPig from 48675c3 to f05c9a3
  • 19:40 mutante: switching dumps.wikimedia.org to Letsencrypt SSL cert
  • 19:20 jynus: restarting db1069:3311 due to query being "stuck" on tokudb table
  • 19:13 demon@tin: Synchronized wmf-config/CommonSettings.php: ContentTranslation: Enable publishing article in testwiki (2/2) (duration: 00m 39s)
  • 19:12 demon@tin: Synchronized wmf-config/InitialiseSettings.php: ContentTranslation: Enable publishing article in testwiki (1/2) (duration: 00m 39s)
  • 19:06 demon@tin: Synchronized wmf-config/CommonSettings.php: Double $wgTranscodeBackgroundTimeLimit to compensate for threading (duration: 00m 47s)
  • 18:54 ema: libvmod-header removed from carbon, varnish-modules provides it
  • 18:51 mutante: dataset1001 - temp disabling puppet, ms1001 - switching to Letsencrypt cert
  • 18:34 jynus: aborting rolling restart on labsdb1010, labsdb1011 due to package bug to be fixed on 10.1.21-2
  • 18:19 mobrovac: restbase updating firejail in production
  • 17:52 jynus: rolling restart and upgrade of labsdb1009/10/11 to mariadb 10.1.21
  • 17:47 Dereckson: Reattach Zlazstadpieroniebomiurwieszkabelodinternetu CentralAuth account (T155184)
  • 17:26 jynus: restarting and upgrading mariadb on labsdb1004 to 10.0.29
  • 14:40 Dereckson: EU SWAT done
  • 14:40 Dereckson: `mwscript namespaceDupes.php sawiki --fix` (T101634)
  • 14:36 ottomata: restarting apache/puppetmaster on labcontrol1001 to try to fix 'invalid byte sequence in US-ASCII' puppet error
  • 14:34 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Fix Portal talk namespace name on Sanskrit Wikipedia (T101634) (duration: 00m 39s)
  • 14:25 dereckson@tin: Synchronized wmf-config: Add noratelimit user right to translation admins on Commons (T155162) (duration: 00m 42s)
  • 13:21 marostegui: Compressing revision, pagelinks and templatelinks tables on db1035 - T110504
  • 11:41 marostegui: Compressing dewiki db1045 - T155399
  • 10:14 marostegui: Compressing templatelinks tables on db1015 - T153739
  • 09:32 moritzm: upgrading firejail on image scalers
  • 08:57 godog: bounce udp2log on fluorine after https://gerrit.wikimedia.org/r/313604
  • 07:45 volans@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2098.codfw.wmnet
  • 07:42 volans@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2098.wmnet
  • 07:28 dereckson@tin: Synchronized wmf-config/throttle.php: Fix throttle rule for KCES IMR edit-a-thon (duration: 02m 42s)
  • 07:18 marostegui: Compressing enwikivoyage.text and shwiki.logging tables on db1044 - T153826
  • 07:14 marostegui: Compressing enwikivoyage.text and shwiki.logging tables on db1038 - T154465
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 07m 28s)
  • 01:50 mutante: install - hosts in private1-c-eqiad and private1-b-codfw now use install1001 for both DHCP and TFTP; hosts in other networks still use carbon for DHCP but install1001 for TFTP
  • 01:47 mutante: install1001/2001 - re-enabled, carbon is still DHCP for some rows
  • 01:23 mutante: install1001 - re-enable puppet - install2001 - same thing, temp disable and live-hack mw2251 to use trusty installer
  • 01:16 mutante: switching mw2251 to trusty-installer for test
  • 01:14 mutante: temp disable puppet on install1001 for papaul debugging
  • 00:24 maxsem@tin: Synchronized php-1.29.0-wmf.8/extensions/Graph: SWAT https://gerrit.wikimedia.org/r/#/c/332916/1 (duration: 00m 40s)

2017-01-18

  • 23:38 demon@tin: Synchronized php-1.29.0-wmf.8/extensions/Collection: Unbreak (duration: 00m 40s)
  • 22:59 demon@tin: Synchronized php-1.29.0-wmf.8/extensions/ProofreadPage/includes/index/ProofreadIndexPage.php: Unbreak, T155682 (duration: 00m 39s)
  • 22:49 demon@tin: Synchronized php-1.29.0-wmf.8/extensions/LiquidThreads/classes/Hooks.php: Unbreak hook mess (duration: 00m 41s)
  • 22:48 demon@tin: Synchronized php-1.29.0-wmf.8/includes/widget/search/FullSearchResultWidget.php: Unbreak hook mess (duration: 00m 45s)
  • 22:46 madhuvishy: Reenabled nfs-exportd and puppet on labstore1004. All of misc being exported as rw now. T154336
  • 21:32 bd808: Updated wikimania-scholarships to 29ba0ec "Add Tulu (tcy) to Communities" (T155666)
  • 21:18 volans: restarted pybal on lvs2001 (active) T134893
  • 21:00 volans: restarted pybal on lvs2004 (passive) T134893
  • 20:48 demon@tin: Finished scap: group1 to wmf.8 (duration: 46m 33s)
  • 20:47 volans: Upgraded nodejs to v6 on wtp1001 T149331
  • 20:31 nuria@tin: Finished deploy [analytics/refinery@666d98d]: (no message) (duration: 02m 19s)
  • 20:28 nuria@tin: Starting deploy [analytics/refinery@666d98d]: (no message)
  • 20:01 demon@tin: Started scap: group1 to wmf.8
  • 19:56 volans: restarted pybal on lvs2003
  • 19:51 volans: restarted pybal on lvs2006
  • 19:26 demon@tin: Synchronized dblists: Remove old compact lang list dblist (duration: 00m 39s)
  • 19:25 volans: Upgrading nodejs to v6 on wtp2001 T149331
  • 19:25 demon@tin: Synchronized docroot/noc/conf: Using new compact lang list dblist (duration: 00m 39s)
  • 19:24 demon@tin: Synchronized tests/cirrusTest.php: Use new compact lang list dblist (duration: 00m 39s)
  • 19:22 papaul: OS installation on mw2251-mw2260
  • 19:22 demon@tin: Synchronized wmf-config: Use new compact lang links dblist (duration: 00m 41s)
  • 19:21 demon@tin: Synchronized dblists/compact-language-links.dblist: New dblist (duration: 00m 39s)
  • 19:12 volans: Upgrading nodejs to v6 on ruthenium T149331
  • 19:06 dereckson@tin: Synchronized wmf-config/throttle.php: Add throttle rule for KCES IMR edit-a-thon (T154312) (duration: 00m 39s)
  • 18:35 demon@tin: Synchronized multiversion/MWMultiVersion.php: Swapping 500 -> 400 when specifying invalid host headers (duration: 00m 39s)
  • 18:28 demon@tin: Synchronized multiversion/MWMultiVersion.php: minor cleanup (duration: 00m 48s)
  • 18:03 madhuvishy: Rolling out https://gerrit.wikimedia.org/r/#/c/332735/ across labs instances T154336
  • 17:54 madhuvishy: Disabled (systemctl disable) nfs-export on labstore1001 and 1004 to prevent auto restart from bringing them back up T154336
  • 17:42 mobrovac: restbase deploying 3027682
  • 17:40 madhuvishy: Starting final sync of latest diff from labstore1001 to labstore-secondary T154336
  • 17:38 madhuvishy: Disabling puppet on labstore1001 and 1004 to make sure nfs exports are not overridden T154336
  • 17:37 madhuvishy: Exporting all misc shares from labstore1004 as RO T154336
  • 17:30 madhuvishy: Stopping nfs-exportd on labstore1004 T154336
  • 17:29 madhuvishy: Exported all misc exports as RO on labstore1001 T154336
  • 17:26 madhuvishy: Stopping nfs-exportd on labstore1001 T154336
  • 17:13 madhuvishy: Disabling puppet across labs instances with NFS (/home and/or /data/project) mounted for T154336
  • 17:12 madhuvishy: Silenced shinken, and icinga on labstore1001 for misc nfs migration T154336
  • 15:45 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw122[6-9].*
  • 15:45 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw123[0-5].*
  • 15:41 oblivian@puppetmaster1001: conftool action : set/weight=15; selector: service=apache2,cluster=api_appserver,dc=eqiad,name=mw1(1[8-9]|2[0-1]|22[0-5]).*
  • 15:24 zeljkof: finished EU SWAT
  • 15:23 zfilipin@tin: Synchronized php-1.29.0-wmf.7/maintenance/importImages.php: SWAT: maintenance/importImages: Don't sleep after the last upload (duration: 00m 41s)
  • 15:07 hashar: tin.eqiad.wmnet : committed an uncommitted live hack for php-1.29.0-wmf.7/includes/AutoLoader.php by ostriches
  • 15:06 zeljkof: extending EU SWAT until 332766 is deployed
  • 14:39 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Increase $wgHTTPImportTimeout to 50 seconds (T155209) (duration: 00m 39s)
  • 14:31 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set wgDisableUserGroupExpiry to true on production, false on labs (T155605) (duration: 00m 40s)
  • 14:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: Set wgDisableUserGroupExpiry to true on production, false on labs (T155605) (duration: 00m 40s)
  • 14:03 moritzm: upgrading firejail on aqs cluster
  • 13:12 moritzm: uploaded firejail 0.9.44.6 for jessie-wikimedia to carbon
  • 12:21 marostegui: Enable gtid_domain_id on m3 - T149418
  • 12:06 moritzm: installing libio-socket-ssl-perl bugfix updates from jessie point release
  • 11:42 moritzm: installing sed bugfix updates from jessie point release
  • 11:17 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2064 - T154097 (duration: 00m 39s)
  • 11:09 moritzm: restarting mediawiki canary servers to pick up cairo and libpng updates
  • 10:38 moritzm: installing libxml security updates
  • 10:38 oblivian@puppetmaster1001: conftool action : set/pooled=no; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw123[0-5].*
  • 10:11 godog: pool ms-fe200[789] T152612
  • 10:10 marostegui: Restart mysql dbstore2001 to enable gtid_domain_id manually before deploying it on m3 - T149418
  • 10:05 marostegui: Remove partitions from enwiktionary.templatelinks on db2064 - T154097
  • 10:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2064 - T154097 (duration: 00m 45s)
  • 09:56 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2063 - T154097 (duration: 00m 48s)
  • 09:40 marostegui: Restart mysql dbstore2002 to enable gtid_domain_id manually before deploying it on m3 - T149418
  • 08:51 marostegui: Remove partitions from enwiktionary.templatelinks on db2063 - T154097
  • 08:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2063 - T154097 (duration: 00m 39s)
  • 08:33 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2060 - T154031 (duration: 00m 40s)
  • 08:18 marostegui: Compressing templatelinks tables on db1035 - T154465
  • 08:16 marostegui: Compressing templatelinks tables on db1038 - T154465
  • 08:00 _joe_: restarting pybal on lvs1003
  • 07:56 _joe_: restarting pybal on lvs1003
  • 07:34 oblivian@puppetmaster1001: conftool action : set/pooled=no; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw122[6-9].*
  • 07:32 _joe_: depooling mw1226-mw1235 from the https pool in eqiad, T152074
  • 07:30 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw12[7-9].*
  • 07:25 marostegui: Restart MySQL dbstore2001 to apply InnoDB defaults
  • 03:05 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jan 18 03:05:47 UTC 2017 (duration 5m 38s)
  • 03:00 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.8) (duration: 13m 22s)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 08m 23s)
  • 02:00 ejegg: updated SmashPig from 03ef6b1 to 48675c3
  • 01:21 mobrovac@tin: Finished deploy [citoid/deploy@9f93a00]: (no message) (duration: 04m 14s)
  • 01:17 mobrovac@tin: Starting deploy [citoid/deploy@9f93a00]: (no message)
  • 00:59 mobrovac@tin: Finished deploy [trending-edits/deploy@1d53b7c]: fixes for T153122 and T145571 (duration: 05m 06s)
  • 00:54 mobrovac@tin: Starting deploy [trending-edits/deploy@1d53b7c]: fixes for T153122 and T145571
  • 00:35 demon@tin: Synchronized w/mobilelanding.php: Last major fix for multiversion (duration: 00m 45s)
  • 00:29 ejegg: updated SmashPig from 3da597f to 03ef6b1
  • 00:00 demon@tin: Synchronized multiversion/MWVersion.php: Swap to using MWMultiVersion and make this a fallback (duration: 00m 39s)

2017-01-17

  • 23:47 demon@tin: Synchronized w: Step 4/∞ of multiversion cleanups (duration: 00m 39s)
  • 23:42 demon@tin: Synchronized multiversion/MWScript.php: Step 3/∞ of multiversion cleanups (duration: 00m 39s)
  • 23:21 demon@tin: Synchronized rpc/RunJobs.php: Step 2/∞ of multiversion cleanups (duration: 00m 39s)
  • 22:30 demon@tin: Synchronized multiversion: Step 1/∞ of multiversion cleanups (duration: 00m 55s)
  • 21:48 demon@tin: Synchronized multiversion/MWVersion.php: Removing old getMediaWikiCli() entry point, unused (duration: 00m 39s)
  • 20:41 demon@tin: Synchronized php-1.29.0-wmf.8/extensions/LiquidThreads/classes/Hooks.php: Fix warning about pass-by-ref (duration: 00m 40s)
  • 20:09 ema: restarting hhvm on mw1227 - hhvm-dump-debug in /tmp/hhvm.25127.bt
  • 20:08 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.8
  • 19:52 demon@tin: Finished scap: testwiki to wmf.8 + rebuild l10n (duration: 46m 21s)
  • 19:36 ejegg: updated payments-wiki from 1f9ea80 to c22353b
  • 19:06 demon@tin: Started scap: testwiki to wmf.8 + rebuild l10n
  • 19:05 demon@tin: Synchronized wmf-config/throttle.php: throttle rule for T155493 (duration: 00m 40s)
  • 19:04 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Enable subpages in NS_MAIN in eswikiversity (duration: 00m 39s)
  • 19:02 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Add *.leventhalmap.org to the copyupload whitelist (duration: 00m 39s)
  • 19:01 demon@tin: Synchronized wmf-config/InitialiseSettings.php: avwiki namespace tweaks, T155321 (duration: 00m 39s)
  • 19:00 demon@tin: Synchronized wmf-config/throttle.php: T155510 throttle rule (duration: 01m 36s)
  • 18:03 mobrovac: restbase deploying a0e542b, switching to Node v6 T149331
  • 17:48 mobrovac: restbase installing node v6.9.1 on the cluster T149331
  • 16:29 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2056 - T154097 (duration: 00m 48s)
  • 16:22 marostegui: Powering off db2060 for maintenance - T154031
  • 16:16 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2005.codfw.wmnet
  • 15:01 moritzm: installing bash security updates
  • 14:10 moritzm: installing bind9 security updates
  • 13:33 moritzm: installing libpng security updates
  • 12:29 marostegui: Remove partitions from enwiktionary.templatelinks on db2056 - T154097
  • 12:27 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2056 - T154097 (duration: 00m 42s)
  • 12:16 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2049 - T154097 (duration: 00m 47s)
  • 11:54 moritzm: installing potrace security updates
  • 11:32 moritzm: installing w3m security updates
  • 11:22 moritzm: installing tre security updates
  • 11:19 moritzm: installing python-werkzeug security updates
  • 11:15 marostegui: Remove partitions from enwiktionary.templatelinks on db2049 - T154097
  • 11:13 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2049 - T154097 (duration: 00m 38s)
  • 10:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2041 - T154097 (duration: 00m 38s)
  • 10:43 moritzm: installing file/libmagic security updates
  • 10:35 hashar: CI switched NodeJS from v4 to v6 T155443 T149331
  • 10:22 moritzm: installing jq security updates
  • 10:16 hashar: Updating CI Jessie image for NodeJs 4 -> 6 upgrade. T155443
  • 10:06 moritzm: installing libwmf security updates
  • 10:02 marostegui: Remove partitions from enwiktionary.templatelinks on db2041 - T154097
  • 09:59 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2041 - T154097 (duration: 00m 38s)
  • 09:22 moritzm: installing tomcat security updates
  • 08:42 marostegui: Compressing wikidatawiki on db1026 - https://phabricator.wikimedia.org/T154929
  • 08:33 moritzm: installing tiff security updates
  • 07:50 marostegui: Remove partitions from enwiktionary.templatelinks on dbstore2001 - T154097
  • 07:26 marostegui: Compressing revision tables db1035 (depooled)
  • 06:58 marostegui: Compressing cebwiki/templatelinks (215G) table on db1038 - T154465
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jan 17 02:26:07 UTC 2017 (duration 4m 22s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 07m 20s)

2017-01-16

  • 18:42 moritzm: uploaded nodejs 6.9.1 for jessie-wikimedia to carbon
  • 15:01 elukey: restarting hhvm on mw1167 - hhvm-dump-debug in /tmp/hhvm.20360.bt
  • 14:47 hashar: European SWAT complete
  • 14:44 hashar@tin: Synchronized wmf-config/throttle.php: Add a new throttle rule - T155416 (duration: 00m 38s)
  • 14:26 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Namespace aliases on Bhojpuri Wikipedia (bhwiki) - T155278 (duration: 00m 41s)
  • 14:20 hashar@tin: Synchronized wmf-config/throttle.php: Add one throttle rule + remove obsolete ones T155345 (duration: 00m 38s)
  • 14:18 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: wgBabelMainCategory for cswikiversity to Uživatel %code% T155301 (duration: 00m 39s)
  • 13:12 moritzm: installing pysaml2 security updates
  • 12:26 moritzm: installing pdns-recursor security updates
  • 10:35 marostegui: Compressing templatelinks tables on db1044 (depooled) - T153826
  • 10:30 marostegui: Compressing pagelinks tables on db1038 - T154465
  • 09:13 marostegui: Compressing dewiki on db1026 - T154929
  • 08:56 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2034 - T149553 (duration: 00m 38s)
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jan 16 02:25:46 UTC 2017 (duration 4m 21s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 08m 04s)

2017-01-15

  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Jan 15 02:25:27 UTC 2017 (duration 4m 23s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 07m 46s)

2017-01-14

  • 02:35 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Jan 14 02:35:07 UTC 2017 (duration 4m 25s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 12m 03s)

2017-01-13

  • 22:56 godog: delete labs instance data older than 60d from graphite[21]001, low disk space
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Jan 13 02:25:57 UTC 2017 (duration 5m 16s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 07m 11s)
  • 01:32 bsitzmann@tin: Finished deploy [trending-edits/deploy@cf388a9]: Update trending-edits to 421fa63 (duration: 01m 53s)
  • 01:30 bsitzmann@tin: Starting deploy [trending-edits/deploy@cf388a9]: Update trending-edits to 421fa63
  • 00:12 demon@tin: Synchronized multiversion: Clean up cli entry point (duration: 00m 54s)

2017-01-12

  • 23:05 demon@tin: Synchronized README: no-op for force co-master sync (duration: 00m 40s)
  • 22:49 demon@tin: Synchronized docroot/foundation: Yay no more powerpoints (duration: 00m 38s)
  • 22:38 demon@tin: Synchronized docroot/foundation/presentations: removing some of these powerpoints (duration: 00m 38s)
  • 22:08 maxsem@tin: Synchronized php-1.29.0-wmf.7/extensions/Graph/includes/ApiGraph.php: Debug for T155057 (duration: 00m 38s)
  • 18:13 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Use HD logos for (nap|os|pl|pt)wiki (duration: 00m 41s)
  • 18:12 demon@tin: Synchronized static/images/project-logos: HD logos for (nap|os|pl|pt)wiki (duration: 00m 39s)
  • 09:16 akosiaris: T155112 upload Vagrant 1.9.1 to apt.wikimedia.org/jessie-wikimedia/thirdparty and apt.wikimedia.org/trusty-wikimedia/thirdparty
  • 08:59 hashar: disabling puppet on contint1001 to live hack apache conf ( T150727 )
  • 02:46 demon@tin: Synchronized wmf-config/interwiki.php: T154225 (duration: 00m 38s)
  • 02:37 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: no-op, completeness (duration: 00m 38s)
  • 02:36 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Jan 12 02:36:23 UTC 2017 (duration 5m 15s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 11m 12s)
  • 01:33 Reedy: running scap pull on tin
  • 00:29 demon@tin: Synchronized wmf-config/InitialiseSettings.php: oathauth group for wikitech (duration: 00m 38s)
  • 00:21 demon@tin: Synchronized docroot/noc/db.php: (no message) (duration: 00m 39s)
  • 00:17 reedy@tin: Synchronized wmf-config: More consistency for various commits (duration: 00m 40s)
  • 00:12 nuria: restarted apache2 and mysql on bohrium to see if the mysql "no connection" errors disappear
  • 00:12 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Wikidata lang config (duration: 00m 38s)

2017-01-11

  • 23:55 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Minerva hacks (duration: 00m 38s)
  • 23:47 reedy@tin: Synchronized wmf-config: consistency (duration: 00m 41s)
  • 23:16 demon@tin: Synchronized wmf-config/CommonSettings.php: video transcode jobqueue stuff for Brion (duration: 00m 38s)
  • 23:10 demon@tin: Synchronized w/static.php: For Timo <3 (duration: 00m 40s)
  • 22:48 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Use new metawiki logo, no-op in prod (duration: 00m 38s)
  • 22:47 demon@tin: Synchronized static/images/project-logos: beta logos (duration: 00m 40s)
  • 22:44 demon@tin: Synchronized wmf-config/CommonSettings.php: commentfix (duration: 00m 38s)
  • 22:39 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Simplify ores config. Enable sitenotice banners for arwiki (duration: 00m 39s)
  • 22:32 demon@tin: Synchronized wmf-config/abusefilter.php: comment fix (duration: 00m 39s)
  • 22:26 elukey: added mw1239.eqiad.wmnet back to service - T148421
  • 22:20 elukey: restarting hhvm on mw1198 (dump-debug in /tmp/hhvm.9737.bt)
  • 22:13 demon@tin: Synchronized wmf-config/abusefilter.php: Set $wgAbuseFilterNotificationsPrivate = true; for Meta-Wiki (duration: 00m 40s)
  • 22:04 demon@tin: Synchronized wmf-config/flaggedrevs.php: Deprecated variable cleanup (duration: 00m 38s)
  • 21:54 demon@tin: Synchronized scap/plugins/wmf-beta-autoupdate.py: no-op, not yet used (duration: 00m 38s)
  • 21:45 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: no-op (duration: 00m 38s)
  • 21:28 demon@tin: Synchronized wmf-config: massmessage hack cleanup + comments on kartographer wikivoyage mode (duration: 00m 41s)
  • 21:17 demon@tin: Synchronized wmf-config/InitialiseSettings.php: use new HD logos (duration: 00m 38s)
  • 21:16 demon@tin: Synchronized static/images/project-logos/notifications: New HD logos (duration: 00m 38s)
  • 21:10 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: pawiki logos. remove technican from trwikiquote (duration: 00m 38s)
  • 21:09 reedy@tin: Synchronized static/images: pawiki (duration: 00m 42s)
  • 21:03 Reedy: update collation of fiwikivoyage T151570
  • 21:02 reedy@tin: Synchronized php-1.29.0-wmf.7/extensions/CentralAuth/extension.json: fix name (duration: 00m 41s)
  • 21:01 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Translation namespace for mlwikisource. fiwikivoyage collation (duration: 00m 40s)
  • 20:53 reedy@tin: Synchronized wmf-config/CommonSettings.php: Upgrade Collections license URL to HTTPS (duration: 00m 57s)
  • 20:52 reedy@tin: Synchronized wmf-config/throttle.php: Fix throttle (duration: 00m 42s)
  • 20:44 Dereckson: Reset user e-mail for account Panam2014
  • 20:43 demon@tin: Synchronized tests/noc-conf/NOCDblistTest.php: No-op (duration: 00m 40s)
  • 20:40 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: no-op (duration: 00m 40s)
  • 19:36 demon@tin: Synchronized multiversion: rollback (duration: 00m 56s)
  • 19:34 demon@tin: Synchronized multiversion: MWVersion fallbacks & such (duration: 00m 56s)
  • 19:27 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/FlaggedRevs: Stupid errors (duration: 00m 46s)
  • 18:56 demon@tin: Synchronized multiversion/MWMultiVersion.php: Attempt #2 for Multiversion cleanup (duration: 00m 41s)
  • 18:08 ebernhardson: restart elasticsearch on relforge100[12] for new test version of ltr plugin
  • 14:12 hashar: European SWAT completed
  • 14:11 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Import source on bd.wikimedia.org T154990 + Turn off patrolling on ruwiki T154285 (duration: 00m 42s)
  • 14:10 hashar@tin: Synchronized composer.json: build: Update PHPUnit from 3.7 to 4.8, add phplint to composer-test - T85947 (duration: 00m 45s)
  • 14:09 hashar@tin: Synchronized composer.lock: build: Update PHPUnit from 3.7 to 4.8, add phplint to composer-test - T85947 (duration: 00m 55s)
  • 14:06 hashar: scap pull on terbium
  • 02:35 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jan 11 02:35:46 UTC 2017 (duration 4m 31s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 10m 47s)
  • 00:39 _joe_: restart hhvm on mw1182, stuck on HPHP::Treadmill::getAgeOldestRequest

2017-01-10

  • 23:09 hoo: Ran DELETE FROM wbc_entity_usage WHERE eu_row_id IN(1714177, 1714178, 1714179, 1714180, 1714181, 1714182, 1714183, 1714184, 3914375); on s5 master (T147630)
  • 22:53 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/VisualEditor/ApiVisualEditor.php: T154962 logspam (duration: 00m 41s)
  • 22:46 mutante: gerrit restarting for config change 331553
  • 22:13 reedy@tin: Synchronized php-1.29.0-wmf.7/includes/registration/ExtensionRegistry.php: (no message) (duration: 00m 43s)
  • 22:12 reedy@tin: Synchronized php-1.29.0-wmf.7/extensions/Wikidata: (no message) (duration: 02m 21s)
  • 22:06 demon@tin: Synchronized php-1.29.0-wmf.7/includes/libs/objectcache/WANObjectCache.php: Silence obnoxious replag errors (duration: 00m 42s)
  • 21:33 eileen2: civicrm updated from b26844d to af8d735
  • 21:22 mutante: cp3048 - labservices1001 - ran puppet, in this case it wasn't about gerrit, but recovered too
  • 21:15 mutante: sca2004, labsdb1003 - ran puppet (they wanted to git clone during gerrit restart)
  • 21:04 mutante: gerrit restarting for config change 49993 (T40114)
  • 20:16 dereckson@tin: Synchronized php-1.29.0-wmf.7/extensions/SemanticMediaWiki: Remove deprecated function usages (T147924) (duration: 00m 49s)
  • 20:14 dereckson@tin: Synchronized php-1.29.0-wmf.7/extensions/UploadWizard/resources/transports/mw.FormDataTransport.js: mw.FormDataTransport: Don't remove Unicode characters from temp filename (T155039) (duration: 00m 41s)
  • 19:49 demon@tin: Synchronized scap/plugins/prep.py: another no-op (duration: 00m 41s)
  • 19:39 demon@tin: Synchronized scap/plugins/prep.py: prod no-op, for completeness (duration: 00m 40s)
  • 18:41 legoktm: re-attached User:Fuu5tgsrygr / T154983
  • 04:30 dereckson@tin: Synchronized wmf-config/throttle.php: Fix Dayanand College Solapur event throttle rule (T154312) (duration: 00m 44s)
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jan 10 02:26:47 UTC 2017 (duration 4m 22s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 07m 46s)
  • 01:34 reedy@tin: Synchronized rpc/RunJobs.php: revert (duration: 00m 40s)
  • 01:32 reedy@tin: Synchronized w: revert 0a2a096 (duration: 00m 40s)
  • 01:31 reedy@tin: Synchronized multiversion: revert 0a2a096 (duration: 00m 56s)
  • 01:01 Dereckson: Updated articles count on pl.wikisource: 491 100 (T154711)
  • 00:59 Dereckson: Fixed links with namespaceDupes on pl.wikisource
  • 00:58 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Add Collection namespace to the Polish Wikisource (T154711) (duration: 00m 41s)

2017-01-09

  • 23:38 reedy@tin: Synchronized wmf-config/CommonSettings.php: wfLoadExtension (duration: 00m 40s)
  • 23:34 reedy@tin: Synchronized wmf-config/extension-list: More to extension.json (duration: 00m 40s)
  • 23:29 demon@tin: Synchronized multiversion: Final batch of MWVersion cleanup (in song form) (duration: 00m 56s)
  • 23:28 demon@tin: Synchronized rpc/RunJobs.php: More cleanup songs (duration: 00m 40s)
  • 23:28 mutante: ganglia web - replacing SSL cert with Letsencrypt
  • 23:26 demon@tin: Synchronized w: Cleanup cleanup everybody do your share (duration: 00m 40s)
  • 23:25 demon@tin: Synchronized multiversion/MWMultiVersion.php: Cleanup cleanup everybody everywhere (duration: 00m 40s)
  • 23:09 reedy@tin: Synchronized wmf-config/CommonSettings-labs.php: T154927 (duration: 00m 41s)
  • 23:08 reedy@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T154927 (duration: 00m 42s)
  • 23:07 reedy@tin: Synchronized wmf-config/extension-list-labs: T154927 (duration: 00m 41s)
  • 22:53 robh: updating lists.w.o to use LE cert
  • 21:50 akosiaris: service restart zotero on sca1003, sca1004. Zotero OOMed again as usual
  • 19:49 robh: updating librenms.wikimedia.org cert, netmon1001 only system affected
  • 18:16 paravoid: rebooting and powercycling mira, CPU frequency throttled, suspecting firmware bug
  • 17:51 hoo: Updated the Wikidata property suggester with data from last Monday's JSON dump and applied the T132839 workarounds
  • 15:35 zeljkof: finished EU SWAT!
  • 15:33 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add en.wikinews and es.wikinews as import source in testwiki (T154879) (duration: 02m 38s)
  • 15:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable import from cswiki to arbcom_cswiki (T154799) (duration: 02m 38s)
  • 15:10 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add DW alias for NS_PROJECT_TALK in frwiki (T153952) (duration: 02m 36s)
  • 15:04 zeljkof: extending EU SWAT, three more patches left to deploy
  • 15:00 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: [throttle] Lift for 2017-01-10/12 + minor cleanup (T154312) (duration: 02m 36s)
  • 14:52 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Extension:Babel's category on cswikiversity (T67211) (duration: 02m 36s)
  • 14:39 zfilipin@tin: Synchronized static/images/project-logos: SWAT: Add HD logos for multiple projects (T150618) (duration: 02m 36s)
  • 14:25 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add digitalmedia.fws.gov to the whitelist (T154671) (duration: 02m 38s)
  • 10:38 akosiaris: restart nginx and rcstream on rcs1001.eqiad.wmnet to debug issue with prematurely closed connections and 502 returned to clients. No change witnessed.
  • 07:12 ebernhardson: restart elasticsearch on relforge100[12] to adjust ltr logging settings
  • 02:45 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jan 9 02:45:57 UTC 2017 (duration 4m 36s)
  • 02:41 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 27m 03s)

2017-01-08

  • 02:27 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Jan 8 02:27:29 UTC 2017 (duration 4m 40s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 07m 33s)

2017-01-07

  • 19:07 ejegg: disabled dedupe civicrm contacts
  • afk: re-enabled dedupe civicrm contacts
  • afk: disabled dedupe civicrm contacts
  • 13:31 dcausse: elastic@codfw removing/readding replicas for viwiki_general and zhwiki_content (affected by something similar to https://github.com/elastic/elasticsearch/issues/12661) - T154765
  • 11:35 _joe_: from mendelevium
  • 11:35 _joe_: restarted apache/otrs, removed an 8 GB error.log
  • 05:11 Dereckson: Update statistics count on so.wikipedia (T154833)
  • 02:35 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Jan 7 02:35:56 UTC 2017 (duration 5m 20s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 11m 08s)

2017-01-06

  • 23:08 godog: force puppet run on cache_upload in eqiad to switch thumbs back from codfw
  • 22:38 demon@tin: Synchronized multiversion: updateBranchPointers consolidation (duration: 00m 56s)
  • 20:14 demon@tin: Synchronized w: Dropping old entry point (duration: 00m 41s)
  • 19:35 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/UploadWizard/resources: I32e0b8 (duration: 00m 40s)
  • 19:29 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/UploadWizard/resources/mw.UploadWizard.js: I32e0b8 (duration: 00m 59s)
  • 19:17 ebernhardson: restarting elasticsearch on relforge100[12] to test new search-ltr plugin
  • 19:07 ostriches: gerrit: Started full reindex of all changes, should be background but will be watching
  • 18:59 mutante: gerrit restarting for config change 308753 - will be back in seconds
  • 18:13 mutante: mw1205 - restarted hhvm
  • 18:08 godog: force puppet run on eqiad cache_upload to switch thumbs to codfw
  • 16:17 godog: bounce swift-proxy on ms-fe100[123] leave ms-fe1004 for investigation
  • 16:14 hashar: Restarting Nodepool
  • 16:05 ema: wiping codfw caches T154758
  • 15:29 cmjohnson1: powering off mw1239 to reseat DIMM
  • 15:22 papaul: elastic2025-elastic2036 - signing puppet certs, salt-key, initial run
  • 14:53 mark: papaul powercycled asw-a7-codfw 14:50
  • 14:16 reedy@tin: Finished scap: Rebuild message cache for Echo api messages being missing T154110 (duration: 25m 00s)
  • 13:51 reedy@tin: Started scap: Rebuild message cache for Echo api messages being missing T154110
  • 10:00 ariel@tin: Synchronized wmf-config/throttle.php: test, noop (duration: 02m 45s)
  • 09:24 paravoid: asw-a7-codfw is down, serial console unresponsive
  • 09:12 ariel@tin: Synchronized wmf-config/throttle.php: Adjust throttle rule for Maharashtra 'Edit Wikipedia' workshop (VNGIASS) (duration: 02m 46s)
  • 07:08 moritzm: installing crypto++ security updates on trusty hosts
  • 04:37 matt_flaschen: Finished FlowFixInconsistentBoards.php (production mode) on all wikis
  • 04:27 matt_flaschen: Started FlowFixInconsistentBoards.php (production mode) on all wikis
  • 04:10 mutante: Icinga now using Letsencrypt cert and all good
  • 03:42 mutante: icinga - debugging issue with cert change
  • 03:05 papaul: OS installation on elastic2025-elastic2036
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 13m 23s)
  • 01:27 bd808: Restarted logstash on logstash1002 (T154732)
  • 01:06 Krinkle: stream.wikimedia.org problems - nginx responds with HTTP 502 Bad Gateway to most requests
  • 01:01 ostriches: gerrit: back up from upgrade
  • 01:00 ostriches: gerrit: down for upgrade
  • 00:13 mutante: analytics1036, ms-fe1003 - ran puppet to fix Icinga
  • 00:11 mutante: carbon - stopping ganglia-monitor-aggregator for good

2017-01-05

  • 23:49 mutante: rolling out exim4 upgrades (DSA 3747-1) on all remaining eqiad (all-eqiad)
  • 23:46 godog: fix root-owned files on puppetmaster1001:/var/lib/git/operations/private/ causing /srv/private post-commit hook to fail
  • 23:18 mutante: switching eqiad ganglia aggregator - running puppet on install1001 - disabling on carbon, re-enabling puppet across eqiad
  • 23:05 mutante: temp disabling puppet on all eqiad hosts via salt - during ganglia aggregator switch
  • 22:51 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/Echo/includes/api: silence api warnings (duration: 02m 46s)
  • 22:40 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/VisualEditor: silence some api warnings (duration: 02m 48s)
  • 22:22 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/CodeReview/api/ApiQueryCodeComments.php: silence some warnings (duration: 02m 46s)
  • 21:59 mutante: rolling out exim4 upgrades (DSA 3747-1) on all remaining ones in codfw (all-codfw)
  • 21:53 ejegg: updated payments wiki from 21ea9bc to 1f9ea80
  • 21:53 mutante: mx1001 - upgrading exim4 packages, exim4-daemon-heavy, forcing puppet run
  • 21:35 mutante: mx2001 - upgrading exim4 packages, exim4-daemon-heavy, forcing puppet run
  • 21:22 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.7
  • 20:58 mutante: mendelevium (OTRS) - upgrade exim4 packages, force puppet run
  • 20:45 mutante: iridium (phabricator) - upgrade exim4 packages, force puppet run
  • 20:29 demon@tin: Synchronized docroot/wikipedia.org: removing junk 15 stuff (duration: 04m 50s)
  • 20:18 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.7
  • 20:07 demon@tin: Synchronized docroot: Ok last docroot thing for today I promise (duration: 05m 00s)
  • 19:52 demon@tin: Synchronized docroot: Final bit of this round of docroot cleanup (duration: 05m 00s)
  • 19:44 thcipriani@tin: Finished scap: SemanticForms l10n cache rebuild for 1.29.0-wmf.7 (duration: 39m 21s)
  • 19:35 mutante: mira - upgrade exim, python-requests, linux-image
  • 19:29 mutante: rolled out exim4 upgrades (DSA 3747-1) on contint1001/2001, dbmonitor1001/2001, tungsten, seaborgium, hassium, labstore*
  • 19:05 thcipriani@tin: Started scap: SemanticForms l10n cache rebuild for 1.29.0-wmf.7
  • 18:34 mutante: fermium (lists server) - upgrading exim packages, exim4-daemon-heavy, forcing puppet run
  • 18:29 yurik@tin: Finished deploy [graphoid/deploy@d20b00e]: (no message) (duration: 03m 11s)
  • 18:26 yurik@tin: Starting deploy [graphoid/deploy@d20b00e]: (no message)
  • 18:25 arlolra: Updated Parsoid to 974dd5b3 (T143183, T102134, T113044)
  • 18:25 mutante: rolling out exim4 upgrades (DSA 3747-1) on prometheus, mwlog, pollux, labmon, lithium, hassaleh, dubnium, graphite1002, tin, serpens, bromine, dataset1001
  • 18:16 arlolra@tin: Finished deploy [parsoid/deploy@465f9c4]: Updating Parsoid to 974dd5b3 (duration: 10m 46s)
  • 18:15 mutante: rolling out exim4 upgrades (DSA 3747-1) on puppetmaster, yubiauth, oresrdb, (oresrdb1001 - Unknown installation error.. eh.. this is new)
  • 18:05 arlolra@tin: Starting deploy [parsoid/deploy@465f9c4]: Updating Parsoid to 974dd5b3
  • 17:57 robh: shutting down mw2075-2089 for decom per T154621
  • 17:35 robh: disabling puppet on mw2075-2089 to decommission them today.
  • 17:15 mutante: rolling out exim4 upgrades (DSA 3747-1) on ruthenium, einsteinium (icinga), etherpad1001, rhodium, | einsteinium: upgrade python packages, kernel | xenon: apt-get autoremove, upgrade python- arcconf, libs...
  • 17:07 mutante: scandium (zuul merger), upgrade exim, python-requests, kernel version
  • 17:06 mutante: rolling out exim4 upgrades (DSA 3747-1) on kraz, wdqs2003, wezen, zosma, tegmen, rutherfordium. upgrade kernel and python-requests on zosma
  • 16:59 mutante: rolling out exim4 upgrades (DSA 3747-1) on all-db-noncore, all-mw-eqiad, restbase-eqiad, kafka-main
  • 16:52 mutante: rolling out exim4 upgrades (DSA 3747-1) on notebook, lvs-canary, lvs, mw-maintenance, all-mw-codfw
  • 16:33 mutante: cobalt upgrading exim packages
  • 16:28 mutante: rolling out exim4 upgrades (DSA 3747-1) on mw-api, swift-fe, swift-be, sca, scb, misc-analytics
  • 16:21 mutante: rolling out exim4 upgrades (DSA 3747-1) on db-core-eqiad, db-misc-servers, videoscaler
  • 15:59 moritzm: upgrading firejail on thumbor servers
  • 15:44 chasemp: labstore1005 systemctl disable create-dbusers
  • 15:01 mobrovac: restbase restarting for firejail upgrade
  • 14:46 moritzm: upgrading firejail on image scalers
  • 14:19 moritzm: upgrading firejail on restbase production hosts
  • 14:15 hashar@tin: Synchronized php-1.29.0-wmf.7/extensions/ContentTranslation: Workaround to fix restoration for truncated section ids - T154279 (duration: 02m 10s)
  • 14:10 moritzm: upgrading firejail on restbase staging hosts
  • 13:46 moritzm: installing audiofile security updates
  • 13:09 mobrovac@tin: Finished deploy [trending-edits/deploy@c5d239b]: Restart for firejail upgrade (duration: 00m 46s)
  • 13:08 mobrovac@tin: Starting deploy [trending-edits/deploy@c5d239b]: Restart for firejail upgrade
  • 13:07 mobrovac@tin: Finished deploy [electron-render/deploy@b2a820e]: Restart for firejail upgrade (duration: 05m 29s)
  • 13:02 mobrovac@tin: Starting deploy [electron-render/deploy@b2a820e]: Restart for firejail upgrade
  • 13:01 mobrovac@tin: Finished deploy [electron-render/deploy@b2a820e]: Restart for firejail upgrade (duration: 03m 40s)
  • 12:57 mobrovac@tin: Starting deploy [electron-render/deploy@b2a820e]: Restart for firejail upgrade
  • 12:45 mobrovac@tin: Finished deploy [mobileapps/deploy@c39bd1f]: Restart for firejail upgrade (duration: 04m 05s)
  • 12:41 mobrovac@tin: Starting deploy [mobileapps/deploy@c39bd1f]: Restart for firejail upgrade
  • 12:41 mobrovac@tin: Finished deploy [mathoid/deploy@79fdd56]: Restart for firejail upgrade (duration: 00m 41s)
  • 12:40 mobrovac@tin: Starting deploy [mathoid/deploy@79fdd56]: Restart for firejail upgrade
  • 12:40 mobrovac@tin: Finished deploy [graphoid/deploy@151f26c]: Restart for firejail upgrade (duration: 00m 43s)
  • 12:39 mobrovac@tin: Starting deploy [graphoid/deploy@151f26c]: Restart for firejail upgrade
  • 12:38 mobrovac@tin: Finished deploy [cxserver/deploy@0279029]: Restart for firejail upgrade (duration: 00m 39s)
  • 12:37 mobrovac@tin: Starting deploy [cxserver/deploy@0279029]: Restart for firejail upgrade
  • 12:36 mobrovac@tin: Finished deploy [citoid/deploy@da96f4b]: (no message) (duration: 01m 06s)
  • 12:35 mobrovac@tin: Starting deploy [citoid/deploy@da96f4b]: (no message)
  • 12:24 moritzm: installing firejail security updates on scb
  • 11:13 akosiaris: rebooting bast3001, T154603
  • 11:11 moritzm: uploaded firejail 0.9.44+wmf2 for jessie-wikimedia to carbon
  • 07:54 elukey: chown www-data:www-data all the root:adm hhvm log files on mw eqiad hosts (T132324)
  • 07:11 marostegui: Compressing revision tables across all the wikis - db1038 - T154465
  • 07:09 marostegui: Compressing pagelinks tables across all the wikis - db1044 - T153826
  • 07:08 marostegui: Compressing revision tables across all the wikis - db1015 - T153739
  • 06:15 bd808: sudo -u l10nupdate rm /var/lock/scap on tin to clean up lock left by bad l10nupdate locking attempt
  • 05:49 mutante: rolling out exim4 upgrades (DSA 3747-1) on swift-fe-codfw, swift-be-codfw, ALL remaining mw
  • 05:44 mutante: rolling out exim4 upgrades (DSA 3747-1) on db-core-codfw, etcd, graphite, kafka-analytics-canary, kafka-analytics, logstash
  • 05:39 mutante: rolling out exim4 upgrades (DSA 3747-1) on prometheus, aqs, db-es
  • 05:35 mutante: rolling out exim4 upgrades (DSA 3747-1) on cp-eqiad, memcached-eqiad
  • 01:48 mutante: rolling out exim4 upgrades (DSA 3747-1) on redis-codfw (rdb2005 needed manual) and all of dc-ulsfo, dc-esams
  • 01:41 mutante: rolling out exim4 upgrades (DSA 3747-1) on ganeti, cp-ulsfo, wdqs, thumbor, db-es-codfw
  • 01:36 mutante: rolling out exim4 upgrades (DSA 3747-1) on parsoid, maps, cp-esams
  • 01:32 mutante: servermon - after the next update by cron - package data is back
  • 01:13 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings.php: Disable NewUserMessage on gomwiki (duration: 00m 41s)
  • 01:10 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2090.codfw.wmnet
  • 01:06 demon@tin: Synchronized scap/plugins: (no message) (duration: 00m 40s)
  • 01:04 mattflaschen@tin: Synchronized php-1.29.0-wmf.7/extensions/Flow: Flow script to add more troubleshooting information to a maintenance script (duration: 00m 56s)
  • 00:57 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2090.codfw.wmnet
  • 00:57 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2089.codfw.wmnet
  • 00:57 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2088.codfw.wmnet
  • 00:57 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2087.codfw.wmnet
  • 00:57 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2086.codfw.wmnet
  • 00:56 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2085.codfw.wmnet
  • 00:55 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2084.codfw.wmnet
  • 00:55 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2083.codfw.wmnet
  • 00:55 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2082.codfw.wmnet
  • 00:55 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2081.codfw.wmnet
  • 00:54 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2080.codfw.wmnet
  • 00:54 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2080.codfw.wmnet
  • 00:52 mattflaschen@tin: Synchronized php-1.29.0-wmf.6/extensions/Flow: Two Flow fixes related to production database/content inconsistencies. (duration: 00m 59s)
  • 00:42 mutante: phab2001 - same chmod 751 on exim4 dirs that i manually did on krypton is done by puppet here, fully automatic, not sure why krypton was a one-off
  • 00:41 mutante: planet1001/2001, phab2001 - upgrade exim4, exim4-daemon-heavy
  • 00:37 mutante: servermon - weird behaviour in the "pending package upgrades" list? exim4 package was shown as pending on lots of hosts, after next upgrade it disappears from list, even though about half the servers should still be listed
  • 00:23 mutante: krypton - chmod 751 /var/spool/exim4/ to fix Icinga alerts about inaccessible tmpfs (nagios user could not access), it was 751 on other hosts like ununpentium
  • 00:17 mutante: krypton - stop exim, umount orphaned "scan" tmpfs (there is no clamav here)
  • 00:00 mutante: gerrit slowdown reported around 23:55 UTC, was back to normal after 2 minutes (T148478) - attaching latest jvm_gc log
  • 00:00 aaron@tin: Synchronized wmf-config/logging.php: No-op sync of 7e103f2 (duration: 00m 42s)

2017-01-04

  • 23:55 mutante: krypton - chown Debian-exim:Debian-exim /var/spool/exim4/scan/ to fix Icinga-reported DISK issue - wrong permissions - see puppet/modules/exim4/manifests/init.pp line 57 ff "catch-22 with Puppet vs. package"
  • 23:20 mutante: rolled out exim4 upgrades (DSA 3747-1) on memcached-canary, memcached-codfw, restbase-codfw, cp-codfw
  • 23:11 aaron@tin: Synchronized wmf-config/logging.php: Include DB shard as a logstash column (duration: 00m 41s)
  • 23:09 mutante: rolling out exim4 upgrades (DSA 3747-1) on memcached-canary, memcached-codfw, restbase-codfw
  • 23:06 mutante: rolling out exim4 upgrades (DSA 3747-1) on mw-eqiad
  • 22:52 robh: all my server depools and decoms for the mw range are on T154621
  • 22:51 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2078.codfw.wmnet
  • 22:51 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2077.codfw.wmnet
  • 22:51 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2076.codfw.wmnet
  • 22:50 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2075.codfw.wmnet
  • 22:49 godog: rename / reimage restbase-test1* to restbase-dev1*
  • 22:46 bblack: TLS: unified certificates in esams switching to digicert
  • 22:19 mutante: rolling out exim4 upgrades (DSA 3747-1) on mw-codfw
  • 21:55 Krinkle: mwscript deleteEqualMessages.php --wiki nowikinews (T45917)
  • 21:55 Krinkle: mwscript deleteEqualMessages.php --wiki nowiki (T45917)
  • 21:48 otto@tin: Finished deploy [eventstreams/deploy@a103be2]: (no message) (duration: 01m 32s)
  • 21:46 otto@tin: Starting deploy [eventstreams/deploy@a103be2]: (no message)
  • 21:24 bsitzmann@tin: Finished deploy [mobileapps/deploy@c39bd1f]: Update mobileapps to b43c5d6 (duration: 02m 55s)
  • 21:21 bsitzmann@tin: Starting deploy [mobileapps/deploy@c39bd1f]: Update mobileapps to b43c5d6
  • 21:15 smalyshev@tin: Finished deploy [wdqs/wdqs@3762556]: (no message) (duration: 02m 40s)
  • 21:13 smalyshev@tin: Starting deploy [wdqs/wdqs@3762556]: (no message)
  • 21:01 smalyshev@tin: Finished deploy [wdqs/wdqs@3762556]: (no message) (duration: 00m 59s)
  • 21:00 smalyshev@tin: Starting deploy [wdqs/wdqs@3762556]: (no message)
  • 20:49 madhuvishy: adding temporary IP tables rule on labservices1001 to drop traffic from toolchecker for tests (T152369)
  • 20:39 mutante: rolling out exim4 upgrades (DSA 3747-1) on mw-canary hosts
  • 20:36 mutante: rolling out exim4 upgrades (DSA 3747-1) on stat* and kubernetes hosts
  • 20:05 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.7
  • 19:09 thcipriani@tin: Synchronized portals: SWAT: Bumping portal to master T128546 (duration: 00m 42s)
  • 19:08 thcipriani@tin: Synchronized portals/prod/wikipedia.org/assets: SWAT: Bumping portal to master T128546 (duration: 00m 43s)
  • 18:30 mutante: rolling out exim4 upgrades (DSA 3747-1) on misc servers
  • 18:29 ejegg: enabled payment processor audit parser jobs
  • 17:47 matt_flaschen: Ran manual DB update to officewiki for T153320.
  • 15:20 zeljkof: EU SWAT finished
  • 15:19 hashar@tin: Synchronized wmf-config/throttle.php: (no message) (duration: 00m 41s)
  • 15:03 zeljkof: extending eu swat until 330392 is merged
  • 14:56 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Throttle rules for 2017-01-06/07, tewiki (T154568) (duration: 00m 40s)
  • 14:32 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set valid content language for Norwegian wikis (T126146) (duration: 00m 41s)
  • 14:20 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert $wgMFEEditorOptions[anonymousEditing] = false for kowiki (T119823) (duration: 00m 41s)
  • 11:18 jynus: continuing maintenance on db1035 (mysql replication stopped)
  • 10:12 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2060 - T154031 (duration: 00m 47s)
  • 07:43 marostegui: Compressing tables on db1015 - T153739
  • 07:24 marostegui: Compressing more tables on db1044 - T153826
  • 03:10 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jan 4 03:10:08 UTC 2017 (duration 5m 33s)
  • 03:05 bd808@tin: Synchronized wmf-config/throttle.php: Add throttle rules for January 2017 events in Maharashtra (T154312) (duration: 00m 42s)
  • 03:04 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 13m 05s)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.6) (duration: 11m 17s)
  • 01:24 maxsem@tin: Synchronized php-1.29.0-wmf.6/extensions/CentralAuth: https://gerrit.wikimedia.org/r/#/c/330345/ (duration: 00m 44s)
  • 01:03 maxsem@tin: Synchronized php-1.29.0-wmf.7/extensions/Flow: https://gerrit.wikimedia.org/r/#/c/330338/ (duration: 00m 58s)
  • 00:42 foks: removed 2fa for account per T154171
  • 00:33 maxsem@tin: Synchronized wmf-config/unitConversionConfig.json: https://gerrit.wikimedia.org/r/#/c/327907/5 (duration: 00m 40s)
  • 00:13 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/330264/3 (duration: 00m 41s)
  • 00:06 eileen: update civicrm from f78c894 to b26844d

2017-01-03

  • 23:57 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.7
  • 23:48 thcipriani@tin: Finished scap: testwiki to php-1.29.0-wmf.7 and rebuild l10n cache (duration: 49m 50s)
  • 23:19 ostriches: gerrit: quick restart of services
  • 22:58 thcipriani@tin: Started scap: testwiki to php-1.29.0-wmf.7 and rebuild l10n cache
  • 22:55 chasemp: iptables block of tools-checker-01 to debug DNS SPoF
  • 22:44 maxsem@tin: Synchronized php-1.29.0-wmf.7/extensions/Kartographer: https://gerrit.wikimedia.org/r/#/c/330321/ (duration: 00m 42s)
  • 22:38 maxsem@tin: Synchronized php-1.29.0-wmf.6/extensions/Kartographer: https://gerrit.wikimedia.org/r/#/c/330322/ (duration: 00m 42s)
  • 22:11 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: mapframe on fr: and fi: https://gerrit.wikimedia.org/r/#/c/330311/3 (duration: 00m 41s)
  • 21:00 gehel@tin: Finished deploy [wdqs/wdqs@a25d3aa]: (no message) (duration: 00m 55s)
  • 20:59 gehel@tin: Starting deploy [wdqs/wdqs@a25d3aa]: (no message)
  • 20:57 otto@tin: Finished deploy [eventstreams/deploy@9095b4e]: (no message) (duration: 11m 15s)
  • 20:45 otto@tin: Starting deploy [eventstreams/deploy@9095b4e]: (no message)
  • 20:18 gehel@tin: Finished deploy [wdqs/wdqs@cd7215c]: (no message) (duration: 04m 54s)
  • 20:13 gehel@tin: Starting deploy [wdqs/wdqs@cd7215c]: (no message)
  • 19:56 demon@tin: Synchronized multiversion/updateBranchPointers: Removing unused --dry-run option (duration: 00m 40s)
  • 19:52 thcipriani@tin: Synchronized multiversion: SWAT: Remove checkoutMediaWiki (duration: 00m 58s)
  • 19:13 thcipriani@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Add badge for "digitaldocument" in Wikibase T153186 (duration: 01m 33s)
  • 18:01 moritzm: installing fontconfig security updates
  • 17:44 mutante: terbium - Notice: /Stage[main]/Mediawiki::Maintenance::Generatecaptcha/Cron[generatecaptcha]/ensure: created(T150029)
  • 17:24 thcipriani: starting branch cut for 1.29.0-wmf.7
  • 17:04 mutante: iridium (phab) - apt-get clean ; find /var/log/account/ -mtime +10 -delete ; find /var/log/atop/ -mtime +10 -delete (T154407)
  • 16:59 thcipriani: enable l10nupdate cron post deployment-freeze
  • 16:49 mutante: iridium (phab) - reduce process accounting from 30 days to 10 days to save disk space used by /var/log/account, run /etc/cron.daily/acct (T154407)
  • 16:37 bd808: Updated scholarships to 1690808 on krypton; needed help from _joe_ to make trebuchet work
  • 15:58 _joe_: rolling restart of restbase on the production cluster
  • 15:49 _joe_: rolling restart of restbase on the test cluster
  • 14:53 akosiaris: reenabling ntpd on the restbase in eqiad
  • 14:43 gehel: upgrade liblogstash-gelf on elastic* - T150408
  • 14:35 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: [2/2] Add HD logos for multiple wikis - T150618 (duration: 00m 40s)
  • 14:33 hashar@tin: Synchronized static/images/project-logos: [1/2] Add HD logos for multiple wikis - T150618 (duration: 00m 40s)
  • 14:29 akosiaris: reenabling ntpd on the rest of the boxes. Leaving restbase only out for last
  • 14:21 hashar@tin: Synchronized wmf-config/throttle.php: New rules + remove obsolete rules T154245 (duration: 00m 40s)
  • 14:19 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable mapframe for nowiki T154021 (duration: 00m 39s)
  • 14:17 akosiaris: reenabling ntpd on aqs boxes
  • 14:17 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Set sortPrepend for gdwiki T153900 (duration: 00m 40s)
  • 14:16 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable SandboxLink on ruwiki T153855 (duration: 00m 40s)
  • 14:10 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable subpages in NS0 for arbcom_cswiki - T154247 (duration: 00m 40s)
  • 14:08 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add new page protection level on etwiki - T153465 (duration: 00m 53s)
  • 14:08 akosiaris: reenabling ntpd on maps, maps-test boxes
  • 14:07 akosiaris: reenabling ntpd on kafka boxes
  • 13:37 akosiaris: reenabling ntpd on elastic boxes
  • 13:27 akosiaris: reenabling ntpd on rdb boxes
  • 13:22 akosiaris: reenabling ntpd on conf boxes
  • 13:22 akosiaris: reenabling ntpd on es* boxes
  • 13:15 akosiaris: reenabling ntpd on scb* boxes
  • 13:09 moritzm: uploaded Linux 4.4.39 for jessie-wikimedia to carbon
  • 13:09 akosiaris: reenabling ntpd on mc* boxes
  • 13:01 akosiaris: reenabling ntpd on ms-fe boxes
  • 13:00 moritzm: installing libgd security updates
  • 12:55 akosiaris: reenabling ntpd on ms-be boxes
  • 12:52 akosiaris: reenabling ntpd on lvs boxes
  • 12:48 akosiaris: reenabling ntpd on analytics boxes
  • 12:41 moritzm: installing python security updates
  • 12:12 moritzm: installing squid security updates
  • 12:07 akosiaris: reenabling ntpd on pc eqiad & codfw boxes
  • 12:06 akosiaris: reenabling ntpd on ganeti eqiad & codfw boxes
  • 11:54 akosiaris: reenabling ntpd on wtp eqiad boxes
  • 11:52 akosiaris: reenabling ntpd on logstash eqiad boxes
  • 11:51 akosiaris: reenabling ntpd on db* eqiad boxes
  • 11:46 akosiaris: reenabling ntpd on cobalt (gerrit)
  • 11:32 moritzm: installing tar security updates on trusty hosts
  • 11:27 gehel: upgrade liblogstash-gelf on deployment-elastic* - T150408
  • 11:16 gehel: upgrade liblogstash-gelf on relforge - T150408
  • 11:13 akosiaris: reenabling ntpd on db* codfw boxes
  • 11:07 akosiaris: reenabling ntpd on wtp codfw boxes
  • 10:59 akosiaris: reenabling ntpd on mw eqiad boxes
  • 10:53 jynus: stopping mysql replication on db1035 (depooled)
  • 10:50 akosiaris: reenabling ntpd on mw codfw boxes
  • 10:44 akosiaris: reenabling ntpd on eqiad cp boxes
  • 10:39 akosiaris: reenabling ntpd on codfw cp boxes
  • 10:14 akosiaris: start enabling ntpd again across the fleet. Starting with cp boxes on ulsfo and esams
  • 09:23 marostegui: stop MySQL dbstore2002 for maintenance - T151552
  • 09:10 marostegui: stop MySQL dbstore2001 for maintenance - T151552
  • 08:21 marostegui: Run optimize table on db1038 on all the revision,templatelinks and pagelinks tables - T154465
  • 08:00 marostegui: Run optimize table on a few large tables - db1015 - T153739
  • 07:58 elukey: chown www-data:www-data all the root:adm hhvm log files on mw codfw hosts (T132324)
  • 07:54 marostegui: Run optimize table on a few large tables - db1044 - T153826
  • 07:30 marostegui: Stop mysql db2048 and db2034 for maintenance - https://phabricator.wikimedia.org/T149553

2017-01-02

  • 23:09 hoo: Removed 2fa from an account, per T154450
  • 17:20 ema: iridium: removed /var/log/account/pacct.2[0-9].gz to free up more disk space
  • 16:05 ema: removing old kernels and kernel headers from iridium to free up some disk space
  • 13:24 elukey: powercycled mw1280, not pingable and mgmt console frozen

2017-01-01

  • 02:23 chasemp: labservices1001 'racadm serveraction hardreset'
  • 02:23 godog: reboot labservices1001, unresponsive on console and MCE/temperature alerts found on lithium
  • 00:56 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1286.eqiad.wmnet,service=apache2
  • 00:55 bd808: Restarted logstash on logstash1001 (T154388)
  • 00:46 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1286.eqiad.wmnet
  • 00:27 godog: dump core file and restart varnish-frontend on cp2026


Archives