Server Admin Log

From Wikitech

2016-12-07

  • 08:24 marostegui: Deploy ALTER table db2023 (codfw master) wikidatawiki.revision - T150644
  • 03:00 godog: bounce uwsgi-graphite-web on graphite1003, using a lot of memory
  • 02:58 godog: upload prometheus-node-exporter 0.13.0~rc.2 to carbon - T152580
  • 01:01 godog: dump debug and restart hhvm on mw1232
  • 00:49 ejegg: updated SmashPig from 36be698 to f143378
  • 00:48 maxsem@tin: Synchronized php-1.29.0-wmf.5/extensions/UploadWizard: https://gerrit.wikimedia.org/r/#/c/325625/ (duration: 00m 46s)
  • 00:45 ejegg: updated civicrm from 610364c to 85918f7
  • 00:04 godog: upgrade prometheus-varnish-exporter on cache boxes in codfw and eqiad - T150479

2016-12-06

  • 23:53 godog: upgrade prometheus-varnish-exporter on cache boxes in esams - T150479
  • 23:00 godog: upgrade grafana on labmon1001 - T152473
  • 22:43 ladsgroup@tin: Synchronized php-1.29.0-wmf.5/extensions/ORES/includes: Deploy gerrit:325624 (UBN! T152542) (duration: 00m 45s)
  • 22:43 Amir1: scap sync-dir php-1.29.0-wmf.5/extensions/ORES/includes/ 'Deploy gerrit:325624 (UBN! T152542)'
  • 22:32 ladsgroup@tin: Synchronized php-1.29.0-wmf.4/extensions/ORES/includes: Deploy gerrit:325624 (UBN! T152542) (duration: 00m 46s)
  • 22:32 Amir1: scap sync-dir php-1.29.0-wmf.4/extensions/ORES/includes/ 'Deploy gerrit:325624 (UBN! T152542)'
  • 21:21 hashar: CI: pushed a new Jessie image that is faster to boot, should slightly help the current load. T113342
  • 21:12 hashar: CI is working. A lot of changes caused it to reach the limit of queries Nodepool can make to wmflabs; the queue is being processed just fine, though.
  • 21:07 ebernhardson@tin: Synchronized php-1.29.0-wmf.5/extensions/PageImages: T152155: Add job queue option to PageImages initImageData maint script (duration: 00m 57s)
  • 21:06 ebernhardson@tin: Synchronized php-1.29.0-wmf.4/extensions/PageImages: T152155: Add job queue option to PageImages initImageData maint script (duration: 00m 46s)
  • 21:05 hashar: CI has been overloaded in some way since roughly 19:00 UTC; investigating.
  • 20:08 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.5
  • 19:58 ebernhardson@tin: Synchronized php-1.29.0-wmf.4/extensions/PageImages: T152155: Add job queue option to PageImages initImageData maint script (duration: 00m 45s)
  • 19:56 ebernhardson@tin: Synchronized php-1.29.0-wmf.5/extensions/PageImages: T152155: Add job queue option to PageImages initImageData maint script (duration: 00m 45s)
  • 19:51 cmjohnson1: asw2-d swapping cable fpc2 <-> fpc5 (paravoid)
  • 19:44 demon@tin: Finished scap: testwiki to wmf.5 to bootstrap (duration: 48m 58s)
  • 19:08 awight: update civicrm from 36a49c5 to d20c1c4
  • 18:55 demon@tin: Started scap: testwiki to wmf.5 to bootstrap
  • 15:07 zeljkof: EU SWAT finished
  • 15:07 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Re-enable centralauth-rename rights for when maintenance is done (T148242 T151155) (duration: 00m 43s)
  • 15:06 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Re-enable centralauth-rename rights for when maintenance is done (T148242 T151155) (duration: 00m 43s)
  • 14:49 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: [cirrus] enable BM25 on all but wikis with spaceless languages [step 1/3] (T152092) (duration: 00m 44s)
  • 14:49 zfilipin@tin: Synchronized tests/cirrusTest.php: SWAT: [cirrus] enable BM25 on all but wikis with spaceless languages [step 1/3] (T152092) (duration: 00m 43s)
  • 14:40 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Add a wiki configuration tag for configured language (T149755) (duration: 00m 47s)
  • 14:34 ema: removed varnish 4.1.3-1wm4 and varnishkafka 1.0.12-1 from experimental on carbon
  • 14:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create import sources list for hsbwiki (T152382) (duration: 00m 44s)
  • 14:29 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Disable wgAbuseFilterProfile at cswiki (T149899) (duration: 00m 44s)
  • 12:09 gehel: starting elasticsearch codfw cluster restart for Java 8 upgrade - T151325
  • 11:17 ema: varnish 4.1.4-1wm1 uploaded to carbon
  • 08:47 elukey: restarting hhvm on mw1285 (hhvm debug in /tmp/hhvm.100918.bt)
  • 05:47 aaron@tin: Synchronized wmf-config/CommonSettings.php: Turn off duplicate key reporting for parser cache (duration: 00m 46s)
  • 05:33 aaron@tin: Synchronized wmf-config/InitialiseSettings.php: Turn off duplicate key reporting for parser cache (2) (duration: 00m 44s)
  • 02:36 aaron@tin: Synchronized wmf-config/InitialiseSettings.php: Turn off duplicate key reporting for parser cache (duration: 02m 15s)
  • 02:12 aaron@tin: Synchronized wmf-config/InitialiseSettings.php: Added objectcache group (duration: 00m 58s)
  • 01:58 eileen1: CiviCRM update from 0f1e80b to 36a49c5
  • 01:00 maxsem@tin: Synchronized dblists/nowikidatadescriptiontaglines.dblist: https://gerrit.wikimedia.org/r/#/c/325366/2 (duration: 00m 43s)
  • 00:59 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/325366/2 (duration: 00m 44s)
  • 00:49 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/325370/ (duration: 00m 44s)
  • 00:49 awight: update fundraising-tools from 4398247 to 931c8cf
  • 00:35 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/325369/2 + https://gerrit.wikimedia.org/r/#/c/325447/ (duration: 00m 44s)
  • 00:16 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/325365/ (duration: 00m 45s)
  • 00:15 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/325365/ (duration: 00m 48s)
  • 00:14 maxsem@tin: Synchronized dblists/related-articles-footer-blacklisted-skins.dblist: https://gerrit.wikimedia.org/r/#/c/325365/ (duration: 00m 59s)

2016-12-05

  • 23:24 awight: update fundraising-tools from da80929 to 4398247
  • 21:59 eileen2: all dedupe jobs updated to have no-run window 2 hours earlier to reflect silverpop
  • 21:39 awight: Silverpop job pushed back to 0600 UTC
  • 21:05 mholloway-shell@tin: Finished deploy [mobileapps/deploy@ccc69fb]: Update mobileapps to 2fcd49d (duration: 01m 13s)
  • 21:04 mholloway-shell@tin: Starting deploy [mobileapps/deploy@ccc69fb]: Update mobileapps to 2fcd49d
  • 20:25 thcipriani@tin: Synchronized php-1.29.0-wmf.4/resources/lib/oojs-ui/oojs-ui-core.js: SWAT: OOjs UI: Backport I73f95965694ec7fb0fa9a474742286e1105e5c85 T151061 (duration: 00m 46s)
  • 20:23 thcipriani@tin: Synchronized php-1.29.0-wmf.4/vendor/oojs/oojs-ui/php/layouts/FieldsetLayout.php: SWAT: OOjs UI: Backport I73f95965694ec7fb0fa9a474742286e1105e5c85 T151061 (duration: 00m 46s)
  • 20:06 godog: run swift-thumb-stats to gather thumbnail stats on ms-fe1001
  • 19:49 demon@tin: Synchronized scap/plugins/clean.py: Clean all the things (duration: 00m 43s)
  • 19:45 thcipriani@tin: Synchronized wmf-config: SWAT: Turn off CirrusSearch interwiki load test T149740 PART III (duration: 00m 46s)
  • 19:44 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Turn off CirrusSearch interwiki load test T149740 PART II (duration: 00m 47s)
  • 19:41 thcipriani@tin: Synchronized wmf-config/CirrusSearch-production.php: SWAT: Turn off CirrusSearch interwiki load test T149740 PART I (duration: 00m 44s)
  • 19:19 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Add b/c for the $wgEchoConfig -> $wgEchoEventLoggingSchema rename in I2f9d5d111f (duration: 00m 47s)
  • 19:10 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Re-enable the Flow beta feature T138310 (duration: 00m 45s)
  • 18:41 jynus: stopping for a few minutes replication on db1048 to change dbstore1002's master
  • 18:23 gehel@tin: Finished deploy [wdqs/wdqs@2b1e1fd]: (no message) (duration: 01m 27s)
  • 18:22 gehel@tin: Starting deploy [wdqs/wdqs@2b1e1fd]: (no message)
  • 18:03 gehel: upgrading Wikidata query service to Java 8
  • 17:19 elukey: restarting hhvm on mw1268 (hhvm-debug in /tmp/hhvm.16827.bt.)
  • 17:16 elukey: restarting hhvm on mw1285 (hhvm-debug in /tmp/hhvm.140129.bt.)
  • 16:50 elukey: added nagios process check alarms for varnishkafka-statsv and varnishkafka-eventlogging on cache::text hosts
  • 16:12 jynus: reloading haproxy on dbproxy1011 to catch the master going back up
  • 16:08 marostegui: Stop mysql db2048 for maintenance - T149553
  • 16:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T148967 (duration: 00m 49s)
  • 16:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T148967 (duration: 00m 59s)
  • 15:25 marostegui: Set disk 32:2 as failed db1048 - T152411
  • 15:08 marostegui: db1048 - set disk 32:0 offline
  • 14:56 jynus: running CHANGE MASTER ON db2012 to base it on db1043
  • 14:43 hashar: European SWAT done
  • 14:42 marostegui: Restart MySQL labsdb1011 to disable parallel replication
  • 14:41 aude@tin: Synchronized wmf-config/Wikibase.php: Add interwiki sorting config from Wikibase (duration: 00m 47s)
  • 14:30 hashar@tin: Synchronized wmf-config/timeline.php: Drop ttf from $wgTimelineFontFile and bump epoch - T22825 (duration: 00m 47s)
  • 14:26 hashar@tin: Synchronized fonts: For T22825 (duration: 00m 47s)
  • 14:23 hashar@tin: Synchronized wmf-config/CommonSettings.php: Move EasyTimeline config to its own file - T22825 (duration: 00m 44s)
  • 14:21 hashar@tin: Synchronized wmf-config/timeline.php: Move EasyTimeline config to its own file - T22825 (duration: 00m 44s)
  • 14:15 hashar@tin: Synchronized wmf-config/abusefilter.php: Enable $wgAbuseFilterProfile for eswiki - T152087 (duration: 00m 44s)
  • 14:11 hashar@tin: Synchronized wmf-config/interwiki.php: Update interwiki map for fiwikivoyage - T152201 (duration: 00m 46s)
  • 14:08 elukey: depooling mw1239 for maintenance (T148421)
  • 13:50 addshore: ElectronPdfService extension deploy window finished
  • 13:48 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: {{gerrit|325298}} T150944 Disable ElectronPdfService extension on mediawikiwiki until messages are fixed (duration: 00m 45s)
  • 13:48 marostegui: Deploy alter table db1087 - dewiki.revision - T148967
  • 13:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 - T148967 (duration: 00m 44s)
  • 13:41 addshore@tin: Finished scap: Add ElectronPdfService to extensions-list, sync-l10n seems to have a bug. (Take 2) (duration: 53m 13s)
  • 13:04 marostegui: Stopping mysql labsdb1010 and labsdb1009 for maintenance - T152194
  • 12:47 addshore@tin: Started scap: Add ElectronPdfService to extensions-list, sync-l10n seems to have a bug. (Take 2)
  • 12:45 addshore@tin: scap aborted: Add ElectronPdfService to extensions-list, sync-l10n seems to have a bug (duration: 00m 47s)
  • 12:44 addshore@tin: Started scap: Add ElectronPdfService to extensions-list, sync-l10n seems to have a bug
  • 12:23 addshore@tin: Synchronized wmf-config/CommonSettings.php: {{gerrit|324487}} T150944 Enable ElectronPdfService extension on test wikis & mediawikiwiki PT2 (duration: 00m 44s)
  • 12:22 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: {{gerrit|324487}} T150944 Enable ElectronPdfService extension on test wikis & mediawikiwiki PT1 (duration: 00m 45s)
  • 12:18 addshore@tin: Synchronized php-1.29.0-wmf.4/extensions/ElectronPdfService/specials/SpecialElectronPdf.php: {{gerrit|324791}} Use prefixedDbKey when redirecting to Electron (duration: 00m 45s)
  • 11:26 Reedy: that was "Avoid using CONTENT_MODEL_FLOW_BOARD" for T152379
  • 11:26 reedy@tin: Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 46s)
  • 10:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 - T148967 (duration: 00m 57s)
  • 10:38 joal@tin: Finished deploy [analytics/refinery@2c3b78c]: (no message) (duration: 02m 19s)
  • 10:36 joal@tin: Starting deploy [analytics/refinery@2c3b78c]: (no message)
  • 10:27 gehel: enabling trace logging on indices recovery on elasticsearch codfw - T145065
  • 08:01 marostegui: Stop MySQL labsdb1010 - maintenance T152194
  • 07:48 marostegui: Deploy alter table db1082 - dewiki.revision - T148967
  • 07:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 - T148967 (duration: 02m 12s)
  • 02:14 godog: add --skip-ssl to mysql commands on eventlogging_sync on dbstore1002 - T152364
  • 02:12 godog: add --skip-ssl to mysql commands on eventlogging_sync on db1047 - T152364
  • 01:57 godog: move /var/log/eventlogging_sync.err to a symlink on /srv on db1047

2016-12-04

  • 20:09 eileen1: update civicrm from f8f9263 to 0f1e80b
  • 02:42 godog: powercycle labservices1001
  • 02:30 godog: silence checker.tools.wmflabs.org for 2h

2016-12-03

  • 18:44 krenair@tin: Synchronized wmf-config/mobile-labs.php: no-op in prod, this file is not loaded, for https://gerrit.wikimedia.org/r/#/c/325116/ (duration: 00m 45s)
  • 18:43 krenair@tin: Synchronized wmf-config/CommonSettings-labs.php: no-op in prod, this file is not loaded, for https://gerrit.wikimedia.org/r/#/c/325119/ (duration: 00m 45s)
  • 03:08 catrope@tin: Synchronized php-1.29.0-wmf.4/extensions/PageImages: SWAT: return any image, not just the non-free image (duration: 01m 31s)
  • 02:20 mutante: iridium - starting fresh rsync of /srv/repos over to phab2001 as backup
  • 02:19 mutante: phab2001 - deleted outdated contents of /srv/repos

2016-12-02

  • 23:33 godog: roll-restart pdfrender on scb after applying fonts.conf firejail whitelist
  • 22:41 mutante: restarting salt-minion on all analytics servers
  • 22:24 mutante: restarting salt-minion on all appservers (via debdeploy -s all-mw)
  • 22:19 mutante: restarting salt-minion on mw-canary
  • 21:05 mutante: scb2004 - depooling, restarting services, repooling
  • 20:42 godog: roll-restart pdfrender on scb1*
  • 20:41 mutante: scb2003 - depooling, restarting services, repooling
  • 20:39 godog: upgrade thumbor to 0.1.31 on thumbor100[12]
  • 20:31 mutante: scb2002 - depooling, restarting services, repooling
  • 20:18 mutante: scb2001 - depooling, restarting services, repooling
  • 20:03 mutante: scb1004 - depooling, restarting services, repooling
  • 19:35 mutante: scb1003 - depooling, restarting services, repooling
  • 19:31 dzahn@puppetmaster1001: conftool action : set/pooled=yes; selector: name=scb1002.eqiad.wmnet,service=apertium
  • 19:30 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=scb1002.eqiad.wmnet,service=apertium
  • 19:22 mutante: scb1002 - depooling, restarting services, repooling
  • 19:09 mutante: scb1001 - re-pooling all services
  • 19:06 mutante: scb1001 - restarting all (-oid) services
  • 19:04 mutante: depooling all services on scb1001 for service restart
  • 19:03 godog: rollback python-thumbor-wikimedia to 0.1.29
  • 18:48 godog: deploy thumbor 0.1.30 to thumbor100[12]
  • 17:18 mobrovac: restbase deploy end of 1651e35
  • 16:49 mobrovac: restbase deploy start of 1651e35
  • 15:39 mobrovac@tin: Finished deploy [changeprop/deploy@8f53dc6]: (no message) (duration: 00m 54s)
  • 15:38 mobrovac@tin: Starting deploy [changeprop/deploy@8f53dc6]: (no message)
  • 15:02 _joe_: rolling restart of API appservers to catch up with the new jemalloc arenas config T151702
  • 14:44 oblivian@tin: Synchronized php-1.29.0-wmf.4/api.php: Reverting the API block after template has been protected (duration: 00m 45s)
  • 14:33 marostegui: Deploy alter table wikidatawiki.revision in db2066 - T150644
  • 13:45 marostegui: Deploy alter table wikidatawiki.revision in db2059 - T150644
  • 13:23 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 (duration: 00m 45s)
  • 11:56 jynus: mysql restart for db1060 T152188
  • 11:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Really depool db1060 && pool db1074 with full load after warmup (duration: 00m 44s)
  • 11:40 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 (duration: 00m 45s)
  • 11:39 marostegui: Stop MySQL db1095 for maintenance - T150802
  • 11:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1071 - T148967 (duration: 00m 44s)
  • 11:11 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074 with low load (duration: 00m 45s)
  • 11:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1071 - T148967 (duration: 00m 45s)
  • 10:36 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2045 - T150644 (duration: 00m 44s)
  • 10:34 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1076 with full load (duration: 00m 44s)
  • 10:30 marostegui: Deploy alter table wikidatawiki.revision in db2052 - T150644
  • 10:28 jynus: mysql restart and upgrade for db1074 T152188
  • 10:09 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 (duration: 00m 45s)
  • 09:52 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1076 with low load (duration: 00m 49s)
  • 09:18 jynus: mysql restart and upgrade for db1076 T152188
  • 09:18 marostegui: Deploy alter table wikidatawiki.revision in db2045 - T150644
  • 09:16 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2045 - T150644 (duration: 00m 47s)
  • 09:03 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 (duration: 00m 48s)
  • 08:43 oblivian@tin: Synchronized php-1.29.0-wmf.4/api.php: API bandaid (duration: 00m 48s)
  • 08:37 elukey: restarting hhvm (/usr/local/bin/restart-hhvm) on G@cluster:api_appserver and G@site:eqiad (batch 10%)
  • 07:59 marostegui: Deploy alter table db1071 - dewiki.revision - T148967
  • 07:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1071 - T148967 (duration: 02m 22s)
  • 05:55 mutante: thumbor - also got upgraded to imagemagick deb8u6+wmf1
  • 05:49 mutante: imagescalers - upgraded imagemagick 8:6.8.9.9-5+deb8u5+wmf1 -> 8:6.8.9.9-5+deb8u6+wmf1 (https://www.debian.org/security/2016/dsa-3726)
  • 03:39 mutante: mw1293 - upgrade imagemagick to 8:6.8.9.9-5+deb8u6+wmf1
  • 00:47 ebernhardson@tin: Synchronized php-1.29.0-wmf.4/extensions/PageImages/maintenance/initImageData.php: T152155: Maintenance script updates for re-initializing page images (duration: 00m 44s)
  • 00:41 demon@tin: Synchronized wmf-config: Removing some old ExtensionMessages files (duration: 00m 47s)
  • 00:30 aaron@tin: Synchronized wmf-config/CommonSettings.php: Bump $wgJobBackoffThrottling for cache purges (duration: 00m 45s)
  • 00:20 ebernhardson@tin: Synchronized php-1.29.0-wmf.4/extensions/PageImages/includes/ApiQueryPageImages.php: T152155: Thumbnails are not showing in search on multiple platforms (duration: 00m 45s)
  • 00:00 bsitzmann@tin: Finished deploy [mobileapps/deploy@b545699]: Update mobileapps to 04a6e84 (duration: 01m 17s)

2016-12-01

  • 23:58 bsitzmann@tin: Starting deploy [mobileapps/deploy@b545699]: Update mobileapps to 04a6e84
  • 22:12 godog: upgrade scap to 3.4.1-1 on tin and mira
  • 22:07 Krenair: Dropped ar_usertext_timestamp indexes from the archive tables of all 4 wikis created since September (olowiki, ecwikimedia, projectcomwiki, fiwikivoyage) - replaced with usertext_timestamp index to match all older wikis. MW fix to follow. see -tech
  • 21:15 ottomata: rolling bounce of main kafka brokers and then eventbus service to pick up api_version change, and to apply min.insync.replicas=1 to kafka
  • 21:01 ottomata: bouncing kafka broker on kafka1002 to troubleshoot production only missing messages
  • 20:55 otto@tin: Finished deploy [eventlogging/eventbus@948765d]: accept api_version parameter (duration: 00m 09s)
  • 20:55 otto@tin: Starting deploy [eventlogging/eventbus@948765d]: accept api_version parameter
  • 20:53 otto@tin: Finished deploy [eventlogging/eventbus@948765d]: (no message) (duration: 00m 08s)
  • 20:53 otto@tin: Starting deploy [eventlogging/eventbus@948765d]: (no message)
  • 20:27 ottomata: bouncing kafka broker on kafka1018 to test config changes to eventlogging analytics kafka clients
  • 20:03 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.4
  • 19:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool pc1006 (duration: 00m 46s)
  • 19:48 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2092.codfw.wmnet
  • 19:45 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool pc1006 (duration: 00m 44s)
  • 19:34 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool pc1005 (duration: 00m 46s)
  • 19:26 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool pc1005 (duration: 00m 45s)
  • 19:17 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool pc1004 (duration: 00m 44s)
  • 19:08 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool pc1004 (duration: 00m 45s)
  • 18:49 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1019 (duration: 00m 45s)
  • 18:20 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1015 (duration: 00m 45s)
  • 18:15 jynus: mysql restart and general upgrade for pc2006 T152029
  • 18:02 jynus: mysql restart and general upgrade for pc2005 T152029
  • 17:23 mobrovac@tin: Finished deploy [electron-render/deploy@d6f7044]: (no message) (duration: 01m 02s)
  • 17:22 mobrovac@tin: Starting deploy [electron-render/deploy@d6f7044]: (no message)
  • 17:15 jynus: mysql restart and general upgrade for pc2004 T152029
  • 17:05 paravoid: cr1-eqiad: re-enabling ae4 and its members (links to asw2-d-eqiad)
  • 17:01 otto@tin: Finished deploy [eventlogging/analytics@948765d]: (no message) (duration: 00m 03s)
  • 17:01 otto@tin: Starting deploy [eventlogging/analytics@948765d]: (no message)
  • 16:58 jynus: mysql restart for es1019 T151995
  • 16:54 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1019 (duration: 00m 45s)
  • 16:29 chasemp: labsdb: maintain-views --databases fiwikivoyage --debug
  • 16:19 jynus: mysql restart for es1015 T151995
  • 16:12 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1018; depool es1015 (duration: 01m 00s)
  • 15:56 paravoid: rebooting asw2-d-eqiad again
  • 15:51 marostegui: Deploy alter table wikidatawiki.revision in dbstore2002 - T150644
  • 15:38 marostegui: Stopping mysql and shutting down db2048 for maintenance - T149553
  • 15:08 marostegui: DNS change for es2 and es3 after the master switchovers - T151995
  • 15:06 gehel: upgrading logstash to Java 8, including rolling restart - T151325
  • 14:52 hashar: Nodepool / CI are processing again
  • 14:48 jynus@tin: Synchronized wmf-config/db-eqiad.php: switchover es3 master (eqiad) es1019 -> es1014 (duration: 00m 44s)
  • 14:46 elukey: restarting kafka on kafka100[123] (EventBus) for openjdk upgrades
  • 14:39 zeljkof: EU SWAT finished
  • 14:38 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow contentadmin and sysop to add/remove autopatrolled users on Wikitech (duration: 00m 50s)
  • 14:35 mobrovac: restbase deployed 91551bf
  • 14:19 elukey: restarting kafka also on kafka2003
  • 14:17 elukey: restarting kafka on kafka200[12] for openjdk upgrades
  • 14:14 paravoid: rebooting asw2-d-eqiad
  • 13:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1070 - T148967 (duration: 00m 45s)
  • 13:52 jynus: changing es3 eqiad replication topology in preparation for master switchover
  • 13:47 hashar: Nodepool is out of instances due to OpenStack API spurting a nova.exception.ImageNotAuthorized HTTP 500
  • 13:21 paravoid: Upgrading asw2-d-eqiad to JunOS 15.1R5 (T133387)
  • 13:19 paravoid: cr1-eqiad: setting ae4 and its members (links to asw2-d-eqiad) to disable
  • 12:49 reedy@tin: Synchronized php-1.29.0-wmf.3/api.php: Remove oris bandaid T151702 (duration: 00m 46s)
  • 10:39 ariel@tin: Finished deploy [dumps/dumps@a3801fa]: second try on db_user fixup (duration: 00m 01s)
  • 10:39 ariel@tin: Starting deploy [dumps/dumps@a3801fa]: second try on db_user fixup
  • 10:25 elukey: removed --debug flag to the puppet compiler output
  • 10:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: switchover es2 master (eqiad) es1015 -> es1011 (duration: 00m 45s)
  • 09:54 elukey: added --debug to the puppet compiler options in Jenkins
  • 09:31 jynus: changing es2 eqiad replication topology in preparation for master switchover
  • 09:14 jynus: mysql restart and upgrade for es1018 T151995
  • 09:06 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1018 (duration: 00m 48s)
  • 08:57 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1014 (duration: 00m 45s)
  • 08:51 ariel@tin: Finished deploy [dumps/dumps@2b35e77]: less logging, fix regression for db_user/password retrieval (duration: 00m 03s)
  • 08:50 ariel@tin: Starting deploy [dumps/dumps@2b35e77]: less logging, fix regression for db_user/password retrieval
  • 08:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1011 (duration: 00m 48s)
  • 08:10 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1016 (duration: 00m 45s)
  • 07:57 elukey@tin: Finished deploy [analytics/pivot/deploy@0513a6e]: (no message) (duration: 00m 02s)
  • 07:57 elukey@tin: Starting deploy [analytics/pivot/deploy@0513a6e]: (no message)
  • 07:51 jynus: mysql restart for es1014 T151995
  • 07:50 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1014 (duration: 00m 44s)
  • 07:38 jynus: mysql restart for es1011 T151995
  • 07:37 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1011 (duration: 00m 48s)
  • 07:32 marostegui: Deploy alter table db1070 - dewiki.revision - T148967
  • 07:24 marostegui: Stop replication db1095 (sanitarium2) on s3 instance - T150802
  • 07:22 jynus: mysql upgrade and restart for es1016 T151995
  • 07:12 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1016 (duration: 00m 45s)
  • 07:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1070 - T148967 (duration: 00m 48s)
  • 07:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1070 - T148967 (duration: 02m 31s)
  • 06:50 marostegui: Deploy alter table wikidatawiki.revision in codfw hosts only - T150644
  • 06:42 akosiaris: performed apt-get clean and minor log file cleanup on silver
  • 01:48 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Set $wgUserEmailUseReplyTo = true; on group0 wikis - T66795 (duration: 00m 46s)
  • 01:20 hoo: Ran "CREATE TABLE wbc_entity_usage LIKE dewikivoyage.wbc_entity_usage;" for fiwikivoyage on db1075 (s3 master) (Related: T151570)
  • 01:15 krinkle@tin: Synchronized php-1.29.0-wmf.4/extensions/CentralNotice/extension.json: I0224288 (duration: 00m 45s)
  • 01:14 krinkle@tin: Synchronized php-1.29.0-wmf.4/extensions/Citoid/extension.json: I022428 (duration: 00m 46s)

2016-11-30

  • 22:55 ariel@tin: Finished deploy [dumps/dumps@04a57c5]: (no message) (duration: 00m 01s)
  • 22:55 ariel@tin: Starting deploy [dumps/dumps@04a57c5]: (no message)
  • 22:42 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1017 (duration: 00m 45s)
  • 22:16 Krenair: Ran the dumpInterwiki.php script but it just produced the existing data, so nothing to do there
  • 22:13 Krenair: Ran mwscript extensions/WikimediaMaintenance/filebackend/setZoneAccess.php fiwikivoyage --backend=local-multiwrite
  • 22:12 krenair@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/323695/ (duration: 00m 49s)
  • 22:11 krenair@tin: rebuilt wikiversions.php and synchronized wikiversions files: https://gerrit.wikimedia.org/r/#/c/323695/
  • 22:11 krenair@tin: Synchronized dblists: https://gerrit.wikimedia.org/r/#/c/323695/ (duration: 00m 49s)
  • 22:08 bsitzmann@tin: Finished deploy [mobileapps/deploy@d004bb4]: mobileapps deployment: 'Update service-mobileapp-node to 14deac7' (duration: 01m 09s)
  • 22:07 mutante: re-enabling puppet on einsteinium, starting ircecho
  • 22:06 bsitzmann@tin: Starting deploy [mobileapps/deploy@d004bb4]: mobileapps deployment: 'Update service-mobileapp-node to 14deac7'
  • 22:03 Krenair: Ran mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki fi wikivoyage fiwikivoyage fi.wikivoyage.org
  • 21:36 mutante: phab/iridium: deleting tmp files older than 2 weeks
  • 21:17 mutante: temp. stopping ircecho
  • 20:55 milimetric@tin: Finished deploy [analytics/refinery@9cd8845]: (no message) (duration: 03m 02s)
  • 20:52 milimetric@tin: Starting deploy [analytics/refinery@9cd8845]: (no message)
  • 20:24 ejegg: re-enabled CiviMail bounce fetching job
  • 20:14 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.4
  • 20:11 ejegg: enabled CiviMail record creation at 100% for thank you letters
  • 20:07 jynus: mysql restart and general upgrade for es1017 T151995
  • 19:50 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1013; depool es1017 (duration: 00m 45s)
  • 19:39 thcipriani@tin: Synchronized php-1.29.0-wmf.4/extensions/ORES/modules/ext.ores.styles.css: SWAT: Use darker shade of yellow (duration: 00m 45s)
  • 19:34 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add "softest" values for ores T150224 (duration: 00m 46s)
  • 19:04 demon@tin: Synchronized php-1.29.0-wmf.4/includes/specials/SpecialUserrights.php: Ia0e583a5 (duration: 00m 45s)
  • 18:43 jynus: mysql restart and general upgrade for es1013 T151995
  • 18:39 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1013 (duration: 00m 45s)
  • 18:27 mutante: last log message was about "labtestnet2001" not "labnet2001"
  • 18:25 mutante: labnet2001 - ran low on disk, gzipped large /var/log/upstart/nova-api.log.1 / apt-get clean
  • 18:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1012 (duration: 00m 46s)
  • 17:45 demon@tin: Synchronized wmf-config/CommonSettings.php: extdist stuffs (duration: 00m 46s)
  • 17:20 jynus: mysql restart and general upgrade for es1012 T151995
  • 17:02 mutante: gerrit restarting to disable gc (config change 323655)
  • 17:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1012 (duration: 00m 53s)
  • 16:18 _joe_: rolling upgrade of HHVM on the jobrunner, terbium/tin/wasat/mira
  • 16:12 jynus: mysql restart and general upgrade for es2018 T151995
  • 15:50 jynus: mysql restart and general upgrade for es2016 T151995
  • 15:44 jynus: stopping for 24 hours cross-dc replication on shards es2,es3 codfw->eqiad (es1015, es1019)
  • 15:29 jynus: mysql restart and general upgrade for es2013 T151995
  • 15:14 _joe_: upgrading HHVM on the imagescalers
  • 15:11 jynus: mysql restart and general upgrade for es2012 T151995
  • 14:44 marostegui: Stop MySQL and shutdown db2048 for maintenance - T149553
  • 14:44 Dereckson: EU SWAT done
  • 14:42 jynus: mysql restart and general upgrade for es2011 T151995
  • 14:42 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Add WMF staff local groups to $wmgPrivilegedGroups (T150951) (duration: 00m 46s)
  • 14:40 dereckson@tin: Synchronized php-1.29.0-wmf.4/extensions/ContentTranslation/modules/tools/ext.cx.tools.template.js: Allow template editor even if parameter mapping fails completely (T151868) (duration: 00m 45s)
  • 14:23 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Add dashboard.wikiedu.org IPv6 to en.wikipedia rate limit exempt (T151823) (duration: 00m 45s)
  • 14:17 dereckson@tin: Synchronized static/images/project-logos: New project logos for wiki to create (arbcom cs, fi.wikivoyage) (duration: 00m 46s)
  • 14:06 jynus: mysql restart and general upgrade for es2014 T151995
  • 13:55 Dereckson: Reset user email for projectcomwiki initial account "Mjohnson (WMF)"
  • 13:15 jynus: mysql restart and general upgrade for es2017 T151995
  • 13:10 reedy@tin: Synchronized php-1.29.0-wmf.4/extensions/CentralAuth/maintenance/populateLocalAndGlobalIds.php: More skipping (duration: 00m 44s)
  • 13:09 reedy@tin: Synchronized php-1.29.0-wmf.3/extensions/CentralAuth/maintenance/populateLocalAndGlobalIds.php: More skipping (duration: 01m 34s)
  • 13:08 jynus: mysql restart and general upgrade for es2019 T151995
  • 12:55 ema: bumping vsl log buffer on cp3032 (depooled) -- T151643
  • 12:29 jynus: mysql restart and general upgrade for es2015 T151995
  • 11:44 _joe_: upgrading HHVM across appservers in eqiad
  • 11:28 ariel@tin: Finished deploy [dumps/dumps@50689c8]: (no message) (duration: 00m 07s)
  • 11:28 ariel@tin: Starting deploy [dumps/dumps@50689c8]: (no message)
  • 11:22 _joe_: repooling mw1276, after tests for T151702
  • 11:16 marostegui: Stop replication s3 - db1095 - maintenance - T147052
  • 11:07 _joe_: rolling upgrade of hhvm on the eqiad api cluster
  • 09:42 marostegui: Stop mysql on db2048 for maintenance - https://phabricator.wikimedia.org/T149553
  • 08:44 _joe_: stopped dedicated commonswiki jobrunner T151196
  • 07:17 marostegui: Deploy alter table dbstore1002 - dewiki.revision - T148967
  • 07:06 marostegui: Stop mysql db2048 maintenance - T149553
  • 02:07 hoo: Updated Wikidata's property suggester with data from Monday's json dump and applied the T132839 workarounds
  • 00:53 kaldari@tin: Synchronized wmf-config/InitialiseSettings.php: sync InitialiseSettings to test cookie blocking on Test Wikipedia (duration: 00m 45s)
  • 00:52 TimStarling: on mw1276: tuning jemalloc, will restart hhvm several times, running it in a terminal
  • 00:40 mutante: phab2001 - enabled puppet to bring it up to date with various changes

2016-11-29

  • 23:41 ejegg: updated fundraising dashboard from 43039fd to 7ce4f03
  • 23:04 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: prod no-op (duration: 00m 46s)
  • 22:58 demon@tin: Synchronized php-1.29.0-wmf.4/includes/debug/logger/monolog: logging fixes (duration: 00m 45s)
  • 22:56 demon@tin: Synchronized php-1.29.0-wmf.4/autoload.php: logging fixes (duration: 00m 45s)
  • 22:49 ejegg: extended donation queue consumer duty cycle from 90 to 105 seconds
  • 22:33 ejegg: disabled CiviMail activity creation for thank you sender
  • 22:23 demon@tin: Finished scap: Ok, back to normal now (duration: 26m 14s)
  • 21:57 demon@tin: Started scap: Ok, back to normal now
  • 21:55 demon@tin: scap aborted: probably won't work, no-op (duration: 01m 03s)
  • 21:54 demon@tin: Started scap: probably won't work, no-op
  • 21:48 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 back to wmf.3
  • 21:47 demon@tin: scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="cawikibooks" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.cPclJsf3pO" ' returned non-zero exit status 139 (duration: 00m 23s)
  • 21:47 demon@tin: Started scap: mw.org back to wmf.3
  • 21:44 demon@tin: scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="cawikibooks" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.v5ELy9acbV" ' returned non-zero exit status 139 (duration: 00m 23s)
  • 21:44 demon@tin: Started scap: rebuild l10n for wmf.4 -- attempt #3
  • 21:42 demon@tin: scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="cawikibooks" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.6ehbXfKDfD" ' returned non-zero exit status 139 (duration: 00m 23s)
  • 21:42 demon@tin: Started scap: rebuild l10n for wmf.4 -- attempt #2
  • 21:41 demon@tin: scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="cawikibooks" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.wDhBxhXGtC" ' returned non-zero exit status 139 (duration: 00m 57s)
  • 21:41 demon@tin: Started scap: rebuild l10n for wmf.4
  • 21:40 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: helps to actually sync updated files
  • 21:39 mutante: servermon - manually running "make_updates" command from cron for debugging - failed with a mysql_exception, lock wait timeout exceeded
  • 21:21 mutante: phab2001 - upgrading scap and other packages (we need to get puppet running here again)
  • 20:30 demon@tin: Finished scap: wmf.4 for fun and profit (duration: 20m 55s)
  • 20:26 awight: new thank-you letter deployed...
  • 20:23 awight: update civicrm from 8c76f43 to f8f9263
  • 20:09 demon@tin: Started scap: wmf.4 for fun and profit
  • 20:07 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: wikitech cloudadmin: remove right that no longer exists (duration: 00m 45s)
  • 20:05 twentyafterfour: deploying D478 (refs T151844 )
  • 20:05 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow on all namespaces on meta (T150245) (duration: 00m 44s)
  • 20:00 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Allow a wiki to use and in all namespaces PART II (duration: 00m 48s)
  • 19:58 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow a wiki to use and in all namespaces PART I (duration: 00m 51s)
  • 19:44 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add autopatrolled group for wikitech (duration: 00m 45s)
  • 19:30 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add "global-renamer" to the list of privileged wiki groups T150951 (duration: 00m 45s)
  • 19:23 thcipriani@tin: Synchronized wmf-config/abusefilter.php: SWAT: Set "abusefilter-modify-global" to stewards locally at Meta-Wiki T150752 (duration: 00m 45s)
  • 19:15 thcipriani@tin: Synchronized wmf-config/flaggedrevs.php: SWAT: Remove FlaggedRevs autopromotion function at eowiki T150591 (duration: 01m 37s)
  • 18:02 marostegui: Stopping replication db1095 (new sanitarium, not in use) on s1 instance for maintenance - T150802
  • 17:36 otto@tin: Finished deploy [analytics/pivot/deploy@0513a6e]: (no message) (duration: 00m 08s)
  • 17:36 otto@tin: Starting deploy [analytics/pivot/deploy@0513a6e]: (no message)
  • 16:49 ema: setting gethdr_extrachance=0 on all cp* hosts T150247
  • 16:31 mutante: upgrading libicu on mc1020-1036
  • 16:25 ema: doubling workspace_backend on all cp* hosts
  • 15:59 marostegui: Temporarily stop MySQL on db2070 for maintenance - T149553
  • 15:30 marostegui: Stop mysql and shutdown db2048 and db2034 for maintenance - T149553
  • 14:47 marostegui: Deploy alter table dbstore1001 - dewiki.revision - T148967
  • 14:35 reedy@tin: Synchronized php-1.29.0-wmf.3/api.php: Resync after making into gerrit commit (duration: 00m 45s)
  • 14:19 gehel@tin: Finished deploy [kartotherian/deploy@f3805c4]: (no message) (duration: 20m 49s)
  • 13:59 gehel@tin: Starting deploy [kartotherian/deploy@f3805c4]: (no message)
  • 13:54 gehel: deploying kartotherian config with scap3 - T150021
  • 13:46 _joe_: rolling restart of HHVM in the api cluster
  • 13:44 reedy@tin: Synchronized php-1.29.0-wmf.3/api.php: Redeploy ori bandaid for T151702 (duration: 00m 44s)
  • 13:43 hashar: updating jouncebot so it properly reclaims its nick ( T150916 https://gerrit.wikimedia.org/r/#/c/324025/ )
  • 13:27 Reedy: deleted oathauth row on wikitech for user Shizhao per T144805
  • 13:14 jynus: restarting db1095's mysql for T151752
  • 12:05 elukey: complete rolling restart of apache in eqiad
  • 11:48 elukey: re-enable puppet on mw1* hosts and apply Apache config change (https://gerrit.wikimedia.org/r/#/c/314519)
  • 11:23 elukey: disabled puppet on mw1* hosts as pre-step for https://gerrit.wikimedia.org/r/#/c/314519
  • 10:53 _joe_: restarting pybal on low-traffic codfw
  • 10:50 _joe_: restarting pybal on low-traffic eqiad
  • 10:40 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: cluster=scb,service=pdfrender
  • 10:02 _joe_: upgrading firejail on all other scb servers in eqiad
  • 09:48 _joe_: upgrading firejail on scb1004, restarting all dependent services
  • 06:41 marostegui: Stopping replication db1095 - s1 instance for maintenance - T150802
  • 06:37 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2048 - T149553 (duration: 00m 46s)
  • 03:09 demon@tin: Synchronized docroot/foundation: rm old legalcode junk (duration: 01m 33s)
  • 03:03 mutante: druid1001 - restarting all druid services
  • 02:09 mutante: rolling restart of hhvm service across eqiad
  • 01:54 mutante: terbium, mw1260: purging libicu48 package in 'rc' status
  • 01:52 mutante: deploying icu upgrade on all eqiad mw servers
  • 01:46 mutante: restarting hhvm service across codfw
  • 01:45 demon@tin: Synchronized wmf-config/CommonSettings.php: fix implicit message parsing error (duration: 00m 48s)
  • 01:44 thcipriani@tin: Synchronized README: scap 3.4 sync file (duration: 00m 53s)
  • 01:42 godog: upgrade prometheus-varnish-exporter on cache boxes in ulsfo T150479
  • 01:34 mutante: mw2152 - remove libicu48 (for some reason this one host was different from all the others)
  • 01:27 godog: upgrade prometheus-varnish-exporter on cache_misc/ulsfo T150479
  • 01:24 mutante: rolling out security upgrades for libicu52 [DSA 3725-1] (CVE-2014-9911 CVE-2015-2632 CVE-2015-4844 CVE-2016-0494 CVE-2016-6293 CVE-2016-7415)
  • 01:17 demon@tin: Synchronized w/MWVersion.php: bleh, still used (duration: 00m 47s)
  • 01:05 demon@tin: Synchronized w/: remove old MWVersion entry point (duration: 00m 47s)
  • 01:04 demon@tin: Synchronized rpc/RunJobs.php: more relative mwversion stuff (duration: 00m 43s)
  • 01:03 demon@tin: Synchronized multiversion/: use MWVersion relative path (duration: 00m 59s)
  • 01:02 demon@tin: Synchronized scap/plugins/clean.py: typofix (duration: 00m 43s)
  • 00:42 maxsem@tin: Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/315985/ (duration: 00m 51s)
  • 00:33 maxsem@tin: Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/314748/ (duration: 00m 46s)
  • 00:20 mutante: re-enabled icinga notifications for labtest* services (first double checked they are _not_ paging anymore) (T120047)
  • 00:17 demon@tin: Finished scap: moving some stuff around, pruned old branches too (duration: 20m 29s)

2016-11-28

  • 23:56 demon@tin: Started scap: moving some stuff around, pruned old branches too
  • 23:33 demon@tin: Synchronized w/: (no message) (duration: 00m 48s)
  • 23:27 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Beta Cluster only (duration: 00m 44s)
  • 23:20 demon@tin: Synchronized scap/plugins/clean.py: Completeness, testing, etc (duration: 00m 43s)
  • 23:14 reedy@tin: Synchronized wmf-config/CommonSettings.php: fix undefined user (duration: 00m 48s)
  • 23:08 ejegg: updated payments-wiki from d7ed144 to bd8012c
  • 22:49 eileen2: disable job Project Dedupe CiviCRM contacts (name-match)
  • 22:43 reedy@tin: Synchronized wmf-config/CommonSettings.php: Log users elevated groups on login attempts (duration: 00m 47s)
  • 22:07 reedy@tin: Synchronized wmf-config/CommonSettings.php: Only add oathauth-enable if the group exists on the wiki (duration: 00m 43s)
  • 22:01 reedy@tin: Synchronized wmf-config/CommonSettings.php: OATHAuth for more groups and bump their password requirements (duration: 00m 45s)
  • 22:00 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: OATHAuth for more groups and bump their password requirements (duration: 00m 45s)
  • 21:27 mdholloway: mobileapps deployed 1d09b98
  • 20:56 filippo@tin: Synchronized php-1.29.0-wmf.3/api.php: Revert bandaid from ori (duration: 00m 53s)
  • 20:22 Krenair: wikitech-static: moved to REL1_28
  • 20:09 hoo: Started dumpwikidatajson.sh on snapshot1007 (T151787)
  • 20:05 ottomata: rolling restart of eventlogging-service-eventbus in eqiad to pick up new python-tornado version bump from jessie backports (so it doesn't bite us unexpectedly later)
  • 20:05 chasemp: T150679 changes for user_properties view on labsdb1001 and 1003
  • 19:50 thcipriani@tin: Synchronized php-1.29.0-wmf.3/extensions/Wikidata: SWAT: Update Wikibase: Use the "redirect" table in SqlEntityIdPager T151356 (duration: 02m 11s)
  • 19:42 urandom: T151086: bootstrap of restbase2012-a.codfw.wmnet starting...
  • 19:40 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable Popups A/B test on Russian and Italian Wikipedias T144490 (duration: 00m 45s)
  • 19:34 thcipriani@tin: Synchronized wmf-config/CirrusSearch-production.php: SWAT: Increase Cirrus interwiki load test to 100% (T149740) (duration: 00m 46s)
  • 19:29 godog: swift eqiad-prod: ms-be1027 to weight 3000 - T136631
  • 19:27 thcipriani@tin: Synchronized wmf-config/Wikibase.php: SWAT: Use entity types for the repoNamespaces Wikibase client setting (duration: 00m 45s)
  • 19:23 urandom: Stupidly issued iptables -F on restbase2012 <facepalm />
  • 18:08 gehel: deploying latest WDQS GUI and Blazegraph
  • 17:20 _joe_: turned on the second commonswiki htmlCacheUpdate dedicated jobrunner (T151196)
  • 16:50 _joe_: upgrading HHVM on canaries
  • 16:49 _joe_: re-started the commonswiki htmlCacheUpdate dedicated jobrunner (T151196)
  • 16:35 addshore: RevisionSlider updates window finished 35 mins late!
  • 16:34 addshore@tin: Finished scap: RevisionSlider updates - gerrit:323384, gerrit:323520, gerrit:323521, gerrit:323808 (duration: 55m 25s)
  • 15:39 addshore@tin: Started scap: RevisionSlider updates - gerrit:323384, gerrit:323520, gerrit:323521, gerrit:323808
  • 15:36 addshore: Turned change by ori to local commit on tin in core on .3 branch
  • 15:35 jynus: restarting db1069 (sanitarium) instances to apply new replication filters T151752
  • 15:13 addshore: fixed /srv/mediawiki-staging/php-1.29.0-wmf.3 clone of mw core on tin due to T151676
  • 14:53 zeljkof: EU SWAT finished
  • 14:52 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow sysop to revoke users from some groups on ne.wikipedia (T148171) (duration: 00m 45s)
  • 14:37 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: [throttle] Remove old throttle rules (duration: 00m 44s)
  • 14:33 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: [throttle] Exception for #MOWomenOnWikipedia Edit-A-Thon (T151650) (duration: 00m 45s)
  • 14:33 chasemp: labsdb1001 maintain-views --databases testwiki enwikivoyage enwiki --table page_assessments_projects --debug
  • 14:31 chasemp: labsdb1001 maintain-views --databases testwiki enwikivoyage enwiki --table page_assessments --debug
  • 14:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: HD logos for multiple wikis (T150618) (duration: 00m 49s)
  • 14:18 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: HD logos for multiple wikis (T150618) (duration: 00m 47s)
  • 13:01 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2049 - T150876 (duration: 00m 45s)
  • 12:35 marostegui: Stop replication db1052 (depooled) - maintenance - T150802
  • 12:17 hoo: Killed the Wikidata json dumpers on snapshot1007 due to T151356. Will be restarted once a fix has been deployed.
  • 10:34 volans: fixed permissions of old /var/log/hhvm/error.log-20160829 on osmium
  • 10:19 volans: fixed permissions of files in /srv/mediawiki-staging on tin and mira
  • 10:18 _joe_: stopping the dedicated commonswiki htmlCacheUpdate job runner, T151196
  • 09:09 marostegui: Deploy ALTER table (add an index) db1040 (master) commonswiki.revision - T147305
  • 07:42 marostegui: Deploying alter table s5 dewiki.revision - T148967
  • 07:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 - T151272 (duration: 00m 47s)
  • 07:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Added comments to db1044 status - T150802 (duration: 00m 45s)
  • 07:08 marostegui: Stop MySQL on db1095 - maintenance T150802
  • 07:03 marostegui: Stop MySQL on db1044 - (depooled) maintenance - T150802
  • 02:05 Reedy: fixed localisationupdate clone of mw core on tin due to T151676
  • 02:00 l10nupdate@tin: LocalisationUpdate failed: git pull of core failed

2016-11-27

  • 21:47 legoktm: created wmf/1.29.0-wmf.3 branch pointing at master for mediawiki/extensions/ElectronPdfService to workaround T151725
  • 09:35 elukey: removed all the files not used in /tmp on stat1002 after a follow up with the owner
  • 06:20 ori@tin: Synchronized php-1.29.0-wmf.3/api.php: Bandaid: make API reqs fail fast if User-Agent ~= Parsoid and Host ~= eu.wikipedia.org (duration: 00m 50s)
  • 05:36 ori: Commented-out live-hack from mw1290; if we see memory growth now, Parsoid would be strongly implicated.
  • 05:33 ori: With Parsoid requests hacked to fail fast, mw1290 is not showing the kind of aggressive growth in memory usage we're seeing on other API servers
  • 05:30 godog: roll restarting hhvm across api_cluster when hhvm uses more than 40% of memory
  • 05:21 ori: Live-hacked api.php on mw1290 to die if request user-agent contains 'Parsoid'; restarted HHVM.
  • 05:17 godog: roll restarting hhvm across api_cluster when hhvm uses more than 40% of memory
  • 04:57 godog: roll-restart hhvm on api_appcluster on machines with hhvm leaking memory
  • 03:22 godog: roll-restart hhvm across api_appserver
  • 02:41 godog: dumping hhvm backtraces and roll-restart on affected api machines
  • 02:00 l10nupdate@tin: LocalisationUpdate failed: git pull of core failed

2016-11-26

  • 15:35 elukey: deleted tmp files on stat1002's /tmp partition because of disk space consumption. Will follow up with the owner.
  • 13:36 Krenair: ran refreshLinks on angwiki for T151584, it ran into issues with the EventBus extension at the links tables step
  • 12:29 volans: manually fixed the checkout of mediawiki core on stat1002 and stat1003 that was causing Puppet to fail
  • 02:22 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Nov 26 02:22:26 UTC 2016 (duration 4m 18s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.3) (duration: 06m 28s)

2016-11-25

  • 20:09 Krinkle: mwscript deleteEqualMessages.php --wiki angwiki (T45917)
  • 17:15 jynus: drop database vewikimedia (deleted wiki) from sanitarium and its slaves
  • 14:22 Reedy: delete oathauth row on wikitech for user Liuxinyu970226 per T144805
  • 14:16 Reedy: delete oathauth row on wikitech for user Shoichi per T144805
  • 11:05 ema: uploaded libvmod-{netmapper,tbf,vslp} to carbon main component (T150660)
  • 10:20 _joe_: upgrading HHVM across codfw
  • 09:23 _joe_: upgraded hhvm on the debug hosts
  • 08:58 _joe_: uploading hhvm_3.12.7+dfsg-1+wmf4 to apt
  • 08:53 volans: restarting zotero on sca1003, almost out of RAM, puppet failing
  • 08:52 elukey: restarting Yarn and HDFS masters on analytics100[12] (Hadoop cluster) to complete the openjdk update
  • 07:51 marostegui: Stopping replication db1052 for maintenance - T151607
  • 02:22 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Nov 25 02:22:40 UTC 2016 (duration 4m 20s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.3) (duration: 06m 48s)

2016-11-24

  • 17:25 _joe_: turned off additional workers for htmlcacheupdate on commonswiki as the queue has reduced to acceptable sizes (T151196)
  • 15:03 ema: uploaded varnish 4.1.3-1wm4 to carbon main component, replacing version 3.0.6plus-wm9 (T150660)
  • 14:47 ema: uploaded varnishkafka 1.0.12-1 to carbon main component, replacing version 1.0.7-1 (T150660)
  • 13:31 akosiaris: balance the load between thumbor1001 and thumbor1002 evenly
  • 13:31 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: thumbor1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=thumbor', 'service=thumbor'])
  • 13:20 akosiaris@puppetmaster1001: conftool action : set/weight=5; selector: thumbor1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=thumbor', 'service=thumbor'])
  • 13:04 akosiaris@puppetmaster1001: conftool action : set/weight=20; selector: thumbor1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=thumbor', 'service=thumbor'])
  • 12:54 gilles: restarting thumbor on thumbor1001
  • 12:49 akosiaris: lower thumbor1001 load by 50% to ease debugging
  • 12:48 gilles: restarting thumbor on thumbor1001
  • 12:48 akosiaris@puppetmaster1001: conftool action : set/weight=5; selector: thumbor1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=thumbor', 'service=thumbor'])
  • 12:36 elukey: launched preferred-replica-election to re-add kafka1022 among the Topic partition leader brokers of the Analytics Kafka cluster (all metrics look good)
  • 11:41 hoo: Killed the Wikidata JSON dump creation on snapshot1007: Won't succeed before Monday, due to T151356
  • 10:13 _joe_: running commonswiki htmlCacheUpdate jobs on terbium to catch up with the backlog, monitoring caches for vhtcpd queue overflows T151196
  • 09:38 marostegui: Stopping replication db1052 (depooled) for maintenance - T150960
  • 08:59 marostegui: Deploy alter table S5 - dewiki.revision on db1092 (depooled) - T148967
  • 08:15 _joe_: uploaded calico-cni 1.5.1 to jessie-wikimedia
  • 07:32 marostegui: Stopping MySQL db2070 for maintenance - https://phabricator.wikimedia.org/T149553
  • 02:35 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Nov 24 02:35:10 UTC 2016 (duration 5m 15s)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.3) (duration: 10m 39s)
  • 00:28 reedy@tin: Synchronized php-1.29.0-wmf.3/extensions/CentralAuth/maintenance/populateLocalAndGlobalIds.php: Some perf related improvements (duration: 00m 45s)
  • 00:12 demon@tin: Synchronized docroot/foundation/: rm more junk (duration: 00m 45s)

2016-11-23

  • 23:11 godog: cleanup older labs instances metrics from 'instances' hierarchy on graphite1001
  • 22:57 mutante: phab2001 - installing vim upgrade
  • 22:52 godog: cleanup older labs instances metrics from 'instances' hierarchy on graphite2001
  • 21:59 mutante: gerrit restarting for config change 323179
  • 21:07 demon@tin: Finished scap: pruning old deployment branches (duration: 19m 14s)
  • 20:48 demon@tin: Started scap: pruning old deployment branches
  • 20:42 XenoRyet: Updated payments-wiki from f8ca942 to d7ed144
  • 19:24 godog: swift eqiad-prod: ms-be1027 to weight 2000 T136631
  • 18:56 marostegui: Shutting down db2034 for maintenance - T149553
  • 18:04 volans@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2092.codfw.wmnet
  • 17:58 demon@tin: Synchronized php-1.29.0-wmf.3/extensions/CentralAuth/maintenance/populateLocalAndGlobalIds.php: (no message) (duration: 00m 53s)
  • 17:36 marostegui: Stopping MySQL on db2070 for maintenance - https://phabricator.wikimedia.org/T149553
  • 16:24 marostegui: Setting offline disk [32:4] on db1053 - looks like it is causing repl issues
  • 16:01 marostegui: Stopping replication db2070 for maintenance - T149553
  • 15:50 dcausse: elastic@eqiad: ruwiki reindex done (T148344)
  • 14:37 dcausse: elastic@eqiad: reindexing ruwiki from terbium, logs in ~dcausse/bm25_reindex/cirrus_log (T148344)
  • 14:33 jynus: rebooting, upgrading db1092 while it is depooled for maintenance
  • 14:31 marostegui: Stopping replication db1095 (not pooled) - maintenance - T150960
  • 11:48 _joe_: uploaded calico/kube-policy-controller:0.5.0 to the docker registry
  • 10:24 marostegui: Stopping replication on the following m3 hosts for maintenance - db1048, dbstore1002 (m3 instance), db2012 - T151384
  • 10:23 jynus: stopping replication to dbstore1001 to change its masters
  • 07:46 marostegui: Stopping MySQL db2070 for maintenance - T149553
  • 07:29 marostegui: Stopping replication on db1095 (depooled) for maintenance - T150960
  • 07:14 marostegui: Stopping replication on db1052 (depooled) for maintenance - T150960
  • 03:20 papaul: prometheus200[3-4] signing puppet certs, salt-key, initial run
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.3) (duration: 14m 13s)
  • 02:04 mutante: depooled mw2092 because it had I/O errors, dev sda
  • 02:03 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2092.codfw.wmnet
  • 01:47 Krenair: mw2092 seems broken
  • 01:44 krenair@tin: Synchronized php-1.29.0-wmf.3/extensions/VisualEditor/modules/ve-mw: https://gerrit.wikimedia.org/r/323080 and https://gerrit.wikimedia.org/r/323103 (duration: 00m 49s)
  • 01:13 bd808: Updated striker to c546f4c (T151409)
  • 00:04 maxsem@tin: Synchronized wmf-config/CommonSettings-labs.php: https://gerrit.wikimedia.org/r/#/c/323078/ (duration: 00m 49s)

2016-11-22

  • 23:02 reedy@tin: Synchronized wmf-config/InitialiseSettings-labs.php: EmailAuth to beta T151015 (duration: 00m 51s)
  • 23:00 reedy@tin: Synchronized wmf-config/CommonSettings-labs.php: EmailAuth to beta T151015 (duration: 00m 57s)
  • 22:59 reedy@tin: Synchronized wmf-config/extension-list-labs: EmailAuth to beta T151015 (duration: 00m 55s)
  • 22:37 ejegg: updated payments-wiki from 6412a22 to f8ca942
  • 21:31 cwd: updated payments from 84b035e to 6412a22
  • 19:58 cwd: updated payments wiki from ac14126 to 84b035e
  • 19:54 thcipriani@tin: Synchronized debug.json: SWAT: debug.json: update eqiad debug hosts (duration: 00m 49s)
  • 19:49 ottomata: restarting pybal on lvs2003 and lvs2006 for eventstreams in codfw
  • 19:48 godog: set thumbor access for temp containers - T150760
  • 19:26 thcipriani@tin: Synchronized portals: SWAT: Bumping wikipedia.org portal to master (T128546) (duration: 00m 56s)
  • 19:25 thcipriani@tin: Synchronized portals/prod/wikipedia.org/assets: SWAT: Bumping wikipedia.org portal to master (T128546) (duration: 00m 53s)
  • 19:20 jynus: rebooting es2019 for upgrade
  • 19:17 thcipriani@tin: Synchronized wmf-config/CirrusSearch-production.php: SWAT: [cirrus] Increase interwiki loadtest to 75% (T149740) (duration: 00m 55s)
  • 18:53 jynus: trying schema change on db1057 (enwiki.page) T69223
  • 18:50 jynus: trying schema change on db1082 (wikidatawiki.page) T69223
  • 18:17 mobrovac: restbase deploy end of 9c7822d
  • 17:56 mobrovac: restbase deploy start of 9c7822d
  • 17:47 jynus: performing schema change on db1094 (metawiki.page) T69223
  • 17:46 godog: swift eqiad-prod: ms-be1027 to weight 1000 T136631
  • 17:39 jynus: performing schema change on db1076 (enwiktionary.page) T69223
  • 17:35 jynus: performing schema change on db1075 (page) T69223
  • 17:31 jynus: performing schema change on db1078 (page) T69223
  • 17:21 gehel: re-enabling alerts for maps-test* servers
  • 16:51 bd808: Testing safesubst: log message recording on wiki
  • 16:46 bblack: roll back to globalsign R3-based intermediate for unified complete and confirmed on all hosts
  • 16:30 bblack: disabling puppet on caches to do post-merge fixup on chain certs for https://gerrit.wikimedia.org/r/#/c/322913/
  • 16:15 cmjohnson1: relocating dbprox1010/1011 to rack c5
  • 16:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1052 - T150960 (duration: 00m 49s)
  • 15:19 mobrovac: scb in codfw restarting all services to pick up the new firejail
  • 14:46 bblack: deploying new unified certs to cache_misc
  • 14:42 hashar@tin: Synchronized php-1.29.0-wmf.3/extensions/UploadWizard/resources/mw.UploadWizardLicenseInput.js: mw.UploadWizardLicenseInput: Correct unguarded for...in - T151220 (duration: 00m 49s)
  • 14:42 bblack: deploying new unified certs to cache_upload + cache_text
  • 14:36 bblack: deployed new unified certs to cache_maps
  • 14:33 bblack: disabling puppet on caches ahead of unified cert update
  • 14:32 hashar@tin: Synchronized wmf-config: Add missing $wgPropertySuggesterClassifyingPropertyIds for beta (duration: 00m 56s)
  • 14:32 _joe_: upgraded firejail on all scb nodes in codfw
  • 14:31 hashar@tin: scap aborted: Add missing $wgPropertySuggesterClassifyingPropertyIds for beta (duration: 09m 32s)
  • 14:22 hashar@tin: Started scap: Add missing $wgPropertySuggesterClassifyingPropertyIds for beta
  • 14:20 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1171.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=appserver', 'service=apache2'])
  • 14:20 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1170.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=appserver', 'service=apache2'])
  • 14:17 zeljkof: EU SWAT continues
  • 14:07 hashar: European SWAT completed
  • 14:04 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RevisionSlider (non BetaFeature) on de,ar,hewiki - T149995 T148646 T150573 (duration: 00m 54s)
  • 12:05 jynus: retrying schema change on db1040 (page) T151029
  • 11:57 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1091 (duration: 00m 49s)
  • 11:42 volans: fixed logrotate on cp1008, removed empty created .1.gz files T151314
  • 11:16 marostegui: Deploy ALTER table db1091 commonswiki.revision - T147305
  • 11:15 jynus: performing blocking schema change on db1091 (depooled) T151029
  • 11:07 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 (duration: 00m 57s)
  • 10:34 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1059 & db1084 T151029 (duration: 00m 51s)
  • 10:01 gehel: starting elasticsearch codfw cluster restart for JDK and nginx upgrade
  • 10:00 gehel: starting elasticsearch cluster restart for JDK and nginx upgrade
  • 09:46 hashar: Replaced slow Jenkins job operations-puppet-puppetlint-strict in favor of using 'rake test' which runs puppet-lint solely against files changed in HEAD https://gerrit.wikimedia.org/r/322839
  • 09:44 jynus: performing blocking schema change on db1084 (depooled) T151029
  • 09:36 jynus: performing blocking schema change on db1059 (depooled) T151029
  • 09:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1059 T151029 (duration: 00m 49s)
  • 09:00 marostegui: Deploy ALTER table db1084 commonswiki.revision - T147305
  • 08:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 - T147305 (duration: 00m 53s)
  • 08:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081 - T147305 (duration: 00m 54s)
  • 08:14 marostegui: Deploy ALTER table db1081 commonswiki.revision - T147305
  • 08:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 - T147305 (duration: 00m 50s)
  • 07:23 marostegui: Reboot db1092 for RAID controller upgrade - T151272
  • 06:57 marostegui: Stopping Replication on db2057 for maintenance - T150960
  • 06:54 marostegui: Stopping Replication on db1095 for maintenance - T150960
  • 02:32 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Nov 22 02:32:17 UTC 2016 (duration 4m 18s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.3) (duration: 10m 20s)
  • 02:05 ejegg: updated SmashPig from 3cbb42f to 36be698
  • 01:40 urandom: T151086: RESTBase: Starting 'a' instance Cassandra cleanups, rack 'b', codfw
  • 01:26 mutante: restbase2011 enabling puppet, initial run after activation with gerrit 322807
  • 01:14 mutante: restbase2011 - enabling puppet, running puppet, seeing error about missing secret, disabling puppet again
  • 00:54 ejegg: updated fundraising tools from 4f54cd8 to da80929
  • 00:52 godog: reboot ms-be2025 T151201
  • 00:47 bblack: cr[12]-ulsfo - added metric 15 to lvs4002 in policy LVS_import (for real this time) - T151273
  • 00:45 bblack: cr[12]-ulsfo - added metric 15 to lvs4002 in policy LVS_import - T151273
  • 00:35 bblack: depooled cp4008 (cache_text ulsfo) - T151275
  • 00:35 bblack@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4008.ulsfo.wmnet
  • 00:03 reedy@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 after crash T151272 (duration: 00m 59s)

2016-11-21

  • 23:53 robh: db1092, typo in my log!
  • 23:51 robh: db1095 alerted icinga, non-responsive to serial console (hard crash), rebooting
  • 23:38 twentyafterfour: disabling puppet on iridium until https://gerrit.wikimedia.org/r/#/c/322791/ lands
  • 23:37 awight: update civicrm from 3a77236 to 8c76f43
  • 23:16 ejegg: enabled donations queue consumer
  • 23:15 ejegg: updated fundraising dashboard from af8a493 to 43039fd
  • 23:03 ejegg: updated civicrm from 40efd0f to 3a77236
  • 23:01 ejegg: disabled donation queue consumer
  • 22:54 ejegg: updated payments-wiki from 3b3c8ce to ac14126
  • 22:36 robh: depooled ulsfo, unitedlayer has to do an emergency replacement of a failed pdu
  • 21:18 ori@tin: Synchronized docroot/noc: (no message) (duration: 00m 55s)
  • 21:17 ori@tin: Synchronized debug.json: (no message) (duration: 01m 06s)
  • 21:10 bearND: deployed mobileapps da269c3
  • 21:08 bearND: starting mobileapps deploy
  • 20:05 ejegg: restored civimail batch size to 400
  • 19:48 ejegg: enabled thank you mail job
  • 19:48 ejegg: updated CiviCRM from 27a9a2d to 40efd0f
  • 19:37 ejegg: changed thank you batch size from 400 to 5
  • 19:36 ejegg: disabled thank you send job
  • 19:30 thcipriani@tin: Synchronized php-1.29.0-wmf.3/extensions/VisualEditor: SWAT: Update VE core submodule to wmf/1.29.0-wmf.3 HEAD (68a1d94) (T151005) (duration: 00m 49s)
  • 19:28 godog: swift eqiad-prod ms-be1027 to weight 250 - T136631
  • 19:24 thcipriani@tin: Synchronized php-1.29.0-wmf.3/extensions/MobileFrontend/resources/skins.minerva.content.styles/hacks.less: SWAT: Correct flex display for thumbnail contents on mobile (T150706) (duration: 00m 59s)
  • 19:10 WMFLabs: testing
  • 17:29 elukey: unmasked kafka* on kafka1022 after disk swap
  • 17:12 reedy@tin: Synchronized wmf-config/trusted-xff.php: Update for forcepoint/websense (duration: 00m 50s)
  • 16:54 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 T151029 (duration: 00m 59s)
  • 16:07 jynus: performing blocking schema change on db1056 (depooled) T151029
  • 15:59 marostegui: Powering off es2019 for HW maintenance - T149526
  • 15:58 marostegui: Shutting down MySQL es2019 for HW maintenance - T149526
  • 15:52 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1056 T151029 (duration: 00m 53s)
  • 15:43 marostegui: Shutting down db2034 for HW maintenance - T149553
  • 14:50 zeljkof: EU SWAT finished!
  • 14:46 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Configure Babel for fr.wikibooks and fr.wikiversity (T146213) (duration: 00m 49s)
  • 14:40 Reedy: deleted oathauth row for wiki13 T151209
  • 14:21 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgForeignUploadTargets to [ local ] for zhwiki (T139257) (duration: 00m 50s)
  • 13:56 jynus: performing blocking schema change on db2058 T151029
  • 13:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1068 - T147305 (duration: 00m 49s)
  • 12:51 hashar: Restarted udp2log-mw service on deployment-fluorine02 . Was not available (ping T146723 T151169)
  • 12:17 marostegui: Stopping MySQL db1095 - maintenance - T150960
  • 11:56 elukey: restarted jobchron/runner on mw208[0-5] since systemd was reporting degradation (broken pipes in the journald logs)
  • 11:32 gehel: starting cluster restart on elasticsearch eqiad for JVM upgrade
  • 10:29 jynus: performing blocking schema change on db2065 T151029
  • 10:11 marostegui: Deploy ALTER table db1068 commonswiki.revision - https://phabricator.wikimedia.org/T147305
  • 10:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1068 - T147305 (duration: 00m 48s)
  • 09:49 ema: removed varnishrls.service from non-cache_text hosts
  • 08:50 elukey: rolling restart of hadoop-related java daemons on analytics* hosts due to openjdk update
  • 08:08 marostegui: Deploy ALTER table db1069 commonswiki.revision - https://phabricator.wikimedia.org/T147305
  • 07:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2070 - T149553 (duration: 00m 49s)
  • 06:11 legoktm@tin: Synchronized wmf-config/: Disable centralauth-rename right for maintenance (T148242, T151155) (duration: 00m 52s)
  • 02:21 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Nov 21 02:21:35 UTC 2016 (duration 4m 19s)
  • 02:17 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.3) (duration: 05m 46s)

2016-11-20

  • 22:42 WMFLabs: testing
  • 02:21 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Nov 20 02:21:19 UTC 2016 (duration 4m 18s)
  • 02:17 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.3) (duration: 06m 03s)

2016-11-19

  • 02:20 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Nov 19 02:20:37 UTC 2016 (duration 4m 18s)
  • 02:16 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.3) (duration: 05m 27s)

2016-11-18

  • 21:32 hashar: CI / Zuul slightly overloaded. Will resolve by itself soon.
  • 18:52 demon@tin: Synchronized wmf-config/CommonSettings.php: extdist settings for 1.28 (duration: 00m 49s)
  • 18:27 jynus: deployed unix_auth on silver (labswiki) T150446
  • 18:22 jynus: removing mysql-test dir from silver to free up some space there
  • 17:34 ori@tin: Synchronized wmf-config/InitialiseSettings.php: If3b80b1a: Revert "Don't use AbuseFilterCachingParser on bgwiki" (T148660) (duration: 00m 50s)
  • 16:48 ejegg: disabled job 'Dedupe CiviCRM contacts (name-match)'
  • 16:46 ejegg: disabled job 'Dedupe CiviCRM contacts'
  • 14:19 twentyafterfour: phabricator: deploying more fixes from upstream/stable to wmf/stable. Fixes T150992
  • 10:13 jynus: performing schema change on dbstore2001:commonswiki/page (ALGORITHM=COPY)
  • 09:14 marostegui: Stopping MySQL on db2070 to use it to clone another host - https://phabricator.wikimedia.org/T149553
  • 09:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2070 - T149553 (duration: 00m 48s)
  • 08:33 elukey: kafka1022 up and running with kafka* daemon masked and broken disk removed from fstab (we mount partitions in there using UUIDs)
  • 08:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2049 - T150876 (duration: 00m 49s)
  • 08:07 _joe_: physical powercycle of kafka1022 (broken disk)
  • 07:55 _joe_: rebooting kafka1022, a shower of defunct processes, kafka refuses to startup again
  • 07:43 _joe_: restarting kafka on kafka1022, too many open files
  • 07:24 moritzm: installing openjdk-7 security updates on trusty systems
  • 02:31 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Nov 18 02:31:52 UTC 2016 (duration 4m 40s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.3) (duration: 09m 18s)
  • 00:11 reedy@tin: Synchronized wmf-config/CommonSettings.php: wfLoadExtension for numerous extensions (duration: 00m 48s)
  • 00:10 reedy@tin: Synchronized wmf-config/extension-list: More extensions to extension.json (duration: 00m 48s)

2016-11-17

  • 23:42 krinkle@tin: Synchronized php-1.29.0-wmf.3/resources/src/mediawiki/mediawiki.js: Ie21f5c: undo temp revert for metric observation (duration: 00m 54s)
  • 23:22 krinkle@tin: Synchronized php-1.29.0-wmf.3/resources/src/mediawiki/mediawiki.js: Ie21f5c: temp revert cache-eval for metric observation (duration: 00m 49s)
  • 22:24 ejegg: updated SmashPig from 1e895ba to 3cbb42f
  • 22:08 mutante: contint1001 - Apache fix puppetized, re-enabled
  • 22:02 ejegg|food: updated SmashPig from cac1a19 to 1e895ba
  • 21:58 mutante: doc.wm.org Apache config live-hack fixed, puppet patch coming up
  • 21:02 Pchelolo: update RESTBase to b9722ba7c
  • 20:59 Pchelolo: update RESTBase to b9722ba7c - canary on restbase1007
  • 20:55 Pchelolo: update RESTBase to b9722ba7c - staging
  • 20:26 jynus: applying schema change on s1 (page) T69223
  • 20:07 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.3
  • 20:03 krinkle@tin: Synchronized php-1.29.0-wmf.3/extensions/WikimediaEvents/modules/ext.wikimediaEvents.visibilitychange.js: Ibd0935bef8f (duration: 00m 48s)
  • 19:58 ejegg: extended session timeouts on payments-wiki
  • 19:25 addshore@tin: Synchronized php-1.29.0-wmf.3/extensions/Wikidata: gerrit:322092 T150948 Backporting fix for quantity precision issue. (duration: 02m 17s)
  • 19:07 krenair@tin: Synchronized wmf-config/interwiki.php: https://gerrit.wikimedia.org/r/#/c/322141/ (duration: 00m 50s)
  • 18:57 bearND: deployed mobileapps bf44547
  • 18:53 bearND: starting mobileapps deploy
  • 18:36 twentyafterfour: phabricator: deploy upstream fix for T150971 (upstream sha1: 7ebc47d906fe )
  • 18:16 twentyafterfour: deploy more phabricator hotfixes
  • 17:46 twentyafterfour: unbreak search
  • 17:40 twentyafterfour: phabricator: deploying hotfix for T150965
  • 17:34 godog: bounce gerrit on cobalt after https://gerrit.wikimedia.org/r/321398
  • 17:11 twentyafterfour: restart apache2 on iridium to clear lagged queries refs T150965
  • 17:07 ejegg: increased paypal job runner message limit to 5000
  • 17:02 cmjohnson: powering down ms-be1016 to reseat the raid card.
  • 16:52 mobrovac: change-prop deploying ed3711b
  • 16:51 demon@tin: Synchronized docroot/: Unifying most docroots (duration: 00m 50s)
  • 16:45 demon@tin: Synchronized w/static.php: code duplication stuffs (duration: 00m 49s)
  • 16:40 godog: upgrade nginx on prometheus and thumbor machines
  • 16:28 Reedy: Deleted centralauth.oathauth_users row for Horst
  • 16:14 moritzm: restarting app server canaries to pick up libxslt update
  • 15:52 ema: rebooting primary LVS hosts for kernel updates
  • 15:44 ori@tin: Synchronized php-1.29.0-wmf.3/extensions/NavigationTiming/modules/ext.navigationTiming.js: I8e8ec96f: Don't report stats when page visibility changes during page load (duration: 00m 48s)
  • 15:42 ori@tin: Synchronized php-1.29.0-wmf.2/extensions/NavigationTiming/modules/ext.navigationTiming.js: I8e8ec96f: Dont report stats when page visibility changes during page load ; scap sync-file php-1.29.0-wmf.3/extensions/NavigationTiming/modules/ext.navigationTiming.js I8e8ec96f: Dont report stats when page visibility changes during page load (duration: 00m 51s)
  • 15:24 jynus: applying schema change on s4 (page) T69223
  • 15:10 moritzm: installing libxslt security updates
  • 15:02 ema: rebooting secondary (inactive) LVS hosts for kernel updates
  • 12:50 moritzm: upgrading imagemagick on remaining image scalers (T141739)
  • 10:47 moritzm: installing trusty kernel updates
  • 10:22 elukey: cleanup on analytics1027 - Removed mysql-server-5.5 (not used) and ran apt autoremove (old kernels)
  • 10:20 moritzm: upgrading imagemagick on mw1293 (T141739)
  • 10:17 jynus: applying schema change on s5 (page) T69223
  • 10:07 moritzm: temporarily disable puppet on elastic* for staged merge of ferm change
  • 09:30 marostegui: Reboot db2049 for maintenance - https://phabricator.wikimedia.org/T150876
  • 09:19 elukey: rebooting mc1019->mc1036 (memcached/redis servers, not taking any traffic) for kernel upgrades
  • 07:57 moritzm: uploaded imagemagick 8:6.8.9.9-5+deb8u5+wmf1 to carbon (T141739)
  • 02:55 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Nov 17 02:55:38 UTC 2016 (duration 5m 41s)
  • 02:49 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.3) (duration: 11m 26s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.2) (duration: 08m 08s)
  • 02:04 dzahn@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=phab1001-vcs.eqiad.wmnet
  • 01:56 mutante: conftool-merge, created node phab1001-vcs.eqiad.wmnet for cluster phabricator/git-ssh, removed node iridium-vcs...
  • 01:29 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Beta Cluster only (duration: 00m 49s)
  • 01:23 mutante: scheduled downtime for iridium and services (phab)
  • 01:23 mutante: temp disable puppet on iridium (maintenance)
  • 01:08 mutante: renamed iridium-vcs.eqiad to phab1001-vcs.eqiad (phabricator ssh)
  • 00:53 dereckson@tin: Synchronized php-1.29.0-wmf.3/extensions/VisualEditor/lib/ve/src/ui/: Make $returnFocusTo a no-op in WindowManager (T150556) (duration: 00m 49s)
  • 00:48 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Beta Features: Update whitelist (gerrit:321992) (duration: 00m 49s)
  • 00:48 bblack: repool cp3039 - T150879
  • 00:38 bblack: depooling cp3039 for hw/bios work - T150879
  • 00:37 bblack: depooling cp3039 for hw/bios work
  • 00:35 dereckson@tin: Synchronized wmf-config/CirrusSearch-production.php: Increase cirrus interwiki loadtest to 50% (T149740) (duration: 00m 48s)
  • 00:31 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Ban 100 most common passwords from ordinary accounts (duration: 00m 49s)
  • 00:18 mattflaschen@tin: Synchronized wikiversions-labs.json: Beta Cluster only (duration: 00m 53s)
  • 00:17 mattflaschen@tin: Synchronized dblists/all-labs.dblist: Beta Cluster only (duration: 00m 54s)

2016-11-16

  • 22:47 ejegg: updated CiviCRM from 5bdf00b to 27a9a2d
  • 22:42 mdholloway: mobileapps deployed 7b04c47
  • 22:39 mdholloway: starting mobileapps deployment
  • 21:45 cwd|afk: updated payments listeners from a0ae95e to cac1a19
  • 20:58 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to 1.29.0-wmf.3
  • 20:56 demon@tin: Synchronized private/: (no message) (duration: 00m 50s)
  • 20:50 demon@tin: Finished scap: llamas on the move! (duration: 05m 51s)
  • 20:44 demon@tin: Started scap: llamas on the move!
  • 20:33 reedy@tin: Synchronized wmf-config/CommonSettings.php: consistency after pulling to gerrit (duration: 00m 49s)
  • 19:08 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix notification icon path for foundationwiki (duration: 00m 49s)
  • 19:07 ejegg: updated CiviCRM from df50d2d to 5bdf00b
  • 18:01 ejegg: updated fundraising tool from d14d47a to 4f54cd8
  • 17:56 akosiaris@puppetmaster1001: conftool action : set/weight=5; selector: scb1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mobileapps'])
  • 17:56 akosiaris@puppetmaster1001: conftool action : set/weight=5; selector: scb1003.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mobileapps'])
  • 17:13 reedy@tin: Synchronized wmf-config/CommonSettings.php: change geoip name to stop upsetting ES (duration: 00m 48s)
  • 17:10 demon@tin: Synchronized docroot/foundation/: rm more fundraising junks (duration: 00m 54s)
  • 17:05 reedy@tin: Synchronized wmf-config/CommonSettings.php: Fix GeoIP (duration: 00m 49s)
  • 16:08 jynus: applying schema change on s7 (page) T69223
  • 16:03 paravoid: apt-get autoremove on analytics1028
  • 15:56 ori@tin: Synchronized wmf-config/InitialiseSettings.php: I506f17f6: Don't use AbuseFilterCachingParser on bgwiki (T148660) (duration: 00m 49s)
  • 15:47 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2066 - T150518 (duration: 00m 49s)
  • 15:20 ori@tin: Synchronized wmf-config/InitialiseSettings.php: I968050af3f: Re-enable AbuseFilterCachingParser everywhere (duration: 00m 50s)
  • 15:04 ema: rolling cache_upload upgrade to varnish 4.1.3-1wm4 and reboot with linux 4.4.2-3+wmf7
  • 14:59 zeljkof: EU SWAT finished! For real, this time.
  • 14:58 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Autopatrolled group for et.wikipedia.org (T150852) (duration: 00m 55s)
  • 14:40 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Autopatrolled group for et.wikipedia.org (T150852) (duration: 00m 51s)
  • 14:31 zeljkof: Starting EU SWAT, part two!
  • 14:20 zeljkof: EU SWAT finished
  • 14:18 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove patrol from autoconfirmed and reviewer for enwiki (T149019) (duration: 00m 49s)
  • 12:24 jynus: deploying unix_socket authentication to all core databases T150446
  • 12:05 moritzm: installing trusty kernel updates
  • 11:03 moritzm: rebooting labsdb1006 (OSM master) for kernel update
  • 11:01 ema: rolling cache_text upgrade to varnish 4.1.3-1wm4 and reboot with linux 4.4.2-3+wmf7
  • 10:52 moritzm: rebooting labsdb1007 (OSM slave) for kernel update
  • 10:49 tstarling@tin: Synchronized wmf-config/llama.php: (no message) (duration: 00m 48s)
  • 10:40 tstarling@tin: Synchronized wmf-config/llama.php: (no message) (duration: 00m 48s)
  • 10:35 tstarling@tin: Finished scap: (no message) (duration: 22m 34s)
  • 10:16 jynus: applying schema change on s2 (page) T69223
  • 10:12 tstarling@tin: Started scap: (no message)
  • 10:07 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2042 (duration: 00m 49s)
  • 09:44 mobrovac: parsoid deployed e41b235
  • 08:48 moritzm: installing libgd security updates on remaining app servers
  • 07:30 marostegui: Stopping replication in db2066 for maintenance - T150518
  • 02:52 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Nov 16 02:52:15 UTC 2016 (duration 5m 27s)
  • 02:46 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.3) (duration: 10m 41s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.2) (duration: 06m 08s)
  • 01:37 papaul: restbase201[0-2] - signing puppet certs, salt-key, initial run
  • 00:26 dereckson@tin: Synchronized wmf-config/CirrusSearch-production.php: Increase CirrusSearch interwiki load test to 25% (T149740) (duration: 00m 58s)
  • 00:25 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Allow interface-editor & engineer users to use OATHAuth (T150807) (duration: 00m 59s)

2016-11-15

  • 23:21 papaul: restbase201[0-2] OS install
  • 22:37 matanya: renaming Веденей to Serzh Ignashevich - user with +50k edits
  • 21:14 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.3
  • 21:04 thcipriani@tin: Finished scap: testwiki to 1.29.0-wmf.3 and rebuild l10n cache (duration: 30m 20s)
  • 20:34 thcipriani@tin: Started scap: testwiki to 1.29.0-wmf.3 and rebuild l10n cache
  • 20:16 demon@tin: Synchronized scap/plugins/prep.py: More scap goodies for MW (duration: 00m 49s)
  • 20:09 demon@tin: Synchronized docroot/foundation/: Removing old 2007 donation stuff, broken (duration: 00m 49s)
  • 20:02 demon@tin: Synchronized w/query.php: Serve http 410 instead of 500 (duration: 00m 48s)
  • 19:59 demon@tin: Synchronized scap/plugins/: pep8 + gitignore, mostly no-op (duration: 00m 49s)
  • 19:54 thcipriani@tin: Synchronized wmf-config/CirrusSearch-production.php: SWAT: Revert "Revert "Setup CirrusSearch interwiki load test"" (T149740) PART III (duration: 00m 48s)
  • 19:53 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Revert "Setup CirrusSearch interwiki load test"" (T149740) PART II (duration: 00m 48s)
  • 19:51 thcipriani@tin: Synchronized wmf-config/CirrusSearch-interwikiSources.php: SWAT: Revert "Revert "Setup CirrusSearch interwiki load test"" (T149740) PART I (duration: 01m 43s)
  • 19:25 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: for beta, no-op, completeness (duration: 00m 58s)
  • 19:05 thcipriani: starting branching for 1.29.0-wmf.3
  • 18:57 bd808: Archived oldest 3 months of SAL data; tried to optimize Module:SAL to speed up rendering
  • 18:26 ostriches: gerrit: back up, running 2.12.5-dirty now :)
  • 18:25 _joe_: uploaded calicoctl_1.0.0-beta-rc5~wmf1_amd64.deb to jessie-wikimedia T150434
  • 18:24 ostriches: gerrit: bringing down for a minute or two for quick point upgrade, T143089
  • 18:23 jynus: applying schema change on s6 (page) T69223
  • 18:22 _joe_: uploading calico/node:1.0.0-beta-rc5 to the docker registry T150434
  • 18:21 bblack: X
  • 16:06 ema: rolling cache_maps upgrade to varnish 4.1.3-1wm4 and reboot with linux 4.4.2-3+wmf7
  • 15:52 ema: upgrading cp2015 to varnish 4.1.3-1wm4 and rebooting with linux 4.4.2-3+wmf7
  • 14:22 zeljkof: EU SWAT finished!
  • 14:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RevisionSlider (non BF) on test & mediawiki wikis (T149724) (duration: 00m 57s)
  • 14:09 zeljkof: starting EU SWAT
  • 13:30 hashar: Jouncebot is back
  • 13:27 hashar: Attempting to restart jouncebot
  • 13:21 moritzm: installing dbus security updates on trusty systems
  • 13:13 moritzm: restarting app server canaries to pick up libgd security update
  • 12:39 moritzm: installing pillow security updates on eqiad app servers
  • 12:30 moritzm: installing libgd security updates
  • 12:01 marostegui: Deploy schema change labsdb1003 s4 commonswiki revision table T147305
  • 09:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1064 - T149079 (duration: 00m 51s)
  • 07:53 moritzm: install remaining curl security updates in eqiad and codfw
  • 07:47 marostegui: Deploy schema change s4 commonswiki.revision db1064 - T147305
  • 04:24 mutante: !log (test logging)
  • 04:24 gerrit: 316497 was merged and logging from puppetmaster1001 over to icinga still works
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.2) (duration: 06m 13s)
  • 01:12 mutante: deploying ghostscript regression update on Videoscalers (and manually on osmium)
  • 01:08 mutante: deploying ghostscript regression update on Imagescalers
  • 01:05 mutante: deploying ghostscript regression update on API appservers
  • 01:01 mutante: deploying ghostscript regression update on jobrunner-eqiad
  • 00:56 mutante: bromine - deleted some un-packed mediawiki release versions from home/csteipp/releasetools for disk space (back to 80%)
  • 00:51 mutante: bromine - apt-get clean for disk space
  • 00:24 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/321118/ (duration: 00m 47s)

2016-11-14

  • 23:30 MaxSem: Assigned email for my bot, RoboMaxCyberSem
  • 23:28 cwd: updated smashpig payment listeners from 142e60b to a0ae95e
  • 23:18 godog: add 150G to labsdb1004:/srv/labsdb to get it out of warning threshold T150553
  • 23:07 godog: silence disk space alerts on labsdb1004 for 4h while investigating recurrence - T150553
  • 21:26 demon@tin: Synchronized scap/plugins/patch.py: Patching tool for fun and profit (duration: 00m 49s)
  • 21:17 demon@tin: Synchronized wmf-config/: Coding style fixes (duration: 00m 49s)
  • 21:07 mutante: deploying regression update for ghostscript (DSA-3691-2) on all eqiad mw appservers
  • 21:05 demon@tin: Synchronized fonts/: For completeness. Also for co-master git sync (duration: 01m 09s)
  • 20:59 mutante: deploying regression update for ghostscript (DSA-3691-2) on all codfw mw appservers
  • 20:47 mutante: deploying regression update for ghostscript (DSA-3691-2) on MW API appservers
  • 20:21 demon@tin: Synchronized .gitignore: For completeness (duration: 00m 46s)
  • 20:15 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Allow 2FA for the abusefilter group if enabled on wiki (duration: 00m 53s)
  • 19:54 thcipriani: ran: mwscript extensions/Translate/scripts/ttmserver-export.php --wiki=frwiktionary
  • 19:53 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable translation memory of Translate for frwiktionary (T150146) (duration: 00m 47s)
  • 19:40 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Ban ten most popular passwords from fawiki (T150570) (duration: 00m 46s)
  • 19:17 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: MF Beta: Enable moving first paragraph before infobox (T149830) (duration: 00m 52s)
  • 19:01 demon@tin: Synchronized w/static.php: Removing useless function param (duration: 01m 00s)
  • 16:19 bblack: cache_upload - seamless nginx restart for libssl1.1 upgrade - T150561
  • 16:07 bblack: cache_text - seamless nginx restart for libssl1.1 upgrade - T150561
  • 16:04 Krenair: Setting new password on User:Ckoerner and requesting an unlock so user can recover account
  • 15:35 bblack: cache_maps - seamless nginx restart for libssl1.1 upgrade - T150561
  • 15:33 bblack: cache_misc - seamless nginx restart for libssl1.1 upgrade - T150561
  • 15:31 bblack: upgrade libssl1.1 package to 1.1.0c-1+wmf2 on cache clusters - T150561
  • 15:21 ori: statsv: deployed I01b0e885d; service now running with systemd watchdog supervision.
  • 15:13 bblack: uploaded libssl1.1 1.1.0c-1+wmf2 to jessie-wikimedia/backports - T150561
  • 15:11 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1239.*
  • 15:11 ori: statsv: deployed Ie471fa762
  • 15:09 _joe_: ran sync-common on mw1239, T148421
  • 14:54 zeljkof: EU SWAT finished
  • 14:52 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Abenaki language (abe) to Wikidata (T150633) (duration: 00m 48s)
  • 14:40 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Verify license tags for custom license in Commons UploadWizard (T140903) (duration: 00m 47s)
  • 14:28 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: HD logos for multiple wikis (T150618) (duration: 00m 48s)
  • 14:28 dcausse: elastic@eqiad T150232: reindex commonswiki (content and general indices only) (logs terbium:~dcausse/commons_reindex/cirrus_log)
  • 14:27 zfilipin@tin: Synchronized static/images/project-logos: SWAT: HD logos for multiple wikis (T150618) (duration: 00m 49s)
  • 14:25 zfilipin@tin: Synchronized static/images/project-logos/bgwiki-2x.png: SWAT: HD logos for multiple wikis (T150618) (duration: 00m 47s)
  • 14:24 zfilipin@tin: Synchronized static/images/project-logos/bgwiki-1.5x.png: SWAT: HD logos for multiple wikis (T150618) (duration: 00m 47s)
  • 14:23 zfilipin@tin: Synchronized static/images/project-logos/bawiki-2x.png: SWAT: HD logos for multiple wikis (T150618) (duration: 00m 47s)
  • 14:22 zfilipin@tin: Synchronized static/images/project-logos/bawiki-1.5x.png: SWAT: HD logos for multiple wikis (T150618) (duration: 00m 48s)
  • 14:20 zfilipin@tin: Synchronized static/images/project-logos/avwiki-2x.png: SWAT: HD logos for multiple wikis (T150618) (duration: 00m 51s)
  • 14:19 zfilipin@tin: Synchronized static/images/project-logos/avwiki-1.5x.png: SWAT: HD logos for multiple wikis (T150618) (duration: 01m 10s)
  • 14:14 dcausse: elastic@eqiad T150232: commonswiki reindex failed again
  • 14:06 ema: downgrading varnish on cp3043 to 4.1.3-1wm3
  • 13:39 marostegui: Stopping replication in db2066 for maintenance - T150518
  • 13:21 moritzm: upgrading pillow on thumbor1002 to 3.4.2 (latest version from backports with security fixes)
  • 12:48 moritzm: installing pillow security updates on non-mediawiki systems
  • 12:23 moritzm: installing pillow security updates on mediawiki canary servers
  • 12:03 moritzm: upgrading pillow on thumbor1001 to 3.4.2 (latest version from backports with security fixes)
  • 09:37 dcausse: elastic@eqiad T150232: reindexing commonswiki from terbium (logs in ~dcausse/commons_reindex/cirrus_log)
  • 09:11 marostegui: Deploy schema change s4 commonswiki templatelinks db1064 - T149079
  • 08:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064 - T149079 (duration: 03m 07s)
  • 08:21 marostegui: Deploy schema change labsdb1001 s4 commonswiki revision table (T147305)
  • 08:20 moritzm: installing curl security updates/rolling restart of app servers in eqiad
  • 07:52 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2066 - T150518 (duration: 03m 09s)
  • 02:41 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Nov 14 02:41:12 UTC 2016 (duration 4m 16s)
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.2) (duration: 25m 04s)

2016-11-13

  • 18:53 volans: rmmod acpi_pad on mira
  • 02:20 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Nov 13 02:20:58 UTC 2016 (duration 4m 18s)
  • 02:16 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.2) (duration: 05m 34s)
  • 01:24 Krenair: Spoken to User:Nirzardp for T150554, set a new password

2016-11-12

  • 19:53 tgr: deployed patch for T150554
  • 18:42 Krenair: done with my shell-granted sysop flag on foundationwiki, have removed it
  • 14:59 reedy@tin: Synchronized wmf-config/CommonSettings.php: Enable OATHAuth for all sysop, crat, oversight and checkuser (duration: 00m 47s)
  • 14:33 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable OATHAuth on fishbowl wikis, bump password requirements (duration: 00m 50s)
  • 14:26 Reedy: Created OATHAuth tables on all fishbowl wikis
  • 13:37 Krenair: `mwscript createAndPromote.php foundationwiki --sysop "Alex Monk (WMF)" --force` temporarily
  • 02:21 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Nov 12 02:21:40 UTC 2016 (duration 4m 29s)
  • 02:17 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.2) (duration: 05m 30s)

2016-11-11

  • 21:15 hashar: Restarted Jenkins. This time ZMQ managed to bind to port 8888
  • 21:11 hashar: jenkins: disabled/reenabled the ZMQ Event Publisher. Apparently it refused to start
  • 21:06 hashar: Restarted Jenkins
  • 14:03 mobrovac: restarting RESTBase to pick up https://gerrit.wikimedia.org/r/#/c/320529/
  • 14:02 moritzm: restarting hhvm on canary app servers to pick up libcurl update
  • 13:11 moritzm: installing curl security updates
  • 11:10 ema: cp3043 repooled with gethdr_extrachance=100 (T150503)
  • 10:59 ema: cp3043 depooled, testing https://phabricator.wikimedia.org/P4406 (T150503)
  • 10:51 elukey: restored mw1284 to its normal settings
  • 10:14 marostegui: Deploy alter table dbstore1002 s4 commonswiki.revision - T147305
  • 10:05 elukey: increasing apache log level on mw1284 (depooling, applying config manually, re-pooling with lower weight) for a 503 investigation
  • 09:39 marostegui: Deploy schema change s4 commonswiki.revision db1069 - T147305
  • 07:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1068 - T149079 (duration: 00m 48s)
  • 02:28 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Nov 11 02:28:24 UTC 2016 (duration 5m 14s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.2) (duration: 04m 56s)
  • 01:45 mutante: gerrit now has higher "packedGitLimit" of 2g, goal is to reduce Gerrit slowdowns
  • 01:39 mutante: gerrit restarting for config change 317322 (T148478)
  • 01:04 godog: revert swift ring change for ms-be1027
  • 00:23 godog: swift eqiad-prod: ms-be1027 to weight 1000 - T136631
  • 00:18 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Remove registered trademark symbol from officewiki footer (T95007) (duration: 00m 48s)
  • 00:15 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable {{NOINDEX}} as a noindex template on enwiki (2/2) (T149538) (duration: 00m 47s)
  • 00:13 catrope@tin: Synchronized wmf-config/CommonSettings.php: Enable {{NOINDEX}} as a noindex template on enwiki (1/2) (T149538) (duration: 00m 49s)

2016-11-10

  • 23:10 mutante: mw1185 - service hhvm restart
  • 23:00 maxsem@tin: Finished scap: https://gerrit.wikimedia.org/r/#/c/320864/ (duration: 22m 42s)
  • 22:38 maxsem@tin: Started scap: https://gerrit.wikimedia.org/r/#/c/320864/
  • 22:35 maxsem@tin: scap sync-l10n completed (1.29.0-wmf.2) (duration: 01m 01s)
  • 21:15 papaul: kafka2003 - signing puppet certs, salt-key, initial run
  • 21:09 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.2
  • 21:08 twentyafterfour@tin: Synchronized php-1.29.0-wmf.2/extensions/ORES/includes/ApiHooks.php: deploy I86e97b (duration: 00m 47s)
  • 20:54 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: Revert "all wikis to 1.29.0-wmf.2" (Fatal errors spike)
  • 20:49 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.2
  • 20:33 mutante: neon - deactivated puppet node, scheduled icinga downtime, shutdown server permanently (T125023)
  • 20:32 mutante: neon - shutdown -h now (scheduled 3 days downtime, nothing that looked worth saving in homes)
  • 20:28 mutante: neon - deactivate puppet node
  • 20:25 twentyafterfour@tin: Synchronized wmf-config/InitialiseSettings-labs.php: sync LABS: enable mapframe everywhere (I35709e) (duration: 00m 50s)
  • 19:37 dcausse: elastic@eqiad: reindexing commonswiki (logs in terbium.eqiad.wmnet:~dcausse/commons_reindex/cirrus_log) - T150232
  • 19:33 bblack: cache_*: restarting nginx for libssl update (seamless)
  • 19:31 bblack: upgrading libssl1.1 to 1.1.0c on other misc hosts...
  • 19:30 Pchelolo: update RESTBase to 6bfa0f75f
  • 19:24 Pchelolo: update RESTBase to 6bfa0f75f - canary on restbase1007
  • 19:22 bblack: upgrading openssl to 1.1.0c on cache_*
  • 19:19 Pchelolo: update RESTBase to 6bfa0f75f - staging
  • 19:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add PageViewInfo log channel (T129602) (duration: 00m 49s)
  • 19:09 thcipriani@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: Make beta PageViewInfo use the production pageview API (T129602) (labs-only-change) (duration: 00m 48s)
  • 19:02 cwd: updated payments from c1fa73c to 3b3c8ce
  • 18:57 moritzm: uploaded openssl 1.1.0c for jessie-wikimedia to carbon
  • 18:42 paravoid: upgrading (and restarting) nginx on sodium
  • 17:46 urandom: T133395: Convert final 25 RESTBase tables to TWCS
  • 17:34 cwd: deployed EVERYTHING... changed config tree to c6a7b17 and civicrm-authonly from 56eadab to df50d2d
  • 17:33 cwd: restarted slander
  • 17:32 cwd: restarted dash
  • 16:58 urandom: T133395: Performing next 25 RESTBase table conversions to TWCS
  • 15:09 marostegui: Deploy schema change s4 commonswiki.templatelinks (db1068) - https://phabricator.wikimedia.org/T149079
  • 15:01 elukey: restored mw1284 to its settings
  • 14:53 jynus: applying schema change on s3 (page) T69223
  • 14:47 elukey: de-pooling mw1284 to raise mod_proxy_fcgi log level manually (temporary for an ongoing investigation)
  • 14:31 bblack: cache_text: upgrade nginx to 1.11.4-1+wmf14
  • 14:20 bblack: cache_upload: upgrade nginx to 1.11.4-1+wmf14
  • 14:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1068 - T149079 (duration: 00m 48s)
  • 13:54 bblack: restarting varnishes on cache_maps + cache_misc
  • 13:51 marostegui: Enabled gtid+ssl on db2010,db2012,db2030
  • 13:45 bblack: cache_misc: upgrade nginx to 1.11.4-1+wmf14
  • 13:29 bblack: cache_maps: upgrade nginx to 1.11.4-1+wmf14
  • 13:25 marostegui: Restarting mysql in misc shard slaves (only codfw - db2010,db2012,db2030) to apply a MySQL config - T149418
  • 13:20 gehel: restart wdqs1* for jvm update
  • 13:07 gehel: restart wdqs2* for jvm update
  • 12:01 ema: upgrading cp1068 (text-eqiad) to varnish 4 -- T131503
  • 11:45 ema: upgrading cp1067 (text-eqiad) to varnish 4 -- T131503
  • 11:25 ema: upgrading cp1066 (text-eqiad) to varnish 4 -- T131503
  • 11:08 ema: upgrading cp1065 (text-eqiad) to varnish 4 -- T131503
  • 10:43 ema: upgrading cp1055 (text-eqiad) to varnish 4 -- T131503
  • 10:29 ema: upgrading cp1054 (text-eqiad) to varnish 4 -- T131503
  • 10:26 _joe_: powercycling mw1280, unresponsive to ping, blank unresponsive console
  • 10:25 moritzm: rolling restart of zookeeper in eqiad to pick up java security update
  • 10:10 moritzm: rolling restart of zookeeper in codfw to pick up java security update
  • 09:53 ema: upgrading cp1053 (text-eqiad) to varnish 4 -- T131503
  • 09:44 marostegui: Deploy gtid_domain_id mysql flag for misc shards - https://phabricator.wikimedia.org/T149418
  • 09:43 elukey: restarting druid daemons on druid100[123] for openjdk updates
  • 09:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: wmf-config/db-codfw.php Depool db1059 - T149079. Repool db2048 T150334 (duration: 00m 50s)
  • 09:22 dcausse: elastic@codfw: reindexing commonswiki (logs in wasat.codfw.wmnet:~dcausse/commons_reindex/cirrus_log)
  • 09:06 moritzm: rolling restart of elasticsearch on logstash100[4-6] for picking up a Java security update
  • 08:55 moritzm: rebooting ruthenium for kernel update
  • 08:30 moritzm: rebooting copper for kernel update
  • 08:06 moritzm: rebooting bast3001 for kernel update
  • 08:01 marostegui: Deploy schema change s4 commonswiki.revision (dbstore1002) - T147305
  • 04:28 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Beta Cluster only (duration: 00m 51s)
  • 04:26 mattflaschen@tin: Synchronized wmf-config/CommonSettings-labs.php: Beta Cluster only (duration: 00m 59s)
  • 02:38 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Nov 10 02:38:24 UTC 2016 (duration 4m 36s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.2) (duration: 05m 52s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.1) (duration: 07m 23s)
  • 00:46 bblack: nginx-1.11.4-1+wmf14 uploaded to carbon jessie-wikimedia (only deployed to cp1008 for now) - T93927 - T148917 - T144523
  • 00:37 godog: remove files on iridium:/tmp older than 5d - T150396
  • 00:14 maxsem@tin: Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/313211/ (duration: 00m 52s)
  • 00:09 maxsem@tin: Synchronized wmf-config/CirrusSearch-labs.php: [labs only] https://gerrit.wikimedia.org/r/#/c/313037/ (duration: 00m 47s)
  • 00:06 maxsem@tin: Synchronized wmf-config/CirrusSearch-labs.php: [labs only] https://gerrit.wikimedia.org/r/#/c/313037/ (duration: 00m 48s)

2016-11-09

  • 23:38 godog: update cassandra aggregation scheme for 'count' metrics - T121789
  • 23:25 godog: silence lutetium flapping check_mysql for two days
  • 21:53 eileen1: jobs stopped, dedupe (*2) donations queue
  • 21:21 eileen1: update CiviCRM from 7ee2ce4 to df50d2d
  • 21:18 bearND: deployed mobileapps 106f4cd
  • 21:14 bearND: starting mobileapps deploy
  • 20:59 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.2
  • 20:32 urandom: T133395: Converting next 25 RESTBase tables to time-window compaction
  • 20:13 Krenair: created missing wikilove_log tables on azwiki and labtestwiki - T150321
  • 20:05 Krinkle: Killed statsv.py process on hafnium. Seems to have fixed it.
  • 20:05 Krinkle: statsv->graphite has been down for 9 hours since roughly 10AM UTC
  • 19:51 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable <mapframe> on ruwiki (T138057) (duration: 00m 48s)
  • 19:12 godog: upload cassandra-tools-wmf 1.0.0-1 to jessie-wikimedia on carbon - T150304
  • 18:47 marostegui: Stopping MySQL dbstore2001 - taking a snapshot - T149457
  • 18:37 jynus: partitioning db2042 - it will temporarily lag for 10-20 hours
  • 18:36 mutante: neon stopping nsca and apache
  • 18:33 mutante: neon (formerly icinga) remove from puppet, revoke cert, delete salt key, stop icinga service ...
  • 18:33 godog: deploy python-thumbor-wikimedia 0.1.29 to thumbor100[12]
  • 17:41 robh: the puppet failures on the frack hosts are known and have been reported to jeff
  • 17:34 jynus: rebooting db2048 for kernel upgrade
  • 16:18 ema: upgrading cp3043 (text-esams) to varnish 4 -- T131503
  • 15:58 ema: upgrading cp3042 (text-esams) to varnish 4 -- T131503
  • 15:47 urandom: T133395: Converting the next 25 RESTBase keyspaces to TWCS
  • 15:34 ema: upgrading cp3041 (text-esams) to varnish 4 -- T131503
  • 15:30 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2048 (duration: 00m 50s)
  • 15:23 ema: upgrading cp3040 (text-esams) to varnish 4 -- T131503
  • 15:11 ema: upgrading cp3033 (text-esams) to varnish 4 -- T131503
  • 15:07 hashar: restarting Jenkins (java update)
  • 14:59 gehel: clear zero sized log files on logstash* (leftover from disk space issues)
  • 14:58 ema: upgrading cp3032 (text-esams) to varnish 4 -- T131503
  • 14:56 zeljkof: EU SWAT finished
  • 14:53 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Cleanup unused config variables (T148853) (duration: 00m 48s)
  • 14:48 zfilipin@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: Cleanup unused config variables (T148853) (duration: 00m 47s)
  • 14:41 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: [cirrus] Activate BM25 on top 10 wikis: Step 3 (T147508) (duration: 00m 48s)
  • 14:32 ema: upgrading cp3031 (text-esams) to varnish 4 -- T131503
  • 14:29 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: [cirrus] Increase the number of shards to 15 for commonswiki_file (T148736) (duration: 00m 49s)
  • 14:28 zfilipin@tin: Synchronized tests/cirrusTest.php: SWAT: [cirrus] Increase the number of shards to 15 for commonswiki_file (T148736) (duration: 00m 58s)
  • 14:13 elukey: rebooting kafka1014.eqiad.wmnet for kernel and openjdk upgrades
  • 14:11 ema: upgrading cp3030 (text-esams) to varnish 4 -- T131503
  • 13:58 elukey: stopping kafka* daemons on kafka1014 to upgrade its fstab with UUID (T147879)
  • 13:46 elukey: rebooting kafka1012 for kernel and openjdk updates
  • 13:35 elukey: stopping kafka* daemons on kafka1012 to upgrade its fstab with UUID (T147879)
  • 13:02 kart__: Update cxserver to 17f9deb
  • 12:57 elukey: rebooting kafka1022 for kernel + openjdk updates
  • 12:14 hashar: CI gate for MediaWiki is back. Reverted an oojs-ui version bump that triggered test failures but was not caught properly by CI. T150323
  • 11:17 hashar: CI gate for MediaWiki fails tests. On it. See https://phabricator.wikimedia.org/T150323
  • 11:10 moritzm: rebooting mw1162 for kernel update
  • 11:09 ema: finished upgrading cache_text ulsfo to varnish 4.1.3-1wm3 T150247
  • 11:08 hashar: contint1001 apt-get upgrade packages and purging unneeded ones (left over from a puppet manifest that is no longer applied)
  • 10:52 elukey: restarting kafka* on kafka1013 for openjdk upgrades
  • 10:38 mobrovac: change-prop deploying e0040ac
  • 10:33 elukey: rebooting kafka1020 for kernel and openjdk upgrades
  • 10:21 moritzm: rebooting nescio for kernel update
  • 10:11 jynus: stopping and reimaging db2042 for upgrade
  • 10:04 ema: upgrading cache_text ulsfo to varnish 4.1.3-1wm3 T150247
  • 10:03 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2042 (duration: 00m 48s)
  • 09:55 moritzm: rebooting maerlant for kernel update
  • 09:35 moritzm: rebooting hydrogen for kernel update
  • 09:35 elukey: rebooting kafka1018 for kernel + openjdk upgrade
  • 08:43 moritzm: rebooting bast1001 for kernel update
  • 08:25 moritzm: rolling reboot of logstash1002/1003 for kernel update
  • 08:16 moritzm: restarted ntp on mw2128 (was stuck in XFAC state)
  • 04:12 moritzm: rebooting notebook1001/1002 for kernel update
  • 03:47 moritzm: installing java security updates on meitnerium/archiva
  • 03:44 apergos: rolling reboots of mw2097-2134 for new kernel
  • 03:44 mutante: gallium.wikimedia.org removed from DNS (T95757)
  • 02:56 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Nov 9 02:56:22 UTC 2016 (duration 5m 24s)
  • 02:51 moritzm: uploaded libuv 1.9.0 for jessie-wikimedia to carbon
  • 02:50 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.2) (duration: 10m 39s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.1) (duration: 05m 42s)

2016-11-08

  • 23:25 eileen1: update CiviCRM from 63d35fc to 7ee2ce4
  • 22:40 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 wikis to 1.29.0-wmf.2
  • 22:34 twentyafterfour@tin: Finished scap: testwikis to 1.29.0-wmf.2 (duration: 49m 32s)
  • 22:22 moritzm: rebooting ms1001 for kernel update
  • 22:01 moritzm: rolling reboot of mc2* for kernel update
  • 21:56 Pchelolo: RESTBase update to 1d72b8abc
  • 21:44 twentyafterfour@tin: Started scap: testwikis to 1.29.0-wmf.2
  • 21:40 mutante: gallium, ex-CI server, shutdown -h now (the contents of your home dir have been copied to contint1001 in /home/gallium-home/) (T95757)
  • 21:39 mutante: gallium, ex-CI server, shutdown -h now (the contents of your home dir have been copied to contint1001 in /home/gallium-home/)
  • 21:30 Pchelolo: RESTBase update to 1d72b8abc - canary on restbase1007
  • 21:21 Pchelolo: RESTBase update to 1d72b8abc - staging
  • 20:55 godog: upload prometheus-memcached-exporter 0.3.0+ds1-1 to carbon - T147326
  • 20:31 apergos: rolling restarts of mw1218-1222 for new kernel
  • 20:09 mobrovac: change-prop deploying 0c29003
  • 19:39 ejegg: updated fundraising tools from 7ff719a to d14d47a
  • 19:29 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 (duration: 00m 48s)
  • 19:22 urandom: T133395: Converting 25 additional RESTBase tables to TWCS
  • 19:19 thcipriani@tin: Synchronized wmf-config/CommonSettings-labs.php: SWAT: LABS: fixed incorrect $wgGraphAllowedDomains (housekeeping sync) (duration: 02m 42s)
  • 19:12 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set timezone for bdwikimedia to "Asia/Dhaka" (T150252) (duration: 00m 47s)
  • 18:59 apergos: rolling reboots of mw1180-1188 for new kernel
  • 18:12 jynus: performing schema change templatelinks on db1089 T139090
  • 18:09 apergos: rolling reboots of mw1170-1179 for new kernel
  • 18:06 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 to safely apply pending schema change (duration: 01m 02s)
  • 17:54 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1080 (duration: 02m 45s)
  • 17:41 Krinkle: mwscript deleteEqualMessages.php --wiki ptwikinews (T45917)
  • 17:39 reedy@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Fix variable typo (duration: 00m 59s)
  • 17:27 apergos: rolling restarts of mw1209 - mw1216 for new kernel
  • 17:20 reedy@tin: Synchronized wmf-config/CommonSettings-labs.php: Graphs config (duration: 00m 47s)
  • 17:19 reedy@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Enable Revision Slider (duration: 00m 47s)
  • 17:19 ema: upgrade finished -> cache_text codfw to varnish 4.1.3-1wm3 T150247
  • 17:15 reedy@tin: Synchronized wmf-config/CommonSettings-labs.php: Add PageViewInfo, Remove dupe OATHAuth config (duration: 00m 47s)
  • 17:13 reedy@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Add PageViewInfo (duration: 00m 46s)
  • 17:12 reedy@tin: Synchronized wmf-config/extension-list-labs: Add PageViewInfo (duration: 00m 46s)
  • 16:31 apergos: rolling restart of mw1204-1208 for new kernel
  • 16:24 jynus: performing schema change templatelinks on db1080 T139090
  • 16:21 Krinkle: mwscript deleteEqualMessages.php --wiki kkwiki (T45917)
  • 16:18 ema: upgrading cache_text codfw to varnish 4.1.3-1wm3 T150247
  • 16:11 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1052; depool db1080; reorganize traffic weight for s1 - second try (duration: 00m 46s)
  • 16:04 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1052; depool db1080; reorganize traffic weight for s1 (duration: 00m 46s)
  • 15:23 hashar@tin: Synchronized php-1.29.0-wmf.1/extensions/Kartographer/modules/maplink/maplink.js: Search .mw-body instead of #content to support all the skins - T150148 (duration: 00m 47s)
  • 14:09 moritzm: rebooting chromium for kernel update
  • 13:58 hashar: European SWAT on hold while some memcached/elasticsearch issues are being figured out
  • 13:42 apergos: deferring reboots of mw1204-1216 and mw1170-1188 for a while
  • 13:06 moritzm: rolling reboot of restbase-test for kernel update
  • 12:52 ema: upgrading pinkunicorn to varnish 4.1.3-1wm3 T150247
  • 12:41 moritzm: restarted ntp on mw1194, stuck in XFAC state
  • 12:39 moritzm: rebooting iron for kernel update
  • 12:28 moritzm: restarted ntp on mw1166, stuck in XFAC state
  • 12:24 moritzm: depooling/rebooting/repooling scb1002 for kernel update
  • 12:23 moritzm: restarted ntp on hafnium, stuck in XFAC state
  • 12:12 moritzm: depooling/rebooting/repooling scb1001 for kernel update
  • 12:11 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool back db1051 and api servers to high load after hw issues (duration: 02m 45s)
  • 11:59 apergos: rolling restart of mw1170-1216 for new kernel
  • 11:54 apergos: restart of mw1240, 1253 for new kernel
  • 11:44 moritzm: rebooting logstash1001 for kernel update
  • 11:42 moritzm: rearmed keyholder on mira
  • 11:38 apergos: rolling reboot of mw1161, mw1163-1169 for new kernel
  • 11:37 jynus: running schema change on db1045 (pagelinks) T139090
  • 11:30 moritzm: rebooting mira for kernel update
  • 11:19 apergos: rolling restart of mw2080-2085 for new kernel
  • 11:15 jynus: running schema change on db2070 (pagelinks) T139090
  • 11:02 mark: Activated cr2-eqiad bgp group IX4
  • 10:53 apergos: rolling reboot of mw2090 - mw2096 for new kernel
  • 10:41 moritzm: rearmed keyholder on tin
  • 10:39 apergos: rebooting mw2086 - mw2089 for new kernel
  • 10:36 jynus: rebooting and upgrading db2012
  • 10:33 moritzm: rebooting tin for kernel update
  • 10:23 moritzm: restarted ntp on mw2075, stuck in XFAC state
  • 10:22 moritzm: rebooting hafnium for kernel update
  • 10:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1059 - T149079 T147305 (duration: 00m 57s)
  • 10:06 moritzm: rebooting rhenium for kernel update
  • 09:59 moritzm: rebooting oxygen for kernel update
  • 09:57 apergos: rebooting mw2075 - mw2079 for new kernel
  • 09:49 moritzm: rebooting install2001 for kernel update
  • 09:42 moritzm: rebooting bast2001 for kernel update
  • 09:24 moritzm: rebooting graphite1002 for kernel update
  • 08:44 marostegui: Deploy schema change s4 commonswiki.revision table - T147305
  • 08:21 moritzm: rolling reboot of swift backend servers in esams for kernel update
  • 08:09 moritzm: rolling reboot of parsoid in eqiad for kernel update
  • 08:04 elukey: rebooting stat1001 for kernel upgrades (will cause a brief unavail for analytics websites)
  • 07:30 marostegui: Deploy schema change s5 dewiki.revision on codfw master (db2023) - T148967
  • 07:10 _joe_: stopped logstash, removed large logfiles that were erroneously non-rotated, started logstash across the logstash cluster
  • 02:31 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Nov 8 02:31:30 UTC 2016 (duration 4m 16s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.1) (duration: 09m 52s)
  • 01:48 eileen1: update CiviCRM from 45a1b9a to 63d35fc
  • 01:27 eileen1: enabled GlobalCollect Recurring Donations, Dedupe CiviCRM Contacts, GlobalCollect audit file download, Fundraiser Public Data Export
  • 01:23 eileen1: enabled Thank you mail send
  • 01:17 eileen1: disabled Public Data Export
  • 01:06 eileen1: disable GlobalCollect audit file
  • 01:05 eileen1: disable jobs http://localhost:9000/job/Dedupe%20CiviCRM%20contacts/ http://localhost:9000/job/Dedupe%20CiviCRM%20contacts%20(name-match)/ Thank you mail send
  • 01:04 eileen1: disable jobs GlobalCollect Recurring Donations
  • 00:51 godog: swift eqiad-prod: set weight for ms-be1021 sd[h-n] to 3000 - T139767
  • 00:34 dereckson@tin: Synchronized wmf-config/throttle.php: Nashville Science edit-a-thon (Vanderbilt library) (T150207) (duration: 00m 47s)
  • 00:12 dereckson@tin: Synchronized php-1.29.0-wmf.1/extensions/CentralNotice/: Handle banner loader errors on client (T149107) (duration: 00m 49s)
  • 00:03 Dereckson: projectcom.wikimedia.org wiki creation done

2016-11-07

  • 23:55 Dereckson: Created 'Mjohnson (WMF)' user account on projectcom.wikimedia.org as bureaucrat
  • 23:55 godog: delete parsoid from releases.wikimedia.org and varnish-ban on cache_misc
  • 23:50 Krinkle: mwscript deleteEqualMessages.php --wiki jawikinews (T45917)
  • 23:44 Dereckson: Created storage container for projectcomwiki (private)
  • 23:43 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for projectcom.wikimedia.org (duration: 00m 53s)
  • 23:42 dereckson@tin: rebuilt wikiversions.php and synchronized wikiversions files: Added projectcomwiki
  • 23:41 dereckson@tin: Synchronized dblists/: Added projectcomwiki (duration: 00m 48s)
  • 23:35 Dereckson: projectcomwiki database created
  • 23:32 Krinkle: mwscript deleteEqualMessages.php --wiki jawikibooks (T45917)
  • 23:22 Dereckson: Starting projectcom.wikimedia.org wiki creation
  • 23:22 Dereckson: ec.wikimedia.org wiki creation done
  • 23:19 Dereckson: Created storage container for ec.wikimedia (private)
  • 23:16 dereckson@tin: Synchronized wmf-config/interwiki.php: Update interwiki map for vote. and ec.wikimedia (Gerrit:320308) (duration: 00m 47s)
  • 23:15 Krinkle: mwscript deleteEqualMessages.php --wiki gawiktionary (T45917)
  • 22:51 reedy@tin: Synchronized php-1.29.0-wmf.1/includes/specials/: Deploy security fix T150044 (duration: 00m 54s)
  • 22:44 reedy@tin: Synchronized wmf-config/CommonSettings.php: Set wgOATHAuthAccountPrefix and Don't override message key in badpass log entries (duration: 00m 47s)
  • 22:39 mutante: A new wiki is born. Welcome, Wikimedia Ecuador user group. https://ec.wikimedia.org (T135521)
  • 22:35 dereckson@tin: Synchronized multiversion/MWMultiVersion.php: Add ec.wikimedia to MWMultiVersion (T135521) (duration: 00m 49s)
  • 22:25 Dereckson: Created tables for OATHAuth on ec.wikimedia
  • 22:22 dereckson@tin: Synchronized static/images/project-logos/: Logos for ec.wikimedia (T135521) (duration: 00m 48s)
  • 22:20 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: ec.wikimedia initial configuration (T135521) (duration: 00m 47s)
  • 22:18 mutante: gallium - stopped apache, stopped salt, removed zuul cronjob
  • 22:18 dereckson@tin: rebuilt wikiversions.php and synchronized wikiversions files: (no message)
  • 22:17 dereckson@tin: Synchronized dblists: (no message) (duration: 00m 53s)
  • 22:15 mutante: gallium - delete salt key, minion is stopped
  • 22:10 mutante: gallium - revoke puppet cert, deactivate node
  • 22:03 Dereckson: Starting ec.wikimedia.org wiki creation
  • 21:50 gehel: deploying latest wdqs gui and blazegraph
  • 21:38 bearND: deployed mobileapps 4202cbb
  • 21:37 Amir1: ores deployment c61b9c1 is done
  • 21:30 bearND: starting mobileapps deploy
  • 21:29 arlolra: updated Parsoid to version 2c2fe425
  • 21:18 arlolra: starting Parsoid deploy
  • 21:12 Amir1: deploying c61b9c1 from ORES to all nodes (T149730)
  • 21:09 Amir1: deploying c61b9c1 from ORES into canary nodes (T149730)
  • 20:44 urandom: T133395: restbase2001-b.codfw.wmnet: Performing user-defined compaction of la-169239-big-Data.db and la-172629-big-Data.db
  • 20:38 jynus: upgrading new labsdbs to mariadb 10.1.19
  • 20:32 bblack: repooling cp4018 (done experimenting)
  • 20:21 mutante: projectcom.wikimedia.org created in DNS (T143138)
  • 20:14 godog: cmjohnson1 is performing work on LVS in row D, there might be flaps
  • 19:58 ejegg: updated payments-wiki from ed98772 to c1fa73c
  • 19:42 ejegg: updated civicrm from bdc2786 to 45a1b9a
  • 18:00 urandom: T133395: Convert local_group_*_title__revisions.{data,idx_by_rev_ever} tables to time-window compaction
  • 17:56 bd808@tin: Synchronized php-1.29.0-wmf.1/includes/exception/MWExceptionHandler.php: MWExceptionHandler: Do not use 'exception' for custom log data (T150106) (duration: 00m 47s)
  • 17:40 jynus: performing schema change on s7 (imagelinks) T139090
  • 16:01 mark: Reactivated cr2-eqiad IX6 BGP group (ipv6 sessions)
  • 16:00 mark: Chris moved cr2-eqiad:xe-5/3/3 to xe-3/3/3
  • 15:54 mark: Disabling cr2-eqiad BGP groups IX4/IX6 (all Equinix Ashburn BGP sessions)
  • 15:44 elukey: started kafka-mirror-main-eqiad_to_analytics.service on kafka1012
  • 15:39 moritzm: rebooting radium for kernel update
  • 15:38 mark: Reenabling OSPF/OSPF3 on cr2-codfw:xe-5/0/1 after eqiad side port move to xe-3/2/3
  • 15:31 mark: Disabling OSPF/OSPF3 on cr2-codfw:xe-5/0/1 for eqiad side port move
  • 15:26 elukey: rebooting kafka1013 for kernel upgrades
  • 15:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2042 - T149553 (duration: 00m 49s)
  • 15:19 hashar: Restarting Jenkins (deadlock in beta cluster Jenkins jobs)
  • 15:14 mark: Reactivate cr2-eqiad BGP peering with pfw1-eqiad
  • 15:13 mark: Chris moved cr2-eqiad:xe-5/0/3 to xe-3/3/2
  • 15:10 hashar: European SWAT completed
  • 15:09 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: shortUrl for bdwikimedia and tcywiki T146014 and T150166 (duration: 01m 51s)
  • 15:08 mark: Deactivate cr2-eqiad BGP peering with pfw1-eqiad
  • 15:07 marostegui: Enabling gtid_domain_id db1020 (m2 master) - T149418
  • 15:07 mark: Reactivate cr1-eqiad BGP peering with pfw1-eqiad
  • 15:05 mark: Chris moved cr1-eqiad:xe-5/0/3 to xe-3/3/2
  • 15:03 hashar: T146014 mwscript extensions/ShortUrl/populateShortUrlTable.php --wiki=bdwikimedia (714 titles done)
  • 15:02 hashar: T150166 mwscript extensions/ShortUrl/populateShortUrlTable.php --wiki=tcywiki (1569 titles done)
  • 15:02 moritzm: rebooting mw1261-mw1265 (canary app servers) for kernel update
  • 15:01 hashar: T146014 mwscript sql.php --wiki=bdwikimedia /srv/mediawiki/php-1.29.0-wmf.1/extensions/ShortUrl/schemas/shorturls.sql
  • 15:01 hashar: T150166 mwscript sql.php --wiki=tcywiki /srv/mediawiki/php-1.29.0-wmf.1/extensions/ShortUrl/schemas/shorturls.sql
  • 15:00 mark: Deactivate cr1-eqiad BGP peering with pfw1-eqiad
  • 14:49 hashar: terbium: scap pull to add shortUrl tables to bdwikimedia and tcywiki
  • 14:42 hashar: fawiki: renaming user group 'autopatrol' to 'autopatrolled' for T139246 and T144699 with: mwscript migrateUserGroup.php --wiki=fawiki 'autopatrol' 'autopatrolled'
  • 14:42 hashar: fawiki Done! 417 users in group 'autopatrol' are now in 'autopatrolled' instead.
  • 14:40 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Rename 'autopatrol' to 'autopatrolled' on fawiki - T144699 T139246 (duration: 00m 47s)
  • 14:33 gehel: reboot maps-test* for kernel upgrade
  • 14:30 hashar@tin: Synchronized wmf-config: (no message) (duration: 00m 53s)
  • 14:10 hashar@tin: Synchronized php-1.29.0-wmf.1/extensions/Kartographer/extension.json: Fix monobook <maplink> (missing debounce dep) T145521 (duration: 00m 47s)
  • 13:56 gehel: reboot wdqs1* for kernel upgrade
  • 13:52 bblack: depooling cp4018 nginx+varnish-fe services for debugging
  • 13:36 gehel: reboot wdqs2* for kernel upgrade
  • 13:34 hashar: Flushed nodepool instances. It is bringing up fresh ones now.
  • 13:26 moritzm: rebooting labnodepool1001 for kernel update
  • 13:19 hashar: shutting down Nodepool (labnodepool1001.eqiad.wmnet reboot)
  • 13:06 moritzm: rebooting scandium for kernel update
  • 12:09 jynus: performing schema change on s6 (imagelinks) T139090
  • 12:00 moritzm: rebooting wtp1001 for kernel update
  • 11:40 ema: cp3043: repool varnish-be and varnish-be-rand (T149881)
  • 11:33 moritzm: rebooting cassandra test hosts (cerium, praseodymium, xenon) for kernel update
  • 10:49 moritzm: rebooting mw1017/mw1099 for kernel update
  • 10:26 moritzm: rebooting cp1008 for kernel update
  • 10:19 moritzm: rebooting bast4001 for kernel update
  • 10:07 jynus: performing schema change on s5 (imagelinks) T139090
  • 08:46 moritzm: uploaded linux-meta 1.11 to carbon (pointing to the new Linux ABI package)
  • 08:44 marostegui: stopping mysql on db2042 - maintenance- T149553
  • 08:39 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2042 for maintenance - T149553 (duration: 00m 50s)
  • 08:30 marostegui: Deploy schema change on s4 master (db2019) commonswiki.revision - T147305
  • 07:02 _joe_: removing old logfiles on logstash hosts
  • 02:21 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Nov 7 02:21:02 UTC 2016 (duration 4m 18s)
  • 02:16 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.1) (duration: 05m 39s)

2016-11-06

  • 22:13 Dereckson: Run namespacesDupe maintenance script on gl.wikisource (T150143)
  • 10:13 elukey: removing logstash.log.1 from logstash100[123] to free some space
  • 02:22 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Nov 6 02:22:41 UTC 2016 (duration 4m 30s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.1) (duration: 06m 03s)

2016-11-05

  • 21:40 bd808: Deleted huge logstash1003:/var/log/logstash/logstash.log.1 log file; disk full
  • 21:39 bd808: Deleted huge logstash1002:/var/log/logstash/logstash.log.1 log file; disk full
  • 21:36 bd808@tin: Synchronized wmf-config/InitialiseSettings.php: logstash: Temporarily disable EventBus channel (T150106) (duration: 00m 50s)
  • 19:54 bd808: ELK stack problems are related to Elasticsearch index mapping. Some events are being rejected for not matching the expected mappings and that is filling up the disk on the logstash ingestion hosts
  • 19:45 bd808: Forced several puppet runs on logstash1001 until things stopped changing; out of disk seemed to have messed up apt upgrades
  • 19:38 bd808: Elasticsearch on logstash1001 won't restart due to missing /etc/elasticsearch/scripts directory
  • 19:23 bd808: Restarted logstash on logstash1001
  • 19:14 bd808: Deleted huge logstash1001:/var/log/logstash/logstash.log.1 log file; disk full and difficult to debug with no free space on /
  • 02:21 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Nov 5 02:21:37 UTC 2016 (duration 4m 36s)
  • 02:17 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.1) (duration: 05m 25s)

2016-11-04

  • 22:43 godog: stop puppet on einsteinium and tegment to avoid log spam - T150061
  • 21:14 urandom: T133395: Starting user-defined compaction of local_group_wikipedia_T_parsoid_html.data, files la-169018-big-Data.db and la-171488-big-Data.db
  • 21:06 godog: compress huge daemon.log on einsteinium into /srv/
  • 18:11 moritzm: uploaded new jessie linux package based on 4.4.30 to carbon
  • 18:01 paravoid: moving mc1033-mc1036 from asw-d-eqiad to asw2-d-eqiad
  • 17:54 paravoid: reactivating cr1-eqiad:ae4 and its subinterfaces (VRRP bug seems to have been worked around)
  • 17:44 paravoid: moved cr1-eqiad:ae4 links from asw-d-eqiad:ae1 to asw2-d-eqiad:ae1
  • 16:38 ema: upgrading cp4018 (text-ulsfo) to varnish 4 -- T131503
  • 16:22 ema: upgrading cp4017 (text-ulsfo) to varnish 4 -- T131503
  • 16:01 ema: upgrading cp4016 (text-ulsfo) to varnish 4 -- T131503
  • 15:37 ema: upgrading cp4010 (text-ulsfo) to varnish 4 -- T131503
  • 15:36 paravoid: set up 4x10G (ae0) links between asw-d-eqiad<->asw2-d-eqiad
  • 15:35 marostegui: reimage dbstore2002 - T150017
  • 15:20 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Remove wikitech bot group (duration: 00m 47s)
  • 15:17 reedy@tin: Synchronized wmf-config/CommonSettings.php: Simplify some wikitech config (duration: 00m 47s)
  • 15:16 reedy@tin: Synchronized wmf-config/wikitech.php: Stop double loading OATHAuth now, remove commented config (duration: 00m 47s)
  • 15:15 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Normalise wikitech OATHAuth loading config (duration: 00m 48s)
  • 15:06 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable OATHAuth on all private wikis (duration: 00m 49s)
  • 15:04 reedy@tin: Synchronized wmf-config/CommonSettings.php: Raise password requirements for private wikis, Abuse filter editors on enwiki, and set minimum bot password length to 8 (duration: 00m 47s)
  • 15:02 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: stage wmgElevateDefaultPasswordPolicy (duration: 00m 48s)
  • 14:49 ema: upgrading cp4009 (text-ulsfo) to varnish 4 -- T131503
  • 14:14 ema: upgrading cp4008 (text-ulsfo) to varnish 4 -- T131503
  • 11:33 mobrovac: restarting zotero
  • 10:58 moritzm: installing tar security updates
  • 10:21 moritzm: upgrading memcached on swift frontend servers in esams
  • 10:00 jynus: stopping db2011 for backup and reimage
  • 09:59 moritzm: upgrading memcached on swift frontend servers in codfw
  • 09:54 moritzm: upgrading memcached on jessie graphite systems
  • 09:26 _joe_: rebooting copper to allow enabling the memory cgroup
  • 09:10 marostegui: Reimage db2034 - T149553
  • 07:20 jynus: disabling alerting for slave lag fleet-wide for 1 hour to deploy new alerting script
  • 06:52 _joe_: manually restarted varnish text-backend on cp3041 - automatic restarts failing with "no space left on device"
  • 02:31 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Nov 4 02:31:26 UTC 2016 (duration 4m 39s)
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.1) (duration: 09m 08s)
  • 01:54 madhuvishy: Manually reimaging labstore2003 (T149870)
  • 01:35 catrope@tin: Synchronized php-1.29.0-wmf.1/extensions/Thanks: Avoid breakage after Flow uninstallation (duration: 00m 47s)
  • 01:18 catrope@terbium: scap failed: IOError [Errno 13] Permission denied: u'/srv/mediawiki-staging/wmf-config/ExtensionMessages-1.29.0-wmf.1.php' (duration: 00m 20s)
  • 01:17 catrope@terbium: Started scap: (no message)
  • 00:48 catrope@tin: Synchronized dblists/: Disable Flow on enwiki (T148611) (duration: 01m 04s)

2016-11-03

  • 23:23 thcipriani@tin: Synchronized php-1.29.0-wmf.1/extensions/EventBus/EventBus.php: SWAT: Add logging and check for empty JSON encoded body (T148251) (duration: 00m 47s)
  • 23:11 thcipriani@tin: Synchronized portals: SWAT: Bumping portals to master (T146807) (duration: 00m 52s)
  • 23:10 thcipriani@tin: Synchronized portals/prod/wikipedia.org/assets: SWAT: Bumping portals to master (T146807) (duration: 00m 51s)
  • 22:16 ejegg: updated CiviCRM from 4ff64af to bdc2786
  • 20:33 urandom: T133395: Enabling unchecked_tombstone_compaction and setting tombstone_threshold = .6 on "local_group_wikipedia_T_parsoid_html".data
  • 20:26 bblack: codfw cache_text - all nodes v4 and pooled - T131503
  • 20:10 bblack: codfw cache_text - all pooled nodes are v4 (2x still depooled-but-upgraded) - T131503
  • 20:04 ejegg: enabled donations queue consumer
  • 20:02 ejegg: disabled donations queue consumer
  • 20:01 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.1
  • 19:32 ejegg: updated CiviCRM from 954bab4 to 4ff64af
  • 19:31 mobrovac: change-prop deploying f107669
  • 19:28 twentyafterfour@tin: Synchronized php-1.29.0-wmf.1/extensions/PageImages/includes/ApiQueryPageImages.php: T149849 (duration: 00m 47s)
  • 19:23 mobrovac: restbase restarting to re-include wikidata domains for T149114
  • 19:17 Krenair: <twentyafterfour> !log In order to unblock the train for group2: deploying https://gerrit.wikimedia.org/r/#/c/319643/ refs T149059, T149849
  • 18:46 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix comment refs (T148327) (duration: 00m 47s)
  • 18:39 thcipriani@tin: Synchronized wmf-config: SWAT: LABS: Enable Map (GeoJSON) data on Commons (T149548) (housekeeping only sync) (duration: 00m 50s)
  • 18:33 thcipriani@tin: Synchronized php-1.29.0-wmf.1/extensions/Kartographer/includes/ApiQueryMapData.php: SWAT: Fix warning (T149923) (duration: 00m 47s)
  • 18:26 ema: repooling cp2016 (T131503)
  • 18:21 thcipriani@tin: Synchronized php-1.28.0-wmf.23/extensions/EventBus/EventBus.php: SWAT: Log more EventBus HTTP request/response context for HTTP errors (T148251) (duration: 00m 52s)
  • 18:19 thcipriani@tin: Synchronized php-1.29.0-wmf.1/extensions/EventBus/EventBus.php: SWAT: Log more EventBus HTTP request/response context for HTTP errors (T148251) (duration: 00m 49s)
  • 18:02 hashar: Added security rule for "puppet3-diffs" labs project to allow ssh connection from contint1001 instead of gallium
  • 16:20 jynus: stopping and upgrading labsdb1009,10,11 (also disabling temporarily puppet)
  • 16:16 papaul: OS install on labstore2001
  • 16:13 mutante: mw1205 - service hhvm restart
  • 15:59 jynus@tin: Synchronized wmf-config/db-eqiad.php: mariadb: Reduce db1051 load, it has hardware issues (duration: 00m 47s)
  • 15:41 moritzm: installing memcached security updates on graphite hosts
  • 15:39 mobrovac: scb in eqiad enabled puppet back
  • 15:26 mobrovac: scb in eqiad disabling puppet
  • 14:57 akosiaris: failover icinga from tegmen to einsteinium
  • 14:17 moritzm: exim reenabled on fermium after mailman update
  • 14:14 moritzm: temporarily stop exim on fermium for mailman update
  • 14:08 bblack: cache_text: nginx lossless restarts for libssl update - T144626 - T148917
  • 14:00 mobrovac: restbase rolling restart for T149114
  • 13:56 bblack: cache_upload: nginx lossless restarts for libssl update - T144626 - T148917
  • 13:49 bblack: cache_maps + cache_misc: nginx lossless restarts for libssl update - T144626 - T148917
  • 13:42 bblack: cp*: upgrade libssl1.1 to 1.1.0b-1+wmf2 (but no nginx restart yet) - T144626 - T148917
  • 13:36 bblack: cp1065: upgrade libssl1.1 to 1.1.0b-1+wmf2 - T144626 - T148917
  • 13:31 mobrovac: change-prop deploying a1bd739
  • 13:06 hashar@tin: Synchronized wmf-config/CommonSettings.php: Add wmgRevisionSliderBetaFeature (default true) (duration: 00m 46s)
  • 13:05 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add wmgRevisionSliderBetaFeature (default true) (duration: 00m 47s)
  • 13:04 paravoid: re-enabling cr2-esams:xe-0/1/3 + cr2-eqiad:xe-4/1/3 (esams-eqiad link)
  • 12:09 marostegui: Deploying schema change s4 commonswiki.revision only codfw - https://phabricator.wikimedia.org/T147305
  • 11:54 mobrovac: change-prop deploying 15eae87
  • 11:52 mobrovac: restbase deploy end of 1ec3b129
  • 11:48 akosiaris: reenable puppet across the fleet on hosts that I had disabled it. https://gerrit.wikimedia.org/r/#/c/316032/1 merged successfully
  • 11:32 mobrovac: restbase deploy start of 1ec3b129
  • 11:16 ema: depooling cp2016, cp2007, cp2019, cp2023: not caching properly (T131503)
  • 10:55 akosiaris: disable puppet throughout the fleet. merging https://gerrit.wikimedia.org/r/#/c/316032/1
  • 09:37 moritzm: uploaded openssl 1.1.0b-1+wmf2 for jessie-wikimedia to apt.wikimedia.org (adding the read_ahead bugfix and dropping the chapoly_pref patch)
  • 09:36 marostegui: Deploy schema change s5 dewiki.revision - only codfw https://phabricator.wikimedia.org/T148967
  • 09:24 ema: upgrading cp2016 (text-codfw) to varnish 4 -- T131503
  • 09:17 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2003.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=apertium'])
  • 09:17 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2004.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=apertium'])
  • 09:16 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=apertium'])
  • 09:16 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb1003.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=apertium'])
  • 09:00 hashar: contint1001: preliminary transfer of jenkins history from gallium using rsync
  • 08:51 hashar: gallium: unmounted /var/lib/jenkins/tmpfs, an artifact from the past, freeing up 512MBytes of memory
  • 08:47 ema: upgrading cp2007 (text-codfw) to varnish 4 -- T131503
  • 08:01 ema: upgrading cp2019 (text-codfw) to varnish 4 -- T131503
  • 07:56 jynus: restarting replication codfw -> eqiad on s1
  • 07:54 ema: repool cp2019 varnish-be, currently depooled for no valid reason
  • 07:51 jynus: stopping mysql on db1042
  • 07:42 ema: upgrading cp2023 (text-codfw) to varnish 4 -- T131503
  • 07:38 _joe_: rolling restart of pybal in esams
  • 07:30 _joe_: restarting pybal on lvs2005
  • 07:18 jynus: stopping and debugging db1073
  • 06:39 yuvipanda: attempting manual re-image of labstore2004
  • 06:28 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove references to db1042 (duration: 00m 46s)
  • 06:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1052; Depool db1073; Remove references to db1042 (duration: 00m 47s)
  • 03:44 Krenair: wikitech-static package updates
  • 03:05 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Nov 3 03:05:05 UTC 2016 (duration 5m 36s)
  • 02:59 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.1) (duration: 11m 28s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.28.0-wmf.23) (duration: 12m 08s)
  • 00:28 dereckson@tin: Synchronized php-1.29.0-wmf.1/resources/src/mediawiki.widgets/mw.widgets.TitleWidget.js: Follow-up Id0021594: Remove extra code for redlink suggestions (T149130) (duration: 00m 46s)
  • 00:23 dereckson@tin: Synchronized php-1.29.0-wmf.1/extensions/Kartographer/styles/: Set font size to 14px for both static and interactive maps (T149860) (duration: 00m 47s)

2016-11-02

  • 23:44 bblack: disabling port xe-4/1/3 on cr2-eqiad (wave to esams, level3, other side of earlier disable)
  • 23:37 bblack: disabling port xe-0/1/3 on cr2-esams (wave to eqiad, level3)
  • 22:45 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/319447/ (duration: 00m 47s)
  • 22:10 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/319415/ (duration: 00m 47s)
  • 21:52 ejegg: updated fundraising tools from f83e392 to 7ff719a
  • 21:46 hashar: Restarting Jenkins due to deadlock with the beta cluster jobs
  • 21:41 maxsem@tin: Synchronized php-1.29.0-wmf.1/extensions/Kartographer/: https://gerrit.wikimedia.org/r/#/c/319411/1 (duration: 00m 49s)
  • 21:35 bblack: pooling cp4016 - T149843
  • 21:09 bblack: pooling cp1055 - T149843
  • 21:09 twentyafterfour@tin: Synchronized php-1.29.0-wmf.1/includes/parser/CoreParserFunctions.php: Deploy https://gerrit.wikimedia.org/r/#/c/319400/ refs T149840, T149059 (duration: 00m 51s)
  • 20:49 mdholloway: deployed mobileapps 0ced96c
  • 20:46 mdholloway: starting mobileapps deployment
  • 20:35 bblack: depool cp1055 - T149843
  • 20:34 bblack: depool cp4016 - T149843
  • 20:27 arlolra: updated Parsoid to version 173d7e32 (T149241, T119228, T141723, T141905, T147742, T48580, T133320)
  • 20:09 arlolra: starting Parsoid deploy
  • 20:04 chasemp: maintain-views --databases tcywiki --debug on labsdb1001 and 1003
  • 20:03 chasemp: maintain-views --databases wikimania2017wiki --debug on labsdb1001 and 1003
  • 19:51 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.1
  • 19:47 chasemp: maintain-views --databases olowiki on labsdb1001 and 1003 to create view
  • 19:16 XenoRyet: updated DonationInterface from e86f23a to ed98772
  • 19:08 paravoid: rebooting asw2-d-eqiad
  • 18:56 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: MF Beta: Do not move first paragraph before infobox (T145216) (duration: 00m 49s)
  • 18:31 thcipriani@tin: Synchronized php-1.28.0-wmf.23/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: SWAT: Turn off Cirrus AB test on zh and ja (T147499) (duration: 00m 46s)
  • 18:29 thcipriani@tin: Synchronized php-1.29.0-wmf.1/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: SWAT: Turn off Cirrus AB test on zh and ja (T147499) (duration: 00m 47s)
  • 18:22 andrewbogott: rebooting labvirt1013
  • 18:17 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add maiwiki HD logos (T149790) PART II (duration: 00m 47s)
  • 18:15 thcipriani@tin: Synchronized static/images/project-logos: SWAT: Add maiwiki HD logos (T149790) PART I (duration: 00m 47s)
  • 18:11 bd808: Sending final Tool Labs survey reminder emails from silver (T147336)
  • 18:09 thcipriani@tin: Synchronized portals: SWAT: Updating wikipedia.org portal (T128546) (T135441) (duration: 00m 47s)
  • 18:09 thcipriani@tin: Synchronized portals/prod/wikipedia.org/assets: SWAT: Updating wikipedia.org portal (T128546) (T135441) (duration: 00m 47s)
  • 18:02 andrewbogott: rebooting labvirt1012
  • 17:52 gehel: deploying new GUI on wdqs
  • 17:32 godog: clean syslog/daemon.log on lithium, spam from mtail
  • 16:51 chasemp: copy labstore2001 tools backup to 2003 and others backup to 2004 for emergency maint
  • 16:08 _joe_: banning dbtree.wikimedia.org on cache_misc, T149357
  • 15:48 jynus: restarting db1069 to apply latest wiki list configuration
  • 15:48 volans: re-armed keyholder after its upgrade on tin, mira and their deployment-prep equivalents
  • 15:15 moritzm: rebooting sodium for kernel update
  • 15:01 marostegui: Deploy schema change s5 dewiki.revision - only codfw T148967
  • 15:01 moritzm: rebooting wasat for kernel update
  • 14:36 moritzm: installing django security updates on Ubuntu servers
  • 14:13 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: MF Beta: Enable moving first paragraph before infobox - T145216 (duration: 00m 47s)
  • 14:12 moritzm: rebooting labvirt1014 for kernel update
  • 13:58 hashar: European SWAT complete
  • 13:56 hashar@tin: Finished scap: ZeroBanner / ZeroPortal extensions.json fix (duration: 22m 01s)
  • 13:34 hashar@tin: Started scap: ZeroBanner / ZeroPortal extensions.json fix
  • 13:27 hashar@tin: Synchronized php-1.29.0-wmf.1/resources/src/mediawiki/page/gallery.css:
  • 12:32 reedy@tin: Synchronized php-1.29.0-wmf.1/includes/EditPage.php: Fix regression from 1.28.0-wmf.23 T149473 (duration: 00m 47s)
  • 12:11 hoo: Updated Wikidata's property suggester with data from Monday's json dump and applied the T132839 workarounds
  • 10:51 moritzm: installing mailman security update
  • 10:35 mobrovac: scb starting back CP and re-enabling puppet
  • 10:33 moritzm: rolling restart of cassandra on restbase in eqiad completed
  • 08:43 marostegui: Stopping mysql dbstore2002 for maintenance - T149457
  • 08:37 moritzm: rolling restart of cassandra on restbase in eqiad to pick up new Java security updates
  • 08:34 mobrovac: scb100x stopping puppet and CP for Cassandra restarts
  • 08:32 elukey: restarted cassandra-metrics-collector on aqs100[456] for jvm upgrades
  • 08:10 mobrovac: change-prop deploying a28f9ba
  • 07:19 marostegui: Stopping MySQL db2011 for maintenance - T149099
  • 06:00 mutante: re-enable puppet on bromine after gerrit 319268
  • 02:50 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Nov 2 02:50:27 UTC 2016 (duration 5m 26s)
  • 02:45 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.1) (duration: 10m 43s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.28.0-wmf.23) (duration: 05m 43s)
  • 01:46 ejegg: enabled CiviCRM queue consumers and dedupe jobs
  • 01:06 ejegg: disabled ingenico audit parser
  • 00:48 ejegg: updated CiviCRM from 38a6c26 to 954bab4
  • 00:20 eileen1: enable recurring global collect job - query now dead, 26 contributions run but not in civi
  • 00:13 eileen1: kill recurring global collect job for now (while query dies on server)

2016-11-01

  • 23:25 ejegg: disabled donations queue consumer and dedupe for drush updb
  • 23:22 ejegg: updated CiviCRM from 56eadab to 38a6c26
  • 21:55 andrewbogott: rebooting labvirt1013
  • 21:34 andrewbogott: rebooting labvirt1012
  • 21:16 andrewbogott: rebooting labvirt1011
  • 20:59 mutante: stop/start eventlogging on eventlog1001 (after adding IPv6 address appeared to make it stop and removing it again)
  • 20:58 andrewbogott: rebooting labvirt1010
  • 20:33 andrewbogott: rebooting labvirt1005 and labvirt1009
  • 20:09 andrewbogott: rebooting labvirt1008
  • 20:09 andrewbogott: rebooting labvirt1006
  • 19:44 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to php-1.29.0-wmf.1
  • 19:37 thcipriani@tin: Finished scap: testwiki to 1.29.0-wmf.1 and rebuild l10n cache (duration: 27m 56s)
  • 19:22 andrewbogott: rebooting labvirt1004, labvirt1007
  • 19:09 thcipriani@tin: Started scap: testwiki to 1.29.0-wmf.1 and rebuild l10n cache
  • 18:48 jynus@tin: Synchronized wmf-config/db-eqiad.php: pool new enwiki api servers to 100% after initial warm-up (duration: 00m 49s)
  • 18:37 andrewbogott: rebooting labvirt1003
  • 18:27 jynus@tin: Synchronized wmf-config/db-eqiad.php: increase api resources for enwiki - high api load (duration: 00m 48s)
  • 18:16 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable wikilove on bn.wikisource (duration: 01m 44s)
  • 18:00 andrewbogott: rebooting labvirt1002
  • 17:59 godog: ban releases.wikimedia.org/debian from cache_misc to fetch Packages/Release again
  • 17:34 madhuvishy: Rebooting host labstore2001
  • 17:03 thcipriani: starting branch cut for 1.29.0-wmf.1
  • 15:59 godog: graphite-carbon restart after merging https://gerrit.wikimedia.org/r/#/c/316810/
  • 15:58 cmjohnson1: checking all serial cables to row D in eqiad.
  • 15:49 Reedy: Created wikilove tables on bnwikisource for T149683
  • 15:41 mark: Installed nmap on iron
  • 15:01 moritzm: upgrading/rolling restart of remaining wtp nodes in eqiad to nodejs 4.6
  • 14:50 ori: Local-hacking some JavaScript changes on mw1099 to debug T146510
  • 14:35 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb1003.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=cxserver'])
  • 14:08 Reedy: created OATHAuth tables on all private wikis
  • 13:53 Dereckson: scap pull @ tin to sync /srv/mediawiki locally
  • 13:32 Dereckson: sync-portal: Synchronized portals/, purged URLs (Gerrit:319054)
  • 13:27 Dereckson: Synchronized wmf-config/CommonSettings-labs.php: Labs: fix $wgJsonConfigInterwikiPrefix and set isLocal=false for tabular data (Gerrit:319024 + Gerrit:319036, no-op in prod) (duration: 00m 57s)
  • 12:47 gehel: rolling restart of cassandra on maps1* for jvm upgrade
  • 12:32 chasemp: mgmt powercycle of labstore1004
  • 12:05 moritzm: upgrading wtp1001 to nodejs 4.6
  • 12:01 bblack: upgrading cache_text nginx => 1.11.4-1+wmf13
  • 11:13 gehel: rolling restart of cassandra on maps2* for jvm upgrade
  • 11:04 gehel: rolling restart of cassandra on maps-test* for jvm upgrade
  • 10:54 akosiaris: rebooting einsteinium
  • 10:06 moritzm: installing openjdk security fixes on restbase2, rolling restart of cassandra
  • 02:30 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Nov 1 02:30:34 UTC 2016 (duration 4m 16s)
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.28.0-wmf.23) (duration: 09m 08s)

2016-10-31

  • 23:46 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: MF Beta: Do not move first paragraph before infobox (T145216) (T149389) (duration: 00m 46s)
  • 23:40 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Turn off revision number in graph img srv (duration: 00m 46s)
  • 23:35 thcipriani@tin: Synchronized wmf-config/CommonSettings-labs.php: SWAT: LABS: Enable tabular data lua support (T148745) (housekeeping sync) (duration: 00m 46s)
  • 23:32 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Removed unused wmgUseGraphWithNamespace support PART II (duration: 00m 45s)
  • 23:31 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Removed unused wmgUseGraphWithNamespace support PART I (duration: 00m 47s)
  • 23:26 ejegg: increased payments-wiki session timeout
  • 23:26 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create patroller usergroup for enwiki (T149019) (duration: 00m 46s)
  • 23:21 mutante: disabled puppet on bromine temp. issue with reprepro config for releases
  • 21:08 reedy@tin: Synchronized wmf-config/CommonSettings.php: Enable OATHAuth on officewiki (duration: 00m 48s)
  • 21:07 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable OATHAuth on officewiki (duration: 00m 47s)
  • 20:34 arlolra: updated Parsoid to version e503e801 (T149504)
  • 20:16 bblack: upgrading cache_maps to nginx-1.11.4-1+wmf13
  • 20:15 arlolra: starting Parsoid deploy
  • 19:17 elukey: restarted varnishkafka-webrequest on cp2018 and cp3045 (CRITICALs in icinga, librdkafka errors logged for kafka1018.eqiad.wmnet)
  • 19:12 yurik@tin: Synchronized wmf-config/InitialiseSettings.php: touch and sync - logs are flooded (duration: 00m 46s)
  • 18:55 yurik@tin: Synchronized wmf-config: labs syncup https://gerrit.wikimedia.org/r/#/c/318883 (duration: 00m 49s)
  • 17:59 ottomata: kafka preferred-replica-election for analytics-eqiad to promote kafka1018 as leader
  • 17:27 mark: Chris moved cr2-eqiad:xe-5/0/[0-2] and xe-5/1/2 to xe-3/1/[0-3]
  • 17:05 chasemp: reboot labstore1004
  • 17:00 ottomata: kafka preferred replica election on main-eqiad kafka cluster to promote kafka1003 as leader for its preferred partitions
  • 15:44 mark: Chris moved cr2-eqiad:xe-5/1/[0-3] to xe-3/1/[0-3]
  • 15:35 mark: Disabled ports cr2-eqiad:xe-5/1/[0-3] (row A-D uplinks)
  • 15:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2034 for maintenance - T149553 (duration: 00m 46s)
  • 15:21 Reedy: created oathauth_users table on officewiki T135889
  • 15:17 reedy@tin: Synchronized php-1.28.0-wmf.23/extensions/WikimediaMaintenance/createExtensionTables.php: Add OATHAuth (duration: 00m 46s)
  • 14:49 ottomata: adding kafka1003 in as replicas for active main-eqiad topics
  • 14:32 moritzm: rebooting labstore2004 for kernel update
  • 14:12 moritzm: rebooting labstore2003 for kernel update
  • 14:12 ottomata: adding kafka1003 as kafka broker in main-eqiad cluster
  • 14:00 Reedy: that deploy was "Show changes from last 14 days in watchlist in cswiki T148327"
  • 14:00 reedy@tin: Synchronized wmf-config/: (no message) (duration: 00m 50s)
  • 13:59 moritzm: powercycling labnet1002
  • 13:58 reedy@tin: Synchronized docroot/noc/: nocnocnoc (duration: 00m 45s)
  • 13:57 reedy@tin: Synchronized wmf-config/: Remove old ContactPage files (duration: 00m 47s)
  • 13:54 reedy@tin: Synchronized wmf-config/CommonSettings.php: Use MetaContactPages (duration: 00m 48s)
  • 13:52 moritzm: powercycling labcontrol1002
  • 13:52 reedy@tin: Synchronized wmf-config/MetaContactPages.php: Stage new file (duration: 00m 46s)
  • 13:46 reedy@tin: Synchronized wmf-config/throttle.php: Throttle rule for T149443 (duration: 00m 46s)
  • 13:39 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable newusermessage on kkwiki T149563 (duration: 00m 55s)
  • 13:35 moritzm: rebooting labstore2001 for kernel update
  • 13:31 moritzm: rebooting labstore1002 for kernel update
  • 13:06 moritzm: rebooting ganeti1001 for kernel update
  • 12:32 bblack: upgrading nginx to 1.11.4-1+wmf13 on cache_misc - T148917
  • 12:31 moritzm: migrating nodes from ganeti1001 for kernel reboot
  • 12:27 moritzm: failover ganeti1002 as new master in eqiad
  • 12:08 bblack: upgrading nginx to 1.11.4-1+wmf13 on cache_upload - T148917
  • 12:07 bblack: upgrading nginx to 1.11.4-1+wmf13 on cache_upload
  • 11:49 bblack: uploaded nginx-1.11.4-1+wmf13 to carbon jessie-wikimedia (logfile spam fixup)
  • 11:32 moritzm: updating parsoid in codfw to nodejs 4.6.0
  • 11:03 jmm@tin: Synchronized wmf-config/ProductionServices.php: Reenabled poolcounter1001 after maintenance (duration: 00m 45s)
  • 11:00 elukey: restarting cassandra on aqs100[456] for OpenJDK upgrades
  • 10:48 moritzm: rebooting poolcounter1001 for kernel update
  • 10:40 moritzm: temporarily disabled poolcounter1001 for maintenance
  • 10:40 jmm@tin: Synchronized wmf-config/ProductionServices.php: disabled poolcounter1001 for maintenance (duration: 00m 47s)
  • 10:08 _joe_: uploaded mcrouter 0.24.0-1 to jessie-wikimedia T132317
  • 08:17 moritzm: rebooting rdb2* for kernel update
  • 07:56 jynus: stopping replication on db1057 (s1-master) from codfw for codfw maintenance
  • 07:43 elukey: powercycled cp2010 (not reachable via ssh, com2 console showed a frozen screen)
  • 07:10 marostegui: Deploying schema change s1 enwiki codfw (db2016 - master) - T147166
  • 05:04 madhuvishy: Upgraded systemd on notebook1002 to 230-7~bpo8+2 from backports
  • 04:48 madhuvishy: Upgraded systemd notebook1001 to 230-7~bpo8+2 from backports
  • 02:59 yuvipanda: start reimaging notebook1001 for T149543
  • 02:20 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Oct 31 02:20:21 UTC 2016 (duration 4m 16s)
  • 02:16 l10nupdate@tin: scap sync-l10n completed (1.28.0-wmf.23) (duration: 05m 12s)

2016-10-30

  • 16:35 jynus: powercycle es2019 after crash T149526
  • 13:54 gehel: disabling completion suggester crons to leave place for terbium reboot
  • 02:32 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Oct 30 02:32:14 UTC 2016 (duration 4m 38s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.28.0-wmf.23) (duration: 09m 01s)

2016-10-29

  • 22:26 reedy@tin: Synchronized php-1.28.0-wmf.23/includes/EditPage.php: Fix for T149473 (duration: 00m 49s)
  • 14:25 bblack: nginx-1.11.4-1+wmf12 uploaded to carbon for jessie-wikimedia
  • 11:10 jynus: performing schema change on s4 (imagelinks) T139090
  • 08:06 apergos: reboot dataset1001 for new kernel
  • 02:30 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Oct 29 02:30:52 UTC 2016 (duration 4m 16s)
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.28.0-wmf.23) (duration: 09m 06s)

2016-10-28

  • 23:19 mutante: re-enabled puppet on phab2001 temp, ran puppet. removed 10.64.31.186/21 from eth0, stopped puppet again
  • 20:42 bd808: Sending Tool Labs survey reminder emails from silver (T147336)
  • 19:24 yurik: deployed kartotherian https://gerrit.wikimedia.org/r/#/c/318575/ - caching is still broken
  • 17:15 mutante: contint1001 - removed php5-* packages (https://puppet-compiler.wmflabs.org/4502/contint1001.wikimedia.org/)
  • 16:43 hasharAway: gallium contint1001: apt-get remove --purge doxygen graphviz
  • 15:44 chasemp: toolschecker seems to have come up wonky, restarting service
  • 15:23 hashar: Restarted nodepool
  • 15:09 andrewbogott: rebooting labservices1001
  • 15:02 andrewbogott: rebooting labnet1001
  • 14:56 moritzm: upgrading openjdk-8/cassandra restart on restbase staging hosts
  • 14:38 moritzm: various reboots of multatuli for systemd tests
  • 13:47 moritzm: uploaded firejail 0.9.44 to carbon
  • 13:42 hoo@tin: Synchronized wmf-config/Wikibase-labs.php: For consistency (duration: 00m 46s)
  • 12:42 jynus: restarting and upgrading labsdb1004
  • 10:46 moritzm: migrating nodes from ganeti1002 for kernel reboot (earlier entry was a typo)
  • 10:46 moritzm: migrating nodes from ganeti1003 for kernel reboot
  • 10:28 moritzm: migrating nodes from ganeti1003 for kernel reboot
  • 10:25 ema: upgrading python-varnishapi to v50.18 on all v4 cache hosts
  • 10:00 moritzm: migrating nodes from ganeti1004 for kernel reboot
  • 09:39 jynus: stopping slave on mariadb labsdb1005 for labsdb1004 reimporting
  • 09:24 hashar@tin: Synchronized /srv/mediawiki-staging/php-1.28.0-wmf.23/extensions/CirrusSearch: https://gerrit.wikimedia.org/r/#/c/318505/ for T149254 (fix log spam/fatal/warnings) (duration: 00m 56s)
  • 09:20 hashar: Pulling CirrusSearch patch https://gerrit.wikimedia.org/r/#/c/318505/ on mw1099 for T149254 (fix log spam/fatal/warnings)
  • 09:10 moritzm: rebooting pool counters in codfw for kernel update
  • 09:06 marostegui: Deploying schema change s1.enwiki - only codfw - T147166
  • 08:01 jynus: applying schema change (imagelinks) to s3 wikis T139090
  • 07:17 moritzm: installing PHP security updates on jessie
  • 02:32 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Oct 28 02:32:31 UTC 2016 (duration 5m 12s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.28.0-wmf.23) (duration: 09m 07s)
  • 01:32 eileen1: update civicrm from 31bea5b to 56eadab
  • 00:20 bd808: Testing logging to SAL via stashbot

2016-10-27

  • 23:40 logmsgbot: yurik@tin Synchronized php-1.28.0-wmf.23/extensions/Kartographer/modules/dialog/dialog.js: https://gerrit.wikimedia.org/r/#/c/318457/ (duration: 00m 47s)
  • 22:08 godog: "cassandra" graphite machines LV at 90% used, add 300G via lvresize
  • 21:55 yurik: deployed kartotherian
  • 21:42 yurik: about to deploy kartotherian update
  • 21:10 urandom: T133395: Altering mobileapps keyspaces to use time-windowed compaction
  • 19:27 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.23
  • 19:25 twentyafterfour: Bumping all wikis to 1.28.0-wmf.23 refs T147517
  • 19:18 ejegg: updated payments-wiki from df4c72d to e86f23a
  • 19:07 logmsgbot: maxsem@tin Finished scap: https://gerrit.wikimedia.org/r/#/c/318343/ (duration: 37m 38s)
  • 19:06 ejegg: updated SmashPig from f5f49d7 to 142e60b
  • 18:30 logmsgbot: maxsem@tin Started scap: https://gerrit.wikimedia.org/r/#/c/318343/
  • 18:26 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.23/includes/parser/Parser.php: Remove tracking category stuff that accidentally slipped into 61adc1e14 - T149310 (duration: 00m 46s)
  • 18:14 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: SWAT: Get rid of ip/IP tolerance for throttle rules (T131469) (duration: 00m 46s)
  • 18:09 gehel: wdqs deployment of latest GUI
  • 17:54 godog: upgrading prometheus to 1.2.1 in codfw/eqiad
  • 16:56 moritzm: uploaded gerrit 2.12.5 to apt.wikimedia.org
  • 16:43 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: scb1003.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=cxserver'])
  • 16:43 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: mw1241.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=appserver', 'service=apache2'])
  • 16:33 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=no; selector: mw1241.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=appserver', 'service=apache2'])
  • 16:25 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=no; selector: scb1003.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=cxserver'])
  • 16:18 hashar: Restarted Jenkins Gearman client due to a deadlock with the beta cluster jobs
  • 15:26 akosiaris: start confd on puppetmaster1001 and puppetmaster2001
  • 15:02 akosiaris: disable puppet across the fleet for puppetmaster kernel upgrades
  • 14:26 moritzm: migrating nodes from ganeti2001 for kernel reboot
  • 14:24 moritzm: restarted ntp on ganeti2006 (stuck in XFAC state)
  • 14:17 moritzm: failover of ganeti2002 to new master node in codfw
  • 13:59 moritzm: migrating nodes from ganeti2002 for kernel reboot
  • 13:42 hashar: European SWAT completed
  • 13:39 moritzm: migrating nodes from ganeti2003 for kernel reboot
  • 13:35 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.23/extensions/CirrusSearch/includes/HTMLCompletionProfileSettings.php: Fix comp suggest pref page (duration: 00m 48s)
  • 13:34 gehel: maps / postgres replication checks in error after deployment of https://gerrit.wikimedia.org/r/#/c/315271/ (T147194) - replication is working, only check is failing - icinga is silenced
  • 13:33 gehel: postgres replication checks in error after deployment of https://gerrit.wikimedia.org/r/#/c/315271/ (T147194) - replication is working, only check is failing - icinga is silenced
  • 13:32 moritzm: migrating nodes from ganeti2004 for kernel reboot
  • 13:17 moritzm: migrating nodes from ganeti2005 for kernel reboot
  • 13:10 logmsgbot: hashar@tin Synchronized wmf-config/throttle.php: T146600 T149200 (duration: 00m 53s)
  • 12:53 moritzm: uploaded openjdk-8 8u111 for jessie-wikimedia to apt.wikimedia.org
  • 12:36 moritzm: migrating nodes from ganeti2006 for kernel reboot
  • 12:35 gehel: disabling puppet on maps servers for deployment of https://gerrit.wikimedia.org/r/#/c/315271/
  • 12:20 moritzm: restarted ntp on conf1001 (stuck in XFAC state)
  • 12:04 gehel: restart elasticsearch on relforge to activate GC logs - T134853
  • 11:33 _joe_: stopping all cache-related services on esams spares cp3012-22
  • 10:30 moritzm: rebooting conf1003 for kernel update
  • 10:26 moritzm: rebooting conf1002 for kernel update
  • 10:20 moritzm: rebooting conf1001 for kernel update
  • 10:02 gehel: reboot maps eqiad cluster for kernel update
  • 09:54 moritzm: rolling reboot of zookeeper cluster in codfw for kernel update
  • 09:45 gehel: reboot maps codfw cluster for kernel update
  • 09:42 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: wtp2019.codfw.wmnet (tags: ['dc=codfw', 'cluster=parsoid', 'service=parsoid'])
  • 09:42 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=no; selector: wtp2019.codfw.wmnet (tags: ['dc=codfw', 'cluster=parsoid', 'service=parsoid'])
  • 09:41 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: wtp2019.codfw.wmnet (tags: ['dc=codfw', 'cluster=parsoid', 'service=parsoid'])
  • 09:15 jynus: applying schema change (imagelinks) to s2 wikis T139090
  • 08:24 moritzm: rolling reboot of logstash cluster for kernel update
  • 07:47 marostegui: Deploying ALTER table s4 commonswiki.templatelinks - https://phabricator.wikimedia.org/T149079 (db2065 only)
  • 07:26 _joe_: creating darmstadtium on ganeti, T148961
  • 07:24 marostegui: Deploying schema change db2034- enwiki.change_tag/tag_summary - T147166
  • 07:15 marostegui: Removed /srv/s5.sql.gz (54G - may 2015) from db1045 to free up some space
  • 06:57 marostegui: Deploy schema change s5 dewiki.revision only codfw - T148967
  • 03:00 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Oct 27 03:00:35 UTC 2016 (duration 5m 20s)
  • 02:55 logmsgbot: l10nupdate@tin scap sync-l10n completed (1.28.0-wmf.23) (duration: 10m 36s)
  • 02:32 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.23/extensions/OAuth/: deploy fix for T149194 (duration: 00m 47s)
  • 02:31 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.22/extensions/OAuth/: deploy fix for T149194 (duration: 00m 51s)
  • 02:28 logmsgbot: l10nupdate@tin scap sync-l10n completed (1.28.0-wmf.22) (duration: 10m 28s)
  • 01:46 eileen1: disable GlobalCollect Recurring Donations
  • 01:21 mutante: mw1208 - service hhvm restart
  • 00:42 ejegg: updated civicrm from 586433b to 31bea5b
  • 00:32 mutante: palladium - removed from DNS
  • 00:30 mutante: palladium - re-shutdown

2016-10-26

  • 23:57 logmsgbot: dereckson@tin Synchronized docroot/noc/conf/: Update noc.wikimedia.org dblist and config files (duration: 00m 45s)
  • 23:52 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.23/extensions/UploadWizard/resources/details/uw.DateDetailsWidget.js: Unbreak Flickr uploads (T149259) (duration: 00m 48s)
  • 23:46 madhuvishy: tools reenabled puppet across proxy hosts. /.well-known/healthz now live on tools-proxy T143638
  • 23:41 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: For $wmgGalleryOptions, use isset() (Gerrit:318223) (duration: 00m 45s)
  • 23:22 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Fix for current Undefined variable: wmgGalleryOptions issue (duration: 00m 48s)
  • 23:19 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Test setting gallery config differently on Beta Cluster enwiki (T141349, 2/2) (duration: 00m 45s)
  • 23:18 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings-labs.php: Test setting gallery config differently on Beta Cluster enwiki (T141349, 1/2, no-op in prod) (duration: 00m 49s)
  • 21:08 hashar: gallium: stopped rsync server
  • 21:02 jynus: restarting mysql and rebooting db1035
  • 20:55 jynus: restarting mariadb on db2011 to test configuration change
  • 20:50 hashar: syncing /var/lib/jenkins from gallium to contint1001 . rsync server spawned on gallium in a term, contint1001 using rsync --bwlimit=5m --delete --info=progress2 -az rsync://gallium.wikimedia.org/jenkins /var/lib/jenkins
  • 20:49 Pchelolo: RESTBase deploy e835f9b8
  • 20:46 Pchelolo: RESTBase deploy e835f9b8 - canary on restbase1007
  • 20:43 arlolra: reverted Parsoid to version 63f1e151
  • 20:41 Pchelolo: RESTBase deploy e835f9b8 - staging
  • 20:36 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.23
  • 20:31 arlolra: reverting Parsoid to version 63f1e151
  • 20:31 arlolra: updated Parsoid to version ede4353
  • 20:28 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.23/extensions/GlobalBlocking/: Deploy fix for T149232 to unblock the train refs T147517 (duration: 00m 51s)
  • 20:14 arlolra: starting Parsoid deploy
  • 19:14 logmsgbot: hoo@tin Synchronized wmf-config/Wikibase-labs.php: For consistency (duration: 00m 45s)
  • 18:47 logmsgbot: hoo@tin Synchronized wmf-config/Wikibase-labs.php: For consistency (duration: 00m 47s)
  • 18:39 mark: Reactivated BGP to AS6461 on cr1-eqiad
  • 18:38 mark: Chris moved cr1-eqiad:xe-5/3/1 to xe-3/3/1
  • 18:31 mark: Disabling BGP session to AS6461 on cr1-eqiad, preparing for port migration
  • 18:27 mark: Chris is moving cr1-eqiad and cr2-eqiad xe-5/3/0 to xe-3/3/0 (both sides)
  • 18:20 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.23/extensions/GeoData/: https://gerrit.wikimedia.org/r/#/c/318138/ (duration: 00m 47s)
  • 18:19 mark: Chris is moving cr1-eqiad and cr2-eqiad xe-5/2/0 to xe-3/2/0 (both sides)
  • 18:12 mark: Disabling cr1-eqiad:xe-5/2/0
  • 18:10 logmsgbot: bd808@tin Synchronized wmf-config/CommonSettings.php: wikitech: Re-enable OAuth management interfaces T149150 (duration: 00m 46s)
  • 18:04 ema: cache_text - finished rolling downtimed reboots for kernel update
  • 17:23 cwd: updated smashpig from daba8c0 to f5f49d7
  • 16:58 mark: Shutting down cr1-eqiad:xe-5/0/[0-2] (part of aggregated links to rows A-C switches)
  • 16:58 mark: Chris moved cr1-eqiad:xe-5/2/1 to xe-3/0/3
  • 16:54 mark: Chris moved cr1-eqiad:xe-5/1/[0-3] to xe-3/1/[0-3]
  • 16:49 mark: Shutting down cr1-eqiad:xe-5/1/[0-3] (part of aggregated links to rows A-D switches)
  • 16:39 moritzm: restarted ntp on mc2008 (stuck in XFAC state)
  • 16:36 jynus: stopping db2011 to replace disks T149099
  • 15:59 mark: Disabling cr1-eqiad:ae4; VRRP conflict
  • 15:58 mark: Reenabling cr1-eqiad:ae4
  • 15:57 mark: Restored cr1-eqiad:ae4
  • 15:37 ema: cache_text - start rolling downtimed reboots for kernel update (~3 hours to completion)
  • 15:27 ema: cp1054 reboot for kernel update
  • 15:16 bblack: restarting grrrit-wm
  • 14:34 moritzm: rearmed keyholder on mira after reboot
  • 14:24 bblack: cache_upload - start rolling downtimed reboots for kernel update (~4 hours to completion)
  • 14:19 moritzm: rolling reboot of ocg cluster for kernel update
  • 14:10 logmsgbot: demon@tin Synchronized w: replacing wiki.phtml with a symlink (duration: 00m 47s)
  • 13:39 hashar: European SWAT deploy completed
  • 13:37 hashar: mw2098 is all set now after I ran "scap pull". It is properly in tin:/etc/dsh/group/mediawiki-installation
  • 13:30 hashar: mw2098: scap pull . It failed yesterday on reboot and is back in pull
  • 13:28 hashar: mw2098 spurts a bunch of Notice: Undefined variable: wmgWatchlistDefault in /srv/mediawiki/wmf-config/CommonSettings.php on line 1871
  • 13:25 logmsgbot: hashar@tin Synchronized wmf-config/CommonSettings.php: Remove obsolete config values (duration: 00m 46s)
  • 13:24 logmsgbot: hashar@tin Synchronized wmf-config/CommonSettings-labs.php: LABS: Enable Tabular data on Commons - T148745 (duration: 00m 45s)
  • 13:23 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.23/extensions/Kartographer: T149145: Fix empty groups params T149154: Fix external links (duration: 00m 57s)
  • 13:11 logmsgbot: hashar@tin Synchronized wmf-config/CommonSettings.php: wikitech: Set wgMWOAuthCentralWiki = false - T149150 (duration: 00m 47s)
  • 13:07 moritzm: rebooting labtest hosts for kernel update
  • 12:36 marostegui: Deploy schema change s5 dewiki.revision only codfw - T148967
  • 12:31 moritzm: rebooting mira for kernel update
  • 12:20 _joe_: turned off mw1152, removed salt/puppet data, T149185
  • 12:19 bblack: moving git-ssh LVS from low-traffic -> high-traffic2 - T143915
  • 12:05 bblack: moving ocg LVS from high-traffic2 -> low-traffic - T143915
  • 10:40 bblack: rebooting eqiad lvs primaries (lvs100[1-3])
  • 10:19 bblack: rebooting esams lvs primaries (lvs300[12])
  • 10:13 bblack: rebooting ulsfo lvs primaries (lvs400[12])
  • 10:02 bblack: rebooting codfw lvs primaries (lvs200[1-3])
  • 09:52 jynus: starting schema change (imagelinks) on s1 T139090
  • 08:43 elukey: increasing the AQS cassandra system_auth keyspace replication from 1 to 6 (and running nodetool-{a,b} repair system_auth on all nodes)
  • 08:29 elukey: downgraded memcached on mc2009 to the Debian Jessie version (was part of a performance experiment)
  • 08:25 dcausse: elastic@eqiad reindexing enwiki (take 3) with BM25 from wasat.codfw.wmnet T147508 (logs in ~dcausse/bm25_reindex/cirrus_log)
  • 08:06 marostegui: Stopping replication on db2058 - using it to clone another host - T146261
  • 07:47 moritzm: bounced ntp on oxygen (stuck in XFAC state)
  • 07:45 moritzm: rebooting mc* servers in codfw for kernel update
  • 07:20 moritzm: rebooting oxygen for kernel update
  • 07:14 marostegui: Deploying ALTER table s4 commonswiki.templatelinks - db2051 - T149079
  • 07:10 moritzm: rebooting nescio for kernel update
  • 06:34 moritzm: repooled mw2098 (was previously down for hardware check)
  • 03:00 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Oct 26 03:00:28 UTC 2016 (duration 5m 28s)
  • 02:55 logmsgbot: l10nupdate@tin scap sync-l10n completed (1.28.0-wmf.23) (duration: 10m 13s)
  • 02:28 logmsgbot: l10nupdate@tin scap sync-l10n completed (1.28.0-wmf.22) (duration: 09m 38s)
  • 01:36 mutante: lead - (formerly gerrit) - shutdown -h now (T147905)
  • 01:13 mutante: palladium - shutdown -h now
  • 00:10 mutante: restarted icinga-wm, now there is /var/log/icinga/irc.log, it should talk now, but doesn't
  • 00:04 mutante: let icinga own /var/log/icinga on einsteinium, restart icinga

2016-10-25

  • 23:45 logmsgbot: addshore@mira Synchronized wmf-config/InitialiseSettings.php: gerrit:293243 Add a project namespace on tg.wikipedia (duration: 00m 47s)
  • 23:41 logmsgbot: addshore@mira Synchronized wmf-config/CommonSettings.php: gerrit:315121 Stop adding Category:Uploaded_with_UploadWizard (duration: 00m 47s)
  • 23:39 logmsgbot: addshore@mira Synchronized wmf-config/InitialiseSettings.php: gerrit:318013 Enable static maps on testwiki (duration: 00m 48s)
  • 23:26 mutante: Submitted 'deactivate node' for palladium.eqiad.wmnet
  • 23:21 mutante: removed palladium from puppet (T147320). puppet node clean
  • 23:02 logmsgbot: maxsem@mira Synchronized php-1.28.0-wmf.23/extensions/ZeroBanner/: https://gerrit.wikimedia.org/r/#/c/318004/ (duration: 00m 50s)
  • 23:00 logmsgbot: maxsem@mira Synchronized php-1.28.0-wmf.23/extensions/ZeroPortal/: https://gerrit.wikimedia.org/r/#/c/318004/ (duration: 01m 32s)
  • 22:49 ejegg: updated CiviCRM from 844495a to 586433b
  • 22:34 Pchelolo: revert RESTBase to f9017adc
  • 22:33 ejegg: updated payments-wiki from 1e8f6a2 to df4c72d
  • 22:02 Pchelolo: RESTBase update to 3e53f00e
  • 21:55 Pchelolo: RESTBase update to 3e53f00e - staging
  • 21:54 ejegg: updated SmashPig from 31c0757 to daba8c0
  • 21:51 logmsgbot: maxsem@mira Synchronized php-1.28.0-wmf.23/extensions/Graph/: https://gerrit.wikimedia.org/r/#/c/317989/ (duration: 01m 23s)
  • 20:49 ejegg: updated civicrm from 9e28869 to 844495a
  • 20:42 logmsgbot: maxsem@mira Synchronized wmf-config/mobile.php: https://gerrit.wikimedia.org/r/#/c/317879/ (duration: 00m 47s)
  • 20:32 ostriches: gerrit doing a quick reboot, config pick up
  • 20:16 logmsgbot: filippo@mira Synchronized wmf-config/ProductionServices.php: Put back potassium as poolcounter1002 (duration: 00m 51s)
  • 20:14 logmsgbot: demon@mira Synchronized php-1.28.0-wmf.23/extensions/CiteThisPage/SpecialCiteThisPage.php: T149112 (duration: 01m 39s)
  • 19:41 dcausse: elastic@eqiad reindexing enwiki with BM25 from terbium T147508 (logs in ~dcausse/bm25_reindex/cirrus_log)
  • 19:23 hasharAway: Python PyPi mirror has some issue. Impacts all CI jobs relying on tox https://status.python.org/
  • 19:22 twentyafterfour: phabricator is back from reboot and it appears that all is well
  • 19:19 twentyafterfour: twentyafterfour@iridium: The system is going down for reboot NOW!
  • 19:11 twentyafterfour: rebooting iridium (phabricator) in ~ 3 minutes
  • 19:11 logmsgbot: demon@mira rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.23
  • 19:03 logmsgbot: ebernhardson@mira Synchronized php-1.28.0-wmf.22/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: (no message) (duration: 00m 47s)
  • 18:46 logmsgbot: demon@mira Finished scap: Moving testwiki to wmf.23 for l10n bootstrap (duration: 42m 45s)
  • 18:44 godog: restbase eqiad rolling reboot for kernel update
  • 18:43 Alex: <godog> !log restbase eqiad rolling reboot for kernel update
  • 18:03 logmsgbot: demon@mira Started scap: Moving testwiki to wmf.23 for l10n bootstrap
  • 17:27 moritzm: rebooting phab2001 for kernel update
  • 17:24 moritzm: rebooting notebook1002 for kernel update
  • 17:23 jynus: powercycling db2015, unresponsive
  • 17:22 ostriches: gerrit: quick reboot, picking up logging config changes for jvm
  • 17:19 logmsgbot: filippo@mira Synchronized wmf-config/ProductionServices.php: Put helium back in service during potassium reimage (duration: 01m 34s)
  • 16:08 gehel: restarting ferm on elastic2020
  • 16:04 gehel: delete dangling indices on elasticsearch codfw: jawiki_general_first, jawiki_content_first, zhwiki_general_first and zhwiki_content_first
  • 15:21 ori: Synchronized wmf-config/throttle.php: I049bd463: Use correct IP for Vanderbilt 2016-10-25 edit-a-thon throttle exception (T149063) (duration: 01m 20s)
  • 14:32 gehel: reboot of wdqs cluster eqiad for kernel upgrade
  • 14:24 moritzm: rebooting maerlant for kernel update
  • 14:20 gehel: reboot of wdqs cluster codfw for kernel upgrade
  • 14:14 elukey: removed logstash filter for Apache (https://logstash.wikimedia.org/app/kibana#/dashboard/apache2log) - T144005
  • 14:01 _joe_: refreshing puppet facts
  • 13:34 moritzm: rebooting rcstream servers for kernel update
  • 13:11 moritzm: rebooting etcd1006 for kernel update
  • 13:07 hashar: European SWAT complete
  • 13:06 logmsgbot: hashar@mira Synchronized wmf-config/throttle.php: Nashville Architecture edit-a-thon (Vanderbilt library) throttle rule - T149063 (duration: 02m 07s)
  • 13:03 moritzm: rebooting etcd1005 for kernel update
  • 12:57 moritzm: rebooting etcd1004 for kernel update
  • 12:54 moritzm: repooled maerlant (was depooled for some reason, possibly forgotten to repool after maintenance)
  • 12:50 moritzm: rebooting etcd1003 for kernel update
  • 12:42 moritzm: rebooting etcd1002 for kernel update
  • 12:30 moritzm: rebooting etcd1001 for kernel update
  • 12:24 elukey: rebooting druid100[123] for kernel upgrades
  • 11:56 moritzm: rebooting hydrogen for kernel update
  • 11:53 dcausse: elastic@eqiad reindexing top10 wikis with BM25 from terbium T147508 (logs in ~dcausse/bm25_reindex/cirrus_log)
  • 11:53 moritzm: rolling reboot of mc2* for kernel update
  • 11:31 moritzm: rebooting copper for kernel update
  • 11:16 moritzm: bounced ntp on hassium (stuck in XFAC state)
  • 11:14 logmsgbot: marostegui@mira Synchronized wmf-config/db-eqiad.php: Repool db1059 - T146261 (duration: 01m 22s)
  • 10:37 moritzm: rebooting acamar for kernel update
  • 10:11 elukey: reimaging mc103[1-6] to Jessie
  • 10:09 logmsgbot: marostegui@mira Synchronized wmf-config/db-eqiad.php: Depool db1059 to clone another host from it - T146261 (duration: 01m 36s)
  • 09:58 moritzm: rearmed keyholder on tin
  • 09:50 moritzm: rebooting tin for kernel update
  • 09:45 moritzm: rebooting achernar for kernel update
  • 09:43 marostegui: Deploying ALTER table s4 commonswiki.templatelinks - T149079 (db2058 only)
  • 09:23 moritzm: rebooting hassium for kernel update
  • 09:09 moritzm: rebooting hassaleh for kernel update
  • 08:49 marostegui: Stopping replication db2058 s4 - using it to clone another host - T146261
  • 08:13 akosiaris: reimaging tegmen
  • 08:05 dcausse: elastic@codfw reindexing jawiki, thwiki and zhwiki T147498 (logs in terbium:~dcausse/bm25_reindex/cirrus_log)
  • 08:05 moritzm: rebooting chromium for kernel update
  • 07:40 moritzm: rebooting netmon1001 for kernel update
  • 07:07 moritzm: rebooting tungsten for kernel update
  • 07:05 gehel: rebooting elasticsearch relforge cluster for kernel update
  • 06:56 moritzm: rebooting wezen for kernel update
  • 06:52 gehel: rebooting elastic1035 for kernel update
  • 06:52 moritzm: rebooting osmium for kernel update
  • 02:39 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Oct 25 02:39:53 UTC 2016 (duration 5m 12s)
  • 02:34 logmsgbot: l10nupdate@tin scap sync-l10n completed (1.28.0-wmf.22) (duration: 10m 46s)
  • 02:33 logmsgbot: ori@mira Synchronized php-1.28.0-wmf.22/resources/src/mediawiki/mediawiki.js: I1d61f4dcf: mw.loader: Fix off-by-one error in splitModuleKey() (duration: 02m 15s)

2016-10-24

  • 23:24 logmsgbot: dereckson@mira Synchronized wmf-config/InitialiseSettings.php: Set collation to uca-no-u-kn on no.wikipedia (146675, T148488) (duration: 00m 50s)
  • 23:14 logmsgbot: dereckson@mira Synchronized wmf-config/InitialiseSettings.php: Remove duplicated wmgCirrusSearchClusterOverrides entry (duration: 00m 50s)
  • 23:07 dapatrick: Deployed patch for T148600 to wmf22
  • 22:10 ejegg: updated civicrm from 3b76de5 to 9e28869
  • 21:43 godog: rolling-restart of ms-fe in codfw/eqiad for kernel update
  • 21:01 Amir1: rolling back ores to 8bbd3ab
  • 20:49 Amir1: deploying 0caa589 to all ores nodes
  • 20:44 Amir1: deploying 0caa589 on ores canary node
  • 20:42 arlolra: updated Parsoid to version 63f1e151 (T139032, T146612, T141905)
  • 20:29 bearND: deployed mobileapps f872894
  • 20:26 bearND: starting mobileapps deploy
  • 20:25 gehel: stopping elasticsearch eqiad cluster restart for the night.
  • 20:18 arlolra: starting Parsoid deploy
  • 20:06 gehel: powering on elastic2020 (no idea why it is powered off)
  • 19:34 ejegg: updated SmashPig from f9e185b to 31c0757
  • 19:30 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: [cirrus] Activate BM25 on top 10 wikis: Step 2 (take 2) (T147508) (duration: 00m 50s)
  • 19:22 logmsgbot: thcipriani@mira Synchronized php-1.28.0-wmf.22/extensions/CirrusSearch/includes/SearchConfig.php: SWAT: Add wgContentNamespaces to the list of vars loaded by SearchConfig (T148840) (duration: 00m 58s)
  • 19:16 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix capitalization for change 317387 (T148328) (duration: 00m 51s)
  • 19:11 logmsgbot: thcipriani@mira Synchronized wmf-config/CommonSettings.php: Fix capitalization for change 317387 (T148328) PART II (duration: 00m 50s)
  • 19:10 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: Fix capitalization for change 317387 (T148328) PART I (duration: 00m 50s)
  • 19:03 logmsgbot: thcipriani@mira Synchronized php-1.28.0-wmf.22/resources/src/mediawiki/mediawiki.js: SWAT: resourceloader: Make cache-eval in mw.loader.work asynchronous (T142129) (duration: 00m 52s)
  • 18:32 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Switch bs, hr and uk wikis to numeric collation (T148682) (duration: 00m 50s)
  • 18:19 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgCategoryCollation to uca-hr for Croatian wikipedia (T148749) (duration: 00m 57s)
  • 18:06 ejegg: updated payments-wiki settings to f4b79f0
  • 18:05 bblack: downgrading nginx(+linked openssl implicitly) on cp*
  • 17:02 gehel: deploying latest GUI for WDQS
  • 16:22 ejegg: updated SmashPig from e28b2cd to f9e185b
  • 15:04 bblack: enabling/running puppet on caches for 8x varnish ports changes - T107749
  • 14:57 paravoid: restarting ferm on es2015
  • 14:54 bblack: starting ferm server on eeden, radon
  • 14:41 logmsgbot: gehel@puppetmaster1001 conftool action : set/pooled=yes; selector: dc=eqiad,cluster=maps,service=kartotherian,name=maps1002.eqiad.wmnet
  • 14:38 logmsgbot: dereckson@mira Synchronized wmf-config/CommonSettings.php: Toggle wgDefaultUserOptions['watchdefault'] on for cs.wikipedia, off elsewhere (T148328, 2/2) (duration: 00m 50s)
  • 14:36 logmsgbot: dereckson@mira Synchronized wmf-config/InitialiseSettings.php: Toggle wgDefaultUserOptions['watchdefault'] on for cs.wikipedia, off elsewhere (T148328, 1/2) (duration: 00m 54s)
  • 14:36 bblack: disabling puppet on all caches ahead of port# work, to test - T107749 / https://gerrit.wikimedia.org/r/#/c/317405
  • 14:29 yurik: re-deployed current kartotherian to all servers (maps1002 & maps-test* were stale)
  • 14:11 marostegui: Deploy schema change s5 dewiki.revision - only codfw T148967
  • 14:03 logmsgbot: l10nupdate@mira ResourceLoader cache refresh completed at Mon Oct 24 14:03:07 UTC 2016 (duration 6m 17s)
  • 13:56 logmsgbot: dereckson@mira scap sync-l10n completed (1.28.0-wmf.22) (duration: 10m 46s)
  • 13:42 bblack: restarting all varnish frontends (serially per-cluster with proper depooling, etc)
  • 13:20 elukey: reimaging mc120[89] and mc1030
  • 13:18 Dereckson: Manually started l10nupdate, as it hadn't run for 6 days, and in particular to fix the T148921 user-facing issue.
  • 13:13 logmsgbot: dereckson@mira Synchronized wmf-config/throttle.php: Edit-a-thon BDA (Poitiers) throttle rule (T148852) (duration: 01m 13s)
  • 10:47 elukey: reimaged mc102[56], currently doing mc1027
  • 10:21 _joe_: rebooting kubernetes1002
  • 09:20 mobrovac: change-prop deploying c7feda2
  • 09:09 mobrovac: restbase deploy end of f9017ad
  • 08:55 akosiaris: rebooting cobalt (gerrit) for kernel upgrades
  • 08:53 elukey: reimaging mc1024
  • 08:46 mobrovac: restbase deploy start of f9017ad
  • 08:38 gehel: continue rolling restart of elasticsearch eqiad cluster
  • 08:38 hashar: Restarting gallium (Jenkins/Zuul) for kernel upgrades
  • 08:36 akosiaris: rebooting labnodepool1001 for kernel upgrades
  • 08:36 akosiaris: rebooting scandium for kernel upgrades
  • 08:33 hashar: rebooting contint1001
  • 08:20 elukey: reimaging mc1023.eqiad.wmnet
  • 07:46 elukey: reimaging mc1022.eqiad.wmnet (T137345)
  • 07:09 marosteg1i: Deploying alter table s1.enwiki on codfw - T147166

2016-10-22

  • 15:37 logmsgbot: oblivian@puppetmaster1001 conftool action : set/pooled=no; selector: name=cp1052.eqiad.wmnet
  • 15:02 logmsgbot: bblack@puppetmaster1001 conftool action : set/pooled=yes; selector: name=cp1052.eqiad.wmnet
  • 15:02 bblack: repool cp1052 - T148891
  • 14:52 bblack: rebooted cp1052 - T148891
  • 14:27 bblack: depooled cp1052 (cache_text@eqiad, ethernet linkdown for unknown reasons)
  • 12:34 marostegui: Stopping replication in db2055 to use it to clone another host - T146261

2016-10-21

  • 23:45 mutante: depooling maps1002 (by running "depool" on the server itself)
  • 23:35 yurik: maps1002.eqiad is running older/incorrect/misbehaving software for some reason, restart didn't help. Need to depool
  • 22:17 mutante: cp4006,cp4014 gzipped some logs in home for disk space
  • 22:08 mutante: cp4006, cp4014 were running out of disk, apt-get clean
  • 21:40 mutante: phab2001 that IP was also on iridium/phab1001, it should not be hardcoded in puppet, causing issues in T143363
  • 21:37 mutante: phab2001 - ip addr del 10.64.32.186/21 dev eth0
  • 21:06 bblack: restarting varnish backends (depooled, etc) for eqiad cache_upload: cp1049, cp1072, cp1074
  • 19:50 cmjohnson1: dataset1001 array 1 swap failed disk slot 4
  • 19:40 cmjohnson1: labvirt1005 swapping disk 0
  • 19:40 gehel: routing traffic for cache-maps in codfw -> eqiad
  • 19:29 gehel: running puppet on eqiad cache nodes to activate maps traffic redirection
  • 19:06 gehel: shutting down cassandra on maps2004, seems to have lost data
  • 18:22 ejegg: updated SmashPig from d1ca063 to e28b2cd
  • 16:45 ejegg: updated fundraising tools from 09ae6e2 to f83e392
  • 16:33 mutante: rebooting planet1001 - *.planet.wm.org will be right back
  • 16:30 mutante: rebooting planet2001
  • 16:05 elukey: reimaging mc1021 with wmf-auto-reimage (T137345)
  • 15:28 elukey: reimaging mc1019 with wmf-auto-reimage (T137345)
  • 14:50 elukey: reimaging mc1020 with wmf-auto-reimage (T137345)
  • 14:31 _joe_: rebooting all kubernetes worker nodes in production
  • 14:31 moritzm: rolling reboot of thumbor* for kernel update
  • 14:30 marostegui: Stopping replication on db2055 to use it to clone another host - T146261
  • 13:55 bblack: restart isc-dhcp-server on carbon
  • 13:55 moritzm: rolling reboot of thumbor* for kernel update
  • 13:40 moritzm: completed rolling reboot of restbase in codfw
  • 13:14 marostegui: Deploying schema change S6 ruwiki for table ores_model - T147734
  • 12:24 moritzm: rebooting ruthenium for kernel update
  • 12:02 moritzm: rebooting bromine for kernel update
  • 11:28 gehel: starting rolling restart of elasticsearch eqiad cluster
  • 11:05 moritzm: rebooting hafnium for kernel update
  • 10:49 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: mariadb: pool db1053 as the new rc special slave after maintenance (duration: 01m 00s)
  • 10:36 marostegui: Deploying schema change S2 several wikis for table ores_model - T147734
  • 10:28 bblack: rebooting radon (ns0)
  • 10:22 moritzm: rolling reboot of restbase in codfw for kernel update
  • 10:09 marostegui: Deploying schema change S7 fawiki.ores_model - T147734
  • 10:04 moritzm: rebooting seaborgium (labs LDAP server) for kernel update
  • 09:51 marostegui: Deploying schema change S5 wikidatawiki.ores_model - T147734
  • 09:48 moritzm: rebooting neon (icinga host) for kernel update
  • 09:35 marostegui: Deploying schema change S1 enwiki.ores_model in eqiad - T147734
  • 09:32 elukey: rebooting kafka100[12] for kernel upgrades (EventBus hosts)
  • 09:26 moritzm: rebooting krypton for kernel update
  • 09:18 godog: start rolling reboot of ms-be machines in eqiad for kernel update
  • 09:15 moritzm: rebooting meitnerium (archiva.wikimedia.org) for kernel update
  • 09:13 jynus: reviewing and applying new watchdog events to all core dbs T148790
  • 09:06 moritzm: rebooting serpens (labs LDAP server) for kernel update
  • 08:49 moritzm: rebooting ununpentium (RT) for kernel update
  • 08:40 marostegui: Deploying schema change S1 enwiki.ores_model in codfw - T147734
  • 08:38 moritzm: rebooting radium (tor relay) for kernel update
  • 08:35 moritzm: rebooting aluminium (url_downloader for eqiad) for kernel update
  • 08:25 moritzm: rebooting alsafi (url_downloader for codfw) for kernel update
  • 08:23 jynus: applying events_coredb_slave.sql to db1070
  • 08:12 moritzm: rebooting bast1001 for kernel update
  • 08:05 moritzm: rolling reboot of swift backend servers in codfw
  • 07:52 moritzm: rebooting bohrium (hosting piwik) for kernel update
  • 07:20 elukey: rebooting stat100[234] for kernel upgrades
  • 06:26 elukey: restarting stat1001 for kernel upgrades (will cause a brief outage for some analytics websites like analytics.w.o and pivot.w.o)

2016-10-20

  • 23:51 bblack: rebooting eeden (ns2) for kernel
  • 23:48 logmsgbot: dereckson@mira Synchronized php-1.28.0-wmf.22/extensions/CentralNotice: Bump CentralNotice version to fix T145738 and T145447 (Gerrit:317077) (duration: 00m 54s)
  • 23:45 logmsgbot: dereckson@mira Synchronized php-1.28.0-wmf.22/includes/cache/MessageCache.php: Use checkKeys for large messages (T144952) (duration: 00m 50s)
  • 23:37 bblack: rolling restarts of citoid on scb* (for recdns update)
  • 23:30 logmsgbot: dereckson@mira Synchronized php-1.28.0-wmf.22/extensions/UploadWizard/resources/ui/steps/uw.ui.Upload.js: Fix a weird ghost "or" for non-Flickr users (Gerrit:317013) (duration: 01m 31s)
  • 22:52 bd808: Finished sending Tool Labs survey emails from silver (T147336)
  • 21:52 ejegg: updated fundraising tools from f6d200d to 09ae6e2
  • 21:18 ejegg: updated SmashPig from 961fc4c to d1ca063
  • 21:01 bd808: Started sending Tool Labs survey emails from silver
  • 18:39 logmsgbot: dereckson@mira Synchronized wmf-config/InitialiseSettings.php: Revert "[cirrus] Activate BM25 on top 10 wikis: Step 2" (duration: 00m 54s)
  • 18:30 logmsgbot: dereckson@mira Synchronized wmf-config/InitialiseSettings.php: Activate Cirrus BM25 algo on top 10 wikis (step 2, T147508) (duration: 00m 50s)
  • 18:12 XenoRyet: updated payments wiki from 27b464f to 1e8f6a2
  • 18:12 logmsgbot: dereckson@mira Synchronized wmf-config/InitialiseSettings.php: Enable Visual Editor for all users remaining phase 6 Wikipedias (T142589) (duration: 00m 50s)
  • 18:11 mutante: mailing list server back, normal operation
  • 18:08 mutante: rebooting fermium (lists.wm.org)
  • 18:08 logmsgbot: dereckson@mira Synchronized wmf-config/CommonSettings.php: wikitech: Fix Undefined variable: wgMWOAuthCentralWiki (Gerrit:316981) (duration: 01m 26s)
  • 17:50 dcausse: warming up elastic@codfw from wasat.codfw.wmnet (take 3)
  • 17:42 urandom: T133395, T113805: Starting a primary-range, incremental repair of local_group_wiktionary_T_parsoid_html.data on restbase2001.codfw.wmnet
  • 17:38 mutante: rebooting kraz - short downtime of irc.wikimedia.org, please prepare to reconnect your clients if they don't do it automatically
  • 17:35 apergos: reboot of last few stragglers for mw* hosts in codfw/eqiad: mw2152 mw2079 mw1239
  • 17:29 mutante: rebooting install2001
  • 17:00 apergos: rolling reboot of video scalers in codfw/eqiad: mw1259 mw1260 mw2152 mw2246
  • 16:48 apergos: rolling reboot of testservers in codfw/eqiad: mw1017 mw1099 mw2017 mw2099
  • 16:45 mutante: rebooting install1001
  • 16:44 logmsgbot: gehel@puppetmaster1001 conftool action : set/pooled=yes; selector: dc=eqiad,cluster=logstash,service=kibana
  • 16:35 godog: reboot graphite1001 for kernel upgrade
  • 16:30 apergos: rolling reboots for jobrunners in eqiad: mw1161-1169, mw1299-1306
  • 16:26 gehel: deploying new LVS service for kibana - T132458
  • 16:25 godog: reboot graphite1003 for kernel upgrade
  • 16:08 moritzm: bounced ntp on mw2089/mw2241 (XFAC state)
  • 15:59 mutante: short downtime of ganglia web ui
  • 15:59 mutante: rebooting uranium
  • 15:36 apergos: rolling reboots for jobrunners in codfw: mw2080-2085, mw2153-mw2162, mw2247-2250
  • 15:14 apergos: rolling reboot of image scalers for codfw, eqiad: mw2086-2089, mw2148-2151, mw1293-1298
  • 15:10 ottomata: restarted statsv on hafnium
  • 14:55 moritzm: bounced ntp on mw2196/mw2197 (XFAC state)
  • 14:34 moritzm: rebooting rutherfordium for kernel update
  • 14:27 logmsgbot: filippo@puppetmaster1001 conftool action : set/pooled=no; selector: name=prometheus1001.eqiad.wmnet
  • 14:26 logmsgbot: filippo@puppetmaster1001 conftool action : set/pooled=yes; selector: name=prometheus1002.eqiad.wmnet
  • 14:24 akosiaris: bounce ntpd on bast4001
  • 14:20 moritzm: rebooting auth* servers
  • 14:20 ottomata: starting rolling restart of analytics-eqiad kafka brokers to apply kernel update
  • 14:18 logmsgbot: filippo@puppetmaster1001 conftool action : set/pooled=no; selector: name=prometheus2001.codfw.wmnet
  • 14:18 logmsgbot: filippo@puppetmaster1001 conftool action : set/pooled=yes; selector: name=prometheus2002.codfw.wmnet
  • 14:17 apergos: rolling reboot of remaining app servers in codfw: mw2221-2245, and in eqiad: mw1261-1275
  • 14:11 logmsgbot: jmm@puppetmaster1001 conftool action : set/pooled=inactive; selector: mw2098.codfw.wmnet
  • 14:09 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: mariadb: move db1053 from s1 to s4 (duration: 02m 06s)
  • 13:38 moritzm: restarting mx1001 for kernel update
  • 13:22 moritzm: restarting francium for kernel update
  • 13:15 godog: rolling reboot of prometheus machines for kernel update
  • 13:14 moritzm: restarting ms1001 for kernel update
  • 13:10 elukey: force failover from temporary Hadoop Master node (an1002) to its standby (an1001) to restore the standard configuration
  • 13:05 elukey: correction: force failover for Hadoop Master node (an1001) to its standby (an1002) and rebooting an1001 for kernel upgrades
  • 12:59 elukey: force failover for Hadoop Master node (an1002) to its standby (an1002) and rebooting an1001 for kernel upgrades
  • 12:59 moritzm: ferm on baham (failed to start due to failing DNS resolution in early boot)
  • 12:52 moritzm: restarting mx2001 for kernel update
  • 12:48 moritzm: bounced ntp on mw2116 (XFAC state)
  • 12:39 elukey: restarting an1003 for kernel upgrades (oozie/hive master)
  • 12:35 moritzm: bounced ntp on baham (was stuck in INIT phase)
  • 12:31 apergos: more app server rolling restarts for codfw: mw2163-2199
  • 12:29 apergos: more API server rolling restarts for eqiad: mw1221-1235, 1276-1290
  • 12:27 apergos: more APP server rolling restarts for eqiad: mw1209-1216, 128-1220, 1236-38, 1240-1258
  • 12:12 moritzm: restarting bast2001 for kernel update
  • 12:11 apergos: retraction. those are app servers, not starting them yet
  • 12:10 apergos: more api server rolling restarts for eqiad: mw1209-1216, 128-1220, 1236-38, 1240-1258
  • 12:08 moritzm: bounced ntp on mw2206 (XFAC state)
  • 12:05 bblack: correction: rebooting baham / ns1.wikimedia.org for kernel
  • 12:04 bblack: rebooting baham / ns2.wikimedia.org for kernel
  • 11:53 elukey: rebooting an1027 (camus job launcher) for kernel upgrades
  • 11:48 moritzm: bounced ntp on mw2101 and mw2147 (XFAC state)
  • 11:48 bblack: depool cp1047 (cache_maps eqiad)
  • 11:23 apergos: rolling restarts of more api servers in codfw: mw2200 - 2220
  • 11:17 elukey: rebooting all the Analytics Hadoop nodes for kernel upgrades
  • 11:07 mobrovac: change-prop restarting in codfw after kafka kernel upgrade
  • 10:58 apergos: rolling reboots for first batch of app servers in eqiad: mw1170-1188
  • 10:50 elukey: rebooting kafka200[12] for kernel upgrades (Kafka main-codfw non live cluster)
  • 10:38 apergos: rolling restarts on first batch of api servers in eqiad: mw1189-1208
  • 10:21 apergos: while the first batch of codfw api servers trundle along, starting rolling reboots for appservers in codfw starting with mw2090-2098, 2100-2119
  • 10:20 moritzm: removing a few older kernels on analytics1036, was short of disk space in /boot partition
  • 10:05 elukey: rebooting the Analytics Hadoop cluster for kernel upgrades
  • 09:50 jynus: stop sql thread replication for db1053 and applying partitioning as a "special slave"
  • 09:32 godog: rolling restart of graphite machines for kernel upgrade
  • 09:16 apergos: restarts of mw2075,6,7 done, starting rolling restarts shortly of 8,9, 2120-2147
  • 08:57 akosiaris: rebooting wtp10{02,06,12,13,17,22} for kernel upgrade
  • 08:57 elukey: rebooting eventlog2001 for kernel upgrades (EL spare host)
  • 08:54 elukey: rebooting eventlog1001 for kernel upgrades (Eventlogging host)
  • 08:53 moritzm: rebooting bast4001 for kernel update
  • 08:49 moritzm: rebooting restbase-test* for kernel upgrade
  • 08:43 akosiaris: rebooting wtp10{01,03,04,05,18,23} for kernel upgrade
  • 08:34 akosiaris: rebooting wtp10{07,08,09,10,19,24} for kernel upgrade
  • 08:32 elukey: rebooting aqs100[456] for kernel upgrades (one at a time, de-pool/reboot/pool)
  • 08:31 elukey: rebooting aqs100[123] for kernel upgrades (one at a time, de-pool/reboot/pool)
  • 08:25 akosiaris: rebooting wtp10{10,14,15,16,20,21} for kernel upgrade
  • 08:19 akosiaris: reboot the rest of the wtp20XX hosts for kernel upgrade
  • 08:15 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=no; selector: wtp2019.codfw.wmnet (tags: ['dc=codfw', 'cluster=parsoid', 'service=parsoid'])
  • 08:10 akosiaris: reboot wtp20{03,05,08,09,12,15,17,18,20} for kernel upgrade
  • 08:09 mobrovac: change-prop deploying 3a11886
  • 07:52 moritzm: rebooting bast3001 for kernel update
  • 07:51 gehel: start of elasticsearch codfw rolling restart
  • 07:32 moritzm: rebooting snapshot1001 for kernel update
  • 07:27 moritzm: rebooting snapshot1005-1007 for kernel update
  • 01:17 logmsgbot: legoktm@mira Synchronized wmf-config/InitialiseSettings.php: Revert Enable AbuseFilterCachingParser by default - T148673 (duration: 00m 51s)
  • 00:23 bblack: restarting pybal on lvs1002 for new recdns IP

2016-10-19

  • 23:58 logmsgbot: krenair@mira Synchronized php-1.28.0-wmf.22/extensions/OpenStackManager: https://gerrit.wikimedia.org/r/#/c/316909/ (duration: 01m 00s)
  • 23:41 mutante: Host mw1239 is not in mediawiki-installation dsh group
  • 23:35 logmsgbot: dereckson@mira Synchronized wmf-config/InitialiseSettings.php: Reverting votewiki back to English (T148352) (duration: 00m 50s)
  • 23:20 logmsgbot: dereckson@mira Synchronized wmf-config/InitialiseSettings.php: Switching 10 more wikis to numeric category collation (T146675) (duration: 00m 59s)
  • 21:02 bearND: deployed mobileapps 2551db4
  • 20:59 bearND: starting mobileapps deploy
  • 20:11 logmsgbot: demon@mira Synchronized docroot/mediawiki/keys/: adding tylers key (duration: 01m 09s)
  • 20:04 ejegg: disabled CiviCRM generic dedupe job
  • 20:00 ejegg: enabled CiviCRM major gifts dedupe
  • 19:56 bblack: installing new kernel packages on lvs:primary
  • 19:30 bblack: upgrading nginx+openssl on remaining cache nodes (eqiad+esams/text+upload) - T144523
  • 19:26 bblack: installing new kernel packages on lvs:secondary
  • 19:26 bblack: installing new kernel packages on authdns
  • 19:18 bblack: installing new kernel packages on cp*
  • 18:33 bblack: restarting stuck Jenkins
  • 18:02 urandom: T133395: RESTBase: Altering keyspace local_group_wikipedia_T_parsoid_html.data to enable time-window compaction
  • 17:41 logmsgbot: ori@mira Synchronized wmf-config/InitialiseSettings.php: I6d28e534: Disable AbuseFilterCachingParser on bgwiki (T148660) (duration: 00m 50s)
  • 17:34 moritzm: rebooting xenon/cerium/praseodymium to new kernels
  • 17:33 bblack: cp1008 / pinkunicorn reboot
  • 17:30 logmsgbot: ori@mira Synchronized wmf-config/InitialiseSettings.php: Ieb8cdab9: Enable AbuseFilterCachingParser by default (duration: 01m 01s)
  • 17:15 elukey: depooled mw1239.eqiad.wmnet to allow hw investigation (T148421) (was done today but wasn't logged properly)
  • 16:36 bblack: depooling cp3009 (esams cache_misc), possible HW issues
  • 15:58 Mark: Issuing secure erase on cp3021 sdb
  • 14:17 akosiaris@puppetmaster1001:conftool action : set/pooled=yes; selector: scb1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=graphoid'])
  • 14:14 mobrovac: [done] scb expanding and deploying services to scb[12]00[1234] change-prop citoid cxserver graphoid mathoid mobileapps
  • 14:02 chasemp: bdsync tools from labstore1001 to labstore2001
  • 14:00 mobrovac: scb expanding and deploying services to scb[12]00[1234] change-prop citoid cxserver graphoid mathoid mobileapps
  • 13:36 mafk: dereckson@mira Synchronized wmf-config/InitialiseSettings.php: Create a 'templateeditor' user group at en.wiktionary plus additional configuration (T148007) (duration: 02m 33s)
  • 13:31 bblack: upgrading nginx on ulsfo text+upload caches - T144523
  • 13:21 mobrovac: change-prop stopping instances on scb100[12] so that scb1003 picks up more load
  • 13:21 logmsgbot: dereckson@mira Synchronized wmf-config/InitialiseSettings.php: Create a new namespace "Príloha" for skwikt (T148563) (duration: 00m 50s)
  • 13:14 mobrovac: change-prop stopping instance on scb1004 so that scb1004 picks up more load
  • 13:14 logmsgbot: dereckson@mira Synchronized wmf-config/InitialiseSettings.php: Create a 'templateeditor' user group at en.wiktionary plus additional configuration (T148007) (duration: 02m 33s)
  • 13:11 ema: stopping varnishlog service on v4 cp hosts and removing log file
  • 12:52 bblack: upgrading nginx on codfw text+upload caches - T144523
  • 12:24 marostegui: Stopping db2055 to clone another host - T146261
  • 12:17 akosiaris: update cr{1,2}-eqiad configuration to add tegmen+einsteinium
  • 12:01 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: scb2004.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=ores'])
  • 12:01 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: scb2003.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=ores'])
  • 12:00 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: scb1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 12:00 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: scb1003.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 11:55 marostegui: Deploying schema change db2055 - S1 enwiki.change_tag - T147166
  • 11:35 bblack: upgrading nginx on cp2002 (codfw upload canary) - T144523
  • 11:30 bblack: upgrading nginx on cp2001 (codfw text canary) - T144523
  • 10:27 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=no; selector: scb1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mobileapps'])
  • 10:27 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=no; selector: scb1003.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mobileapps'])
  • 10:05 _joe_: converting owner of files for l10nupdate usermod on tin
  • 10:02 _joe_: ran usermod -u 10002 l10nupdate on tin
  • 09:50 logmsgbot: marostegui@mira Synchronized wmf-config/db-eqiad.php: Repool db1064 after finishing the ALTER table - T147305 (duration: 01m 08s)
  • 09:22 moritzm: installing rsyslog bugfix updates
  • 09:20 marostegui: Deploying schema change on db1069 S4 instance commonswiki revision table - T147305
  • 08:15 marostegui: Stopping db2062.codfw.wmnet to use it to clone another server - T146261
  • 08:14 moritzm: installing tor security update on radium
  • 08:12 moritzm: installing quagga security updates
  • 07:31 _joe_: disabled profiling on mw1189, hhvm keeps crashing
  • 06:50 _joe_: installing jemalloc with memory profiling enabled on mw1189

2016-10-18

  • 23:04 Dereckson: This full scap pulled three changes of the EU SWAT: gerrit:316069 TimedMediaHandler, gerrit:316585 MobileFrontend, gerrit:315901 ULS
  • 23:03 logmsgbot: demon@mira Finished scap: bringing full cluster back into sync (duration: 25m 13s)
  • 22:38 logmsgbot: demon@mira Started scap: bringing full cluster back into sync
  • 22:28 logmsgbot: demon@mira Synchronized README: Bringing co-masters back in sync (duration: 13m 10s)
  • 21:37 mutante: added Dpatrick to WMF LDAP group
  • 18:32 logmsgbot: dereckson@mira Synchronized wmf-config/LabsServices.php: Elastic@deployment-prep: Remove deployment-elastic08 from the cluster (no-op in prod, labs only) (duration: 00m 47s)
  • 18:30 logmsgbot: dereckson@mira Synchronized wmf-config/CirrusSearch-labs.php: Elastic@deployment-prep: force the number of replicas to 1 max (no-op in prod, labs only) (duration: 01m 18s)
  • 17:55 dcausse: warming up elastic@codfw from wasat.codfw.wmnet
  • 17:34 jynus: stopping mysql, cloning db1064->db1053; upgrading
  • 17:01 bblack: upgrading nginx on cache_maps - T144523
  • 16:57 ejegg: updated payments-wiki from b4ad60e to 27b464f
  • 15:47 godog: eqiad-prod: ms-be1022 to weight 3000 T136631
  • 15:16 andrewbogott: upgrading puppetmaster on labtestcontrol2001 to trusty/3.8.5
  • 15:06 bblack: upgrading nginx on all remaining cache_misc (eqiad, esams) - T144523
  • 14:54 bblack: upgrading nginx on all cache_misc @ codfw - T144523
  • 14:54 chasemp: rsync tools from labstore1001 to labstore1004
  • 14:43 bblack: upgrading nginx on all cache_misc @ ulsfo - T144523
  • 14:40 marostegui: Shutting down es2015 for hardware maintenance - T147769
  • 14:21 bblack: upgrading nginx on cp4001 (cache_misc ulsfo) as prod canary
  • 14:18 bblack: uploading nginx-1.11.4+wmf3 to carbon jessie-wikimedia - T144523
  • 13:58 jynus: restarting and upgrading db2049 and es2019 to test new config
  • 13:53 jynus: applying new init.d script on all mariadb 10 servers
  • 12:52 elukey: mw1169 back in service after reimage (MW Jobrunner)
  • 11:55 elukey: removed /etc/mysql/conf.d/research-client.cnf from stat1002 (root:root perms, not supposed to be there but only on stat1003)
  • 11:37 elukey: reimaging mw1169 to Debian Jessie (MW Jobrunner)
  • 10:40 elukey: mw1168.eqiad.wmnet back in service after reimage (MW Jobrunner)
  • 09:29 elukey: reimaging mw1168 to Debian Jessie (MW Jobrunner)
  • 09:25 elukey: varnishkafka restarting in upload/misc/maps with new settings (https://gerrit.wikimedia.org/r/316306)
  • 09:18 gehel: upgrade nodejs to 4.6.0 on maps2* servers
  • 08:56 moritzm: reimaging tin to jessie
  • 08:53 marostegui: Deploying ALTER table on S4 commonswiki (db1064 — last host) - T147305
  • 08:42 jynus: clone db1052 -> db1053, will perform maintenance (db restarts, reboots on both) at the same time
  • 07:57 logmsgbot: marostegui@mira Synchronized wmf-config/db-eqiad.php: Depool db1064 as it needs an ALTER table and pool db1068 temporarily to serve vslow and dump service - T147305 (duration: 02m 53s)
  • 03:19 mutante: restarted grrrit-wm
  • 03:18 mutante: gerrit has logs now in /var/log/gerrit/
  • 03:15 mutante: restarting gerrit for logging config change
  • 02:37 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Oct 18 02:37:01 UTC 2016 (duration 5m 49s)
  • 02:31 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.22) (duration: 10m 20s)
  • 00:48 bblack: restarting API hhvms with >40% mem usage via salt every 10 minutes in a loop from here forward. screen session on neodymium, named api-hhvm-restarts
  • 00:39 mutante: restarted hhvm on mw1281 (was at 47.7% usage)
  • 00:31 bblack: restarting hhvm on API nodes where it's using >30% mem
  • 00:22 bblack: restarting hhvm on *API* nodes where it's using >50% mem
  • 00:22 bblack: restarting hhvm on nodes where it's using >50% mem
  • 00:05 mutante: restarted hhvm on mw1194,mw1197,mw1198

2016-10-17

  • 23:27 Pchelolo: running import deletions script on restbase1007
  • 22:26 mutante: restarted gerrit on cobalt
  • 22:07 Pchelolo: running restriction import script on restbase1007
  • 20:59 mutante: tegmen - stopped duplicate icinga-wm (ircecho)
  • 20:53 mutante: maintenance servers, terbium and wasat, now have IPv6 connectivity
  • 20:33 bearND: deployed mobileapps 13fa4b4
  • 20:32 Krenair: updated status.wm.o apache config on wikitech-static box to correctly serve static assets again (T148438)
  • 20:30 bearND: starting mobileapps deploy
  • 19:31 cwd: disabled all dedupe jobs besides "contacts"
  • 18:38 gehel: deploying latest gui and binaries for wdqs
  • 18:35 Jeff_Green: switch payments-listener back to eqiad
  • 18:17 _joe_: dumping core on mw1194
  • 17:32 Jeff_Green: switch payments-listener to codfw
  • 17:20 _joe_: restarting lvs on lvs1003/1006 for the api change
  • 16:42 ottomata: restarting hadoop nodemanagers 1 at a time
  • 16:18 ori: Restarted HHVM on API cluster EQIAD
  • 15:33 ottomata: rebooting analytics1030
  • 15:13 elukey: ran kafka preferred-replica-election to allow kafka1018 to be back as broker replica leader
  • 14:38 elukey: mw1167 back in service after reimage (MW Jobrunner)
  • 14:30 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.22/extensions/EducationProgram/includes/Events/EditEventCreator.php: Id02366ef: Fix-up for Ia3d767e86 (duration: 00m 52s)
  • 14:06 logmsgbot: ori@tin Synchronized wmf-config/InitialiseSettings.php: I8562f8e1: Enable AbuseFilterCachingParser on metawiki and commonswiki (duration: 00m 56s)
  • 13:06 elukey: reimage mw1167 to Debian (MW Jobrunner)
  • 12:31 marostegui: Stopping MySQL db2055 (S1-codfw) to import S1 to dbstore2001 - T146261
  • 11:39 akosiaris: T148830 poweroff sca1001, sca1002, sca2001, sca2002
  • 11:38 jynus: stopping db1048 for general upgrade & reconfiguration
  • 10:57 godog: deploy thumbor 0.1.28 to thumbor100[12]
  • 10:38 moritzm: uploaded openssl 1.1.0b1+wmf1 for jessie-wikimedia to carbon (patched to be co-installable with our default 1.0.2 packages, build against libssl11-dev to use openssl 1.1)
  • 10:31 mobrovac: citoid deploying df4c92e
  • 10:04 mobrovac: mathoid deploying 52f345b
  • 09:51 akosiaris: T148380 disable puppet on sca1001, sca1002, deactivate them on puppetmasters
  • 09:02 godog: reset power on ms-be1025 - off and no logs to be found on ilo
  • 08:54 jynus: stopping, upgrading and restarting es2014
  • 08:16 _joe_: restarting hhvm on mw1175, stuck in HPHP::FastCGISession::blockingWriteStdOut after OOM
  • 08:15 elukey: upgrading nodejs on aqs100[56]
  • 08:10 jynus: disabling notifications of es2014 before it pages
  • 07:49 marostegui: Stopping MySQL in db2057.codfw.wmnet to use it to clone another server
  • 07:15 marostegui: Dropping memory tables hitcounter, _counters from S7 - T132837
  • 02:26 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Oct 17 02:26:33 UTC 2016 (duration 4m 56s)
  • 02:21 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.22) (duration: 07m 34s)

2016-10-16

  • 20:58 Amir1: ladsgroup@scb[12]00[12]: sudo service celery-ores-worker restart
  • 14:05 Amir1: mwscript resetUserEmail.php --wiki=fawiki Ebrambot <email removed>
  • 10:36 _joe_: restarting hhvm on mw120[0-8]
  • 02:26 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Oct 16 02:26:45 UTC 2016 (duration 5m 41s)
  • 02:21 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.22) (duration: 06m 50s)

2016-10-15

  • 06:32 logmsgbot: tstarling@mira Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 02m 30s)
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Oct 15 02:31:13 UTC 2016 (duration 5m 37s)
  • 02:25 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.22) (duration: 05m 41s)

2016-10-14

  • 21:49 yurik: deployed & restarted kartotherian, all's good now
  • 21:47 yurik: about to sync kartotherian fix
  • 21:47 ejegg: updated SmashPig from 2306010 to 961fc4c
  • 21:43 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings-labs.php: https://gerrit.wikimedia.org/r/#/c/316004/ - no-op here, only labs reads this file. just keeping it in sync (duration: 02m 07s)
  • 19:51 Jeff_Green: flip payments-listener back to eqiad
  • 19:43 matt_flaschen: Manual DB update for https://www.wikidata.org/wiki/User_talk:Doror and https://fr.wikipedia.org/wiki/Discussion_utilisateur:Robur15 . T148057
  • 18:08 Jeff_Green: flip payments-listener service from eqiad to codfw
  • 17:46 ejegg: enabled pending queue consumer
  • 17:42 ejegg: disabled pending queue consumer
  • 17:34 ejegg: updated SmashPig from 3c3d115 to 2306010
  • 15:52 Jeff_Green: authdns-update to add payments-listener-codfw A record
  • 12:48 dcausse: reindexing top 10 wikipedias with BM25 on elastic@codfw from terbium (logs in ~dcausse/bm25_reindex/cirrus_log/) (T147508)
  • 12:36 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=no; selector: sca1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=sca', 'service=zotero'])
  • 12:36 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=no; selector: sca1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=sca', 'service=zotero'])
  • 12:36 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=no; selector: sca2002.codfw.wmnet (tags: ['dc=codfw', 'cluster=sca', 'service=zotero'])
  • 12:36 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=no; selector: sca2001.codfw.wmnet (tags: ['dc=codfw', 'cluster=sca', 'service=zotero'])
  • 11:20 mobrovac: change-prop deploying 6dbdaa1
  • 11:18 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: sca2003.codfw.wmnet (tags: ['dc=codfw', 'cluster=sca', 'service=zotero'])
  • 11:17 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: sca2004.codfw.wmnet (tags: ['dc=codfw', 'cluster=sca', 'service=zotero'])
  • 11:17 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: sca1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=sca', 'service=zotero'])
  • 11:17 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: sca1003.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=sca', 'service=zotero'])
  • 11:02 marostegui: Stopping MySQL db2055 (S1-codfw) to import S1 to dbstore2001 - T146261
  • 11:00 elukey: mw1166 back in service after reimage (MW Jobrunner)
  • 10:28 jynus: stopping and restarting mysql at dbstore2001 for misc tests T146261
  • 09:23 elukey: reimaging mw1166 to Debian Jessie (MW Jobrunner)
  • 08:59 elukey: mw1161 back in service after reimage (MW Jobrunner, scap proxy)
  • 07:47 elukey: reimaging mw1161 to Debian Jessie (MW Jobrunner, scap proxy)
  • 07:17 marostegui: Dropping hitcounter, _counter memory tables in S7 on db1041 (master) - T132837
  • 07:13 moritzm: upgrading hhvm in codfw to latest 3.12.x bugfix release
  • 02:39 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Oct 14 02:39:27 UTC 2016 (duration 6m 20s)
  • 02:36 Ezarate: mdesploy@eswikinews
  • 02:33 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.22) (duration: 12m 03s)
  • 00:17 logmsgbot: reedy@mira Synchronized php-1.28.0-wmf.22/extensions/PageAssessments: [extensions/PageAssessments] Only update assessment data when talk pages are saved (duration: 00m 51s)

2016-10-13

  • 23:59 logmsgbot: reedy@mira rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.22 take 3
  • 23:50 logmsgbot: reedy@mira Synchronized php-1.28.0-wmf.22/extensions/ZeroPortal: Revert extension registration (duration: 00m 49s)
  • 23:49 logmsgbot: reedy@mira Synchronized php-1.28.0-wmf.22/extensions/ZeroBanner: Revert extension registration (duration: 00m 50s)
  • 23:47 logmsgbot: reedy@mira Synchronized wmf-config/mobile.php: Back to pre deploy state (duration: 00m 49s)
  • 23:19 logmsgbot: reedy@mira rebuilt wikiversions.php and synchronized wikiversions files: all wikis back to .21
  • 23:08 logmsgbot: reedy@mira Synchronized wmf-config/mobile.php: Only set remote config if not zerowiki (duration: 01m 15s)
  • 22:53 logmsgbot: reedy@mira Synchronized wmf-config/mobile.php: Revert my hack (duration: 00m 49s)
  • 22:47 logmsgbot: reedy@mira Synchronized php-1.28.0-wmf.22/extensions/JsonConfig/: array_merge_recursive (duration: 00m 50s)
  • 22:41 ejegg: updated SmashPig from e89f1b5 to 3c3d115
  • 22:30 logmsgbot: reedy@mira Synchronized php-1.28.0-wmf.22/extensions/JsonConfig/: less array + array more array_merge (duration: 00m 57s)
  • 22:01 logmsgbot: reedy@mira rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.22 take 2
  • 21:54 logmsgbot: reedy@mira Synchronized wmf-config/mobile.php: Load wgJsonConfigs in callback (duration: 00m 56s)
  • 21:42 mutante: gerrit is back
  • 21:40 mutante: gerrit is restarting for config change 315571
  • 21:20 bblack: updating nodejs on ocg1003
  • 21:15 bblack: updating nodejs on ocg1002
  • 21:05 bblack: attempting nodejs upgrade on ocg1001
  • 20:26 matt_flaschen: Ran manual DB updates for T148057.
  • 20:17 logmsgbot: thcipriani@mira rebuilt wikiversions.php and synchronized wikiversions files: Rollback 1.28.0-wmf.22 from group2
  • 20:12 logmsgbot: thcipriani@mira rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.22
  • 19:57 logmsgbot: ori@tin Synchronized wmf-config/InitialiseSettings.php: I8f6eb9f6af: Enable AbuseFilterCachingParser on testwiki and mediawikiwiki (duration: 00m 51s)
  • 19:31 yurik: deployed and restarted tilerator[ui] https://gerrit.wikimedia.org/r/#/c/315707/
  • 19:04 jynus: updating dns for labsdb1002 and m3-slave
  • 18:49 jynus: setting phabricator db in read only mode for master failover
  • 18:49 mobrovac: restbase deploy end of d510090
  • 18:44 mutante: contint1001 - systemctl mask jenkins.service
  • 18:43 jynus: setting up circular replication db1043 <-> db1048
  • 18:39 mutante: contint1001 - stop jenkins service
  • 18:32 mobrovac: restbase deploy start of d510090
  • 18:10 logmsgbot: thcipriani@mira Synchronized php-1.28.0-wmf.22/extensions/EventBus/EventBus.hooks.php: SWAT: Do not set the performer property if the user is not available. (T147977) (duration: 01m 38s)
  • 18:09 twentyafterfour: deployed https://phabricator.wikimedia.org/D413 on iridium and restarted apache
  • 17:46 bblack: forced ocsp stapling update on all caches, just in case
  • 17:33 bblack: pushing new intermediate to caches - T148045
  • 17:21 cmjohnson1: powering off mw1001-1148 to be decommissioned (except mw1017 and mw1099) per T141522
  • 17:20 yurik: deployed kartotherian https://gerrit.wikimedia.org/r/#/c/315701/
  • 17:17 bblack: disabling puppet on all cache nodes
  • 15:37 logmsgbot: marostegui@mira Synchronized wmf-config/db-eqiad.php: wmf-config/db-codfw.php db1053 got moved to another rack so updating its IP - T147774 (duration: 00m 50s)
  • 15:31 moritzm: installing libdbd-mysql-perl security updates
  • 15:26 urandom: T133395: RESTBase: Altering keyspace local_group_wikimedia_T_parsoid_html.data to enable time-window compaction
  • 15:18 logmsgbot: marostegui@mira Synchronized wmf-config/db-eqiad.php: Repool db1068 after it was out for an ALTER table - T147305 (duration: 00m 58s)
  • 14:45 moritzm: installing nspr security updates
  • 14:32 marostegui: Shutting down MySQL in db1053, it is going to be moved to another rack - T147774
  • 14:18 cwd|afk: updated smashpig from 28ba033 to e89f1b5
  • 13:22 hashar: European SWAT completed
  • 13:20 moritzm: rolling restart of restbase in eqiad to pick up new nodejs
  • 13:19 logmsgbot: hashar@mira Synchronized php-1.28.0-wmf.22/includes/api/ApiPurge.php: ApiPurge: Set the triggering user for the LinksUpdate T147516 T147977 (duration: 00m 52s)
  • 13:09 logmsgbot: hashar@mira Synchronized wmf-config/InitialiseSettings.php: Adding language name configuration for Wikidata T146707 (duration: 00m 53s)
  • 12:53 marostegui: Dropping hitcounter, _counter memory tables in S6 (frwiki jawiki ruwiki) - T132837
  • 11:14 elukey: mw1165 (MW Jobrunner) back in service after reimage
  • 10:31 hoo: Ran (updated) T132839-Workarounds.sh from my home in terbium
  • 09:54 marostegui: Deploying schema change on commonswiki.revision - db1068 - T147305
  • 09:49 elukey: reimaging mw1165 to Debian Jessie (MW Jobrunner)
  • 09:45 logmsgbot: marostegui@mira Synchronized wmf-config/db-eqiad.php: Depool db1068 for an ALTER table - T147305 (duration: 04m 58s)
  • 09:15 marostegui: Stopping MySQL in db2057.codfw.wmnet to use it to clone another server
  • 09:09 moritzm: reimaging wasat to jessie
  • 08:35 elukey: restarting aqs on aqs1004 to pick up the new nodejs package
  • 08:31 moritzm: updating app server canaries to new hhvm package
  • 07:25 marostegui: Dropping hitcounter, _counter memory tables in S6 (frwiki jawiki ruwiki) on db1050 (master) - T132837
  • 07:14 marostegui: Dropping hitcounter, _counter memory tables in S5 (dewiki, wikidatawiki) - T132837
  • 06:52 moritzm: installing ghostscript security updates
  • 03:01 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Oct 13 03:01:31 UTC 2016 (duration 6m 42s)
  • 02:54 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.22) (duration: 12m 47s)
  • 02:25 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.21) (duration: 08m 24s)

2016-10-12

  • 23:41 ejegg: updated civicrm settings to revision a7c7919
  • 23:38 logmsgbot: maxsem@mira Synchronized php-1.28.0-wmf.22/extensions/Kartographer/: https://gerrit.wikimedia.org/r/#/c/315603/ (duration: 01m 03s)
  • 23:08 logmsgbot: dereckson@mira Synchronized wmf-config/InitialiseSettings.php: Raise abuse filter emergency threshold for es.wikibooks (T145765) (duration: 01m 19s)
  • 21:47 ejegg: updated SmashPig from ac6a0f0 to 28ba033
  • 21:29 ejegg: updated SmashPig from 00772cd to ac6a0f0
  • 21:20 ejegg: updated SmashPig from 94c7f0d to 00772cd
  • 21:03 ejegg: disabled pending queue consumer
  • 20:57 ejegg: updated SmashPig on all hosts from fa0267b to 94c7f0d
  • 20:48 ejegg: updated SmashPig on listener host from fa0267b to 94c7f0d
  • 19:27 logmsgbot: thcipriani@mira rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.22
  • 19:14 cmjohnson1: disconnecting production cable from old pay-lvs1002 (replaced with new) T147932
  • 18:39 logmsgbot: ebernhardson@mira Synchronized wmf-config/CirrusSearch-common.php: SWAT T66829 Prefer articles in a users language on multilingual wikis (duration: 00m 51s)
  • 18:37 logmsgbot: ebernhardson@mira Synchronized wmf-config/InitialiseSettings.php: SWAT T66829 Prefer articles in a users language on multilingual wikis (duration: 00m 50s)
  • 18:28 logmsgbot: ebernhardson@mira Synchronized wmf-config/CirrusSearch-common.php: SWAT wgCirrusSimilarityProfile -> wgCirrusSearchSimilarityProfile (duration: 00m 53s)
  • 18:27 logmsgbot: ebernhardson@mira Synchronized wmf-config/InitialiseSettings.php: SWAT wgCirrusSimilarityProfile -> wgCirrusSearchSimilarityProfile (duration: 00m 49s)
  • 18:19 ejegg: updated fundraising tools from cbf4dcd to f6d200d
  • 18:17 logmsgbot: ebernhardson@mira Synchronized wmf-config/InitialiseSettings.php: SWAT T138310 Re-enable Flow beta feature on frwikiquote (duration: 00m 50s)
  • 18:13 logmsgbot: ebernhardson@mira Synchronized php-1.28.0-wmf.22/extensions/EventBus/EventBus.hooks.php: SWAT Dont set added/removed properties if they are empty (duration: 00m 52s)
  • 18:10 logmsgbot: ebernhardson@mira Synchronized wmf-config/CirrusSearch-common.php: SWAT T147508 Activate BM25 on top 10 wikis: Step 1 (duration: 00m 50s)
  • 18:08 logmsgbot: ebernhardson@mira Synchronized wmf-config/InitialiseSettings.php: SWAT cirrussearch config updates (duration: 01m 10s)
  • 17:36 urandom: T133395: RESTBase: Altering keyspace local_group_wiktionary_T_parsoid_html.data to enable time-window compaction
  • 17:10 robh: gerrit: system rebooted (cobalt) to enable HT, system back online as of a few minutes ago
  • 17:00 ostriches: gerrit: stopping momentarily for system reboot
  • 15:51 urandom: T133395: Restarting Cassandra instances in RESTBase, eqiad, rack 'd'
  • 15:37 gehel: upgrading nodejs to 4.6.0 on maps1* servers
  • 15:10 urandom: T133395: Restarting Cassandra instances in RESTBase, eqiad, rack 'b'
  • 14:51 urandom: T133395: Restarting Cassandra instances on restbase1011.eqiad.wmnet
  • 14:47 bblack: traffic cache nginxes: seamless upgrade-restart for new openssl lib
  • 14:45 elukey: uploaded zuul 2.5.0-8-gcbc7f62-wmf4precise1 to precise-wikimedia/third-party (T145057)
  • 14:38 elukey: uploaded zuul 2.5.0-8-gcbc7f62-wmf4jessie1 to jessie-wikimedia/third-party (T145057)
  • 14:35 hashar: zuul-merger on scandium restarted. CI is resumed.
  • 14:34 urandom: T133395: Restarting Cassandra instances on restbase1010.eqiad.wmnet
  • 14:31 akosiaris: disable puppet on neon. Merging https://gerrit.wikimedia.org/r/315510
  • 14:28 elukey: install zuul_2.5.0-8-gcbc7f62-wmf4jessie1_amd64.deb on scandium - T145057
  • 14:27 hashar: stopped zuul-merger on scandium pausing CI as a result. Snipe upgrade going on
  • 14:22 urandom: T133395: Restarting Cassandra instances on restbase1007.eqiad.wmnet
  • 14:04 urandom: T133395: Restarting Cassandra instances on restbase2009.codfw.wmnet
  • 14:02 moritzm: installing imagemagick security updates
  • 14:01 bblack: upgrading openssl + confctl on cp*
  • 13:58 hashar: Upgrading Zuul zuul_2.5.0-8-gcbc7f62 wmf3..wmf4
  • 13:52 moritzm: restart restbase on restbase1007 to pick up new nodejs
  • 13:27 moritzm: rolling restart of restbase in codfw to pick up new nodejs
  • 13:23 hashar: European SWAT completed.
  • 13:21 logmsgbot: hashar@mira Synchronized wmf-config/abusefilter.php: Send abusefilter hit notifications from es.wikibooks to UDP T147744 (duration: 00m 52s)
  • 13:14 logmsgbot: hashar@mira Synchronized wmf-config/InitialiseSettings.php: Create 'massmessage-sender' group for tr.wikipedia T147740 (duration: 03m 42s)
  • 12:23 elukey: mw1164 back in service (MW Jobrunner)
  • 11:15 elukey: reimaging mw1164 to Debian Jessie (MW Jobrunner)
  • 10:10 godog: upgrade grafana on krypton to 3.1.1-1470047149 T146354
  • 09:20 kart_: Update cxserver to da7d4f6 (T146731)
  • 09:09 moritzm: upgrading nodejs on etherpad1001
  • 08:56 elukey: mw1163 (MW Jobrunner) back in service after the reimage
  • 08:53 moritzm: upgrading nodejs on ruthenium
  • 08:34 moritzm: installing c-ares security updates
  • 07:38 elukey: reimaging mw1163 to Debian (MW Jobrunner)
  • 06:42 moritzm: reimaging mw1099 (test application server) to jessie
  • 03:14 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Oct 12 03:13:59 UTC 2016 (duration 7m 12s)
  • 03:06 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.22) (duration: 12m 50s)
  • 02:36 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.21) (duration: 12m 50s)
  • 00:02 logmsgbot: ebernhardson@mira Synchronized php-1.28.0-wmf.22/extensions/CirrusSearch/: SWAT CirrusSearch Add completion support to ClusterOverride, Remove position_increment_gap on source_text.trigram (duration: 00m 58s)
  • 00:00 ebernhardson: pulled cirrus changes (315440, 315441) to mw1099

2016-10-11

  • 23:54 logmsgbot: ebernhardson@mira Synchronized php-1.28.0-wmf.22/includes/ForkController.php: SWAT T147881 Call destroy method that actually exists instead of one that doesnt anymore. (duration: 00m 52s)
  • 23:52 logmsgbot: ebernhardson@mira Synchronized php-1.28.0-wmf.22/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.DesktopArticleTarget.init.js: SWAT T147890 Only enable VE tabs if VE is available (duration: 00m 50s)
  • 23:48 ebernhardson: pulled ve update (315424) to mw1099
  • 23:41 logmsgbot: ebernhardson@mira Synchronized php-1.28.0-wmf.21/extensions/CentralAuth/: SWAT T147029 Add ignorestatus option for fixing stuck renames (duration: 00m 53s)
  • 23:34 logmsgbot: ebernhardson@mira Synchronized w/robots.php: SWAT robots.php: Use WikiPage instead of Article class (duration: 00m 50s)
  • 23:31 ebernhardson: pulled config change (314790) to mw1099
  • 23:30 logmsgbot: ebernhardson@mira Synchronized wmf-config/InitialiseSettings.php: SWAT T143829 Disable bottom language button in Minerva (duration: 00m 50s)
  • 23:29 ebernhardson: pulled config change (315450) to mw1099
  • 23:28 ebernhardson: pulled config change (315314) to mw1099
  • 23:24 ebernhardson: pulled config change (315314) to mw1099
  • 23:23 ejegg|afk: updated fundraising tools from 206799d to cbf4dcd
  • 23:23 logmsgbot: ebernhardson@mira Synchronized php-1.28.0-wmf.21/extensions/MobileFrontend/includes/skins/SkinMinerva.php: SWAT: Fix logic of MinervaBottomLanguageButton T143829 (duration: 00m 50s)
  • 23:22 logmsgbot: ebernhardson@mira Synchronized php-1.28.0-wmf.22/extensions/MobileFrontend/includes/skins/SkinMinerva.php: SWAT: Fix logic of MinervaBottomLanguageButton T143829 (duration: 00m 50s)
  • 23:18 ebernhardson: pulled MobileFront update to mw1099
  • 23:17 logmsgbot: ebernhardson@mira Synchronized wmf-config/CirrusSearch-common.php: Set defaults for wgCirrusSearchClusterOverrides (duration: 00m 56s)
  • 23:15 logmsgbot: ebernhardson@mira Synchronized wmf-config/InitialiseSettings.php: Set defaults for wgCirrusSearchClusterOverrides (duration: 00m 53s)
  • 23:12 ebernhardson: pulled config change to mw1099
  • 23:11 mutante: lead - revoke puppet cert, node clean
  • 22:45 ejegg: disabled donation queue consumer
  • 22:44 ejegg: enabled donation queue consumer
  • 22:43 ejegg: updated CiviCRM from 17fab4d to 8682821
  • 22:07 ejegg: disabled donations queue consumer
  • 22:04 ejegg: updated fundraising tools from 112b3fa to 206799d
  • 21:53 ejegg: updated fundraising tools from cc76283 to 112b3fa
  • 21:44 ejegg: disabled paypal audit parser
  • 21:16 urandom: T133395: Restarting Cassandra instances on restbase2006.codfw.wmnet
  • 21:14 moritzm: repooling all services on scb1001 after earlier revert to nodejs 4.4.6
  • 21:10 urandom: T133395: Restart Cassandra on restbase2005-c.codfw.wmnet
  • 21:06 urandom: T133395: Restart Cassandra on restbase2005-b.codfw.wmnet
  • 21:01 Jeff_Green: replaced pay-lvs1001
  • 20:58 urandom: T133395: Restart Cassandra on restbase2005-a.codfw.wmnet
  • 20:57 ejegg: enabled PayPal audit parsing job
  • 20:55 ejegg: updated fundraising tools from 6e36fd5 to cc76283
  • 20:33 ottomata: repooled scb1001 for mobileapps
  • 20:06 Pchelolo: repooling mobileapps on scb1001
  • 20:03 urandom: T133395: Restarting Cassandra: restbase2008-c
  • 20:01 logmsgbot: thcipriani@mira rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.28.0-wmf.22
  • 19:55 urandom: restarting restbase in codfw
  • 19:51 logmsgbot: thcipriani@mira Finished scap: testwiki to 1.28.0-wmf.22 and rebuild l10n cache (duration: 45m 28s)
  • 19:06 logmsgbot: thcipriani@mira Started scap: testwiki to 1.28.0-wmf.22 and rebuild l10n cache
  • 18:59 urandom: restarting restbase: restbase2004.codfw.wmnet
  • 18:55 elukey: kafka1018 back in service after maintenance
  • 18:40 logmsgbot: thcipriani@mira Synchronized wmf-config/CommonSettings.php: SWAT: Enable $wgPageTriageNoIndexUnreviewedNewArticles on all wikis that have PageTriage (T147544) PART II (duration: 00m 52s)
  • 18:33 urandom: T133395: Restarting Cassandra in RESTBase (codfw) to apply https://gerrit.wikimedia.org/r/314603
  • 18:26 urandom: T133395: Starting dumps (3) in RESTBase Staging
  • 18:20 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable $wgPageTriageNoIndexUnreviewedNewArticles on all wikis that have PageTriage (T147544) PART I (duration: 00m 50s)
  • 18:18 mutante: cobalt (new gerrit) run reviewer-count cron, works now
  • 18:17 mutante: lead (old gerrit) manually remove reviewer-count cron, puppet is disabled
  • 18:13 urandom: T133395: Restarting Cassandra in RESTBase Staging to apply https://gerrit.wikimedia.org/r/314603
  • 18:13 logmsgbot: thcipriani@mira Synchronized dblists/visualeditor-nondefault.dblist: SWAT: Enable the visual editor for logged-in users on remaining phase 6 Wikipedias (T142589) PART II (duration: 00m 59s)
  • 18:11 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable the visual editor for logged-in users on remaining phase 6 Wikipedias (T142589) PART I (duration: 01m 56s)
  • 18:10 twentyafterfour: updated php on iridium
  • 18:09 urandom: T133395: Restarting xenon.eqiad.wmnet to apply https://gerrit.wikimedia.org/r/314603
  • 17:51 bearND: deployed mobileapps fc900fc
  • 17:29 bearND: starting mobileapps deploy
  • 17:25 Amir1: ladsgroup@terbium:~$ mwscript extensions/ORES/maintenance/CleanDuplicateScores.php on eight wikis (T145356)
  • 17:20 mutante: mw1272 kernel: [10254957.470558] BUG: Bad page map in process hhvm
  • 17:16 elukey: rebooting kafka1018
  • 17:14 mutante: mw1272 reboot
  • 17:05 thcipriani: starting branch cut for 1.28.0-wmf.22
  • 14:58 hashar: Upgrading Zuul on gallium 2.5.0-8-gcbc7f62-wmf2precise1 2.5.0-8-gcbc7f62-wmf3precise1 (merely a noop for zuul scheduler) T147070
  • 14:57 hashar: Upgrading Zuul on gallium 2.5.0-8-gcbc7f62-wmf2precise1 2.5.0-8-gcbc7f62-wmf3precise1 (merely a noop for zuul scheduler)
  • 14:28 elukey: upgraded zuul on scandium (T147073)
  • 14:20 urandom: T133395: Restarting xenon.eqiad.wmnet to apply https://gerrit.wikimedia.org/r/314603
  • 13:56 hashar: European SWAT is done.
  • 13:33 hashar: mira: purging portals URLs for jan_drewniak_ : cat /srv/mediawiki-staging/portals/urls-to-purge.txt | mwscript purgeList.php
  • 13:14 logmsgbot: hashar@mira Synchronized portals: (no message) (duration: 01m 01s)
  • 13:13 logmsgbot: hashar@mira Synchronized portals/prod/wikipedia.org/assets: (no message) (duration: 01m 46s)
  • 12:47 Amir1: on terbium ^
  • 12:47 Amir1: mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=ptwiki --logwiki=metawiki "Zhyar Merlin" "Zhiar Merlin"
  • 12:44 elukey: restarted keyholder-proxy on mira
  • 12:39 moritzm: nodejs reverted to 4.4.6 on scb1001, depooling for service restarts
  • 12:30 elukey: rearming the keyholder on mira
  • 12:27 mobrovac: change-prop scb1001: disabled puppet to try and debug why change-prop master is failing on node v4.6.0
  • 12:14 moritzm: upgrading nodejs on scb2001 to 4.6.0
  • 11:38 elukey: decommissioning the old AQS cluster - aqs100[123] for good https://gerrit.wikimedia.org/r/#/c/314542/
  • 11:37 moritzm: repooling scb1001
  • 11:20 jynus: stopping and starting mysql on labsdb1008 (not active) for new package/config testing
  • 11:18 elukey: reimaging mw1162.eqiad.wmnet to Debian (MW Jobrunner)
  • 11:14 moritzm: depooling scb1001 for service restarts
  • 11:11 moritzm: upgrading nodejs on scb1001 to 4.6.0
  • 11:00 logmsgbot: hashar@mira Synchronized README: testing deploy from mira (duration: 02m 38s)
  • 10:01 moritzm: switching deployment server to mira
  • 08:31 marostegui: Removing Not needed file from dbstore1001 to free up space (/srv/tmp/db1064.tar.gz.enc)
  • 07:56 moritzm: reimaging mw1017 to jessie (test application server in eqiad)
  • 06:50 moritzm: installing django security updates
  • 06:17 marostegui: Deploying schema change S4 commonswiki.revision - T147305
  • 02:38 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Oct 11 02:38:48 UTC 2016 (duration 4m 45s)
  • 02:34 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.21) (duration: 15m 28s)

2016-10-10

  • 21:29 logmsgbot: reedy@tin Synchronized wmf-config/InitialiseSettings.php: Add upload_by_url right to Commons bots (duration: 00m 50s)
  • 21:26 logmsgbot: reedy@tin Synchronized wmf-config/InitialiseSettings.php: Fix typo in group name. Add message-format logging group (duration: 00m 50s)
  • 21:20 logmsgbot: reedy@tin Synchronized wmf-config/CommonSettings.php: More wfLoadExtension, no config changes (duration: 00m 49s)
  • 21:12 logmsgbot: reedy@tin Synchronized wmf-config/CommonSettings.php: Remove some legacy cruft that is unused (duration: 00m 50s)
  • 21:08 logmsgbot: reedy@tin Synchronized wmf-config/: Re-enable OAuth on Wikitech T147804 (duration: 00m 52s)
  • 20:39 Reedy: Created up to date oauth tables on wikitech
  • 20:38 Reedy: Dropped oauth tables from wikitech
  • 20:36 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Allow Commons 'crats to manage accountcreator group (T144689) (duration: 00m 50s)
  • 20:16 Amir1: deploying 8bbd3ab to all ores nodes (T146680)
  • 20:09 Amir1: deploying 8bbd3ab to ores canary nodes (T146680)
  • 18:29 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.21/extensions/UploadWizard/resources/controller/uw.controller.Details.js: Don't show warning confirmation dialog when there are no warnings (T147659) (duration: 00m 48s)
  • 18:25 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.21/extensions/ORES/maintenance/CleanDuplicateScores.php: Fixup maintenance/CleanDuplicateScores.php (duration: 00m 54s)
  • 18:23 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.21/includes/Linker.php: Do not normalise external links to special pages (T147685) (duration: 01m 07s)
  • 18:22 logmsgbot: dereckson@tin scap aborted: file php-1.28.0-wmf.21/includes/Linker.php Do not normalise external links to special pages (T147685) (duration: 00m 03s)
  • 18:22 logmsgbot: dereckson@tin Started scap: file php-1.28.0-wmf.21/includes/Linker.php Do not normalise external links to special pages (T147685)
  • 17:20 gehel: upgraded maps1* to postgis 2.3.0 - T144763
  • 15:50 mobrovac: mathoid deploying adb8e548
  • 14:13 moritzm: upgraded PHP on bohrium/piwik.wikimedia.or
  • 14:03 gehel: reimage maps-test200[34] - T147194
  • 13:38 marostegui: Dropping hitcounter, _counter memory tables in S4 - db1040 (master) - T132837
  • 13:20 zeljkof: ending EU SWAT!
  • 13:15 logmsgbot: zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Activate subphrases autocomplete on wikisources, mw.org and wikitech (T146208) (duration: 00m 50s)
  • 13:12 gehel: reimage maps-test2002 - T147194
  • 12:47 moritzm: uploaded nodejs 4.6.0 for jessie-wikimedia to carbon
  • 12:44 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Restore original weight for db1082 after its RAID controller firmware - T145533 (duration: 00m 55s)
  • 12:06 logmsgbot: hoo@tin Synchronized php-1.28.0-wmf.21/extensions/Wikidata: Update Wikibase, add EntityHandler::supportsCategories (T147748) (duration: 02m 25s)
  • 11:52 godog: swift eqiad-prod: ms-be1022 to weight 2000 T136631
  • 11:33 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Increase weight for db1082 after its RAID controller firmware - T145533 (duration: 00m 49s)
  • 10:52 marostegui: Dropping hitcounter, _counter memory tables in S2 - db1069 - T132837
  • 10:42 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1082 with some small weight after its RAID controller firmware - T145533 (duration: 00m 50s)
  • 10:04 jynus: running ALTER TABLE search_documentfield ENGINE=InnoDB, FORCE; on phabricator db replica (db1043)
  • 09:39 marostegui: Dropping hitcounter, _counter memory tables in S2 - db1063 - T132837
  • 09:32 moritzm: rolling reboot of swift frontend servers in eqiad for kernel security update
  • 09:30 moritzm: pruning older, unused kernel images on labstore1003
  • 08:57 marostegui: Dropping hitcounter, _counter memory tables in S2 - dbstore1002 - T132837
  • 08:56 jynus: reboot db1043 to test new mysql configuration and general upgrade - proxy will complain
  • 08:53 marostegui: Dropping hitcounter, _counter memory tables in S2 - dbstore1001 - T132837
  • 08:53 godog: reboot graphite2001 and graphite1001 for trusty kernel upgrade
  • 08:40 hoo: Populated the sites/ site_identifiers tables on olowiki (T146614)
  • 08:38 marostegui: Dropping hitcounter, _counter memory tables in S2 - dbstore2002 - T132837
  • 08:12 akosiaris: clear mx1001's queues from backscatter spam T147173
  • 07:34 marostegui: Deploying schema change on S4 codfw only commonswiki.revision - T147305
  • 07:29 moritzm: installing php security updates on jessie systems
  • 06:15 marostegui: db1082: Upgrading RAID controller firmware
  • 06:13 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1082 to upgrade its RAID controller firmware - T145533 (duration: 00m 50s)
  • 05:42 jynus: resetting slave on es2 eqiad master (es1015)
  • 05:11 logmsgbot: jynus@tin Synchronized wmf-config/db-codfw.php: mariadb: Depool es2015 (master, crashed); replaced by es2016 (duration: 00m 49s)
  • 04:50 jynus: changing topology of es2 @ codfw
  • 02:30 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Oct 10 02:30:47 UTC 2016 (duration 4m 51s)
  • 02:25 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.21) (duration: 09m 40s)

2016-10-09

  • 09:07 elukey: chmod o+r /var/lib/varnish/frontend/_.vsm and /var/lib/varnish/cp2008/_.vsm on cp2008 to avoid gmond errors
  • 09:01 jynus: dropping unneeded files on db1026 to mitigate disk issues for the next week
  • 08:45 elukey: powercycling cp2008, no ssh and mgmt console frozen
  • 02:27 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Oct 9 02:27:12 UTC 2016 (duration 4m 36s)
  • 02:22 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.21) (duration: 08m 59s)

2016-10-08

  • 19:48 bd808: cp2008 Strongswan failures for both ipv4 and ipv6 across a large number of (all?) hosts
  • 09:40 elukey: masked the kafka systemd unit on kafka1018 and re-enabling puppet
  • 09:10 apergos: puppet disabled on kafka1018, leave broker down, bad disk /dev/sdi (see dmesg for sample errors)
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Oct 8 02:31:15 UTC 2016 (duration 6m 20s)
  • 02:24 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.21) (duration: 09m 09s)

2016-10-07

  • 23:16 Krenair: test morebots
  • 23:15 logmsgbot: thcipriani@tin Synchronized w/robots.php: Revert change to robots.php (duration: 00m 49s)
  • 22:22 mutante: etcd servers have puppet issue with Etcd_user[root]
  • 22:04 ejegg: updated payments-wiki from 1fbe171 to b4ad60e
  • 20:30 bblack: lead.wikimedia.org: replaced by cobalt functionally, please leave it untouched for now with puppet disabled!
  • 19:46 mutante: deleted old /var/lib/gerrit2/ data on cobalt, syncing from lead
  • 19:45 mutante: rsyncing /var/lib/gerrit2 from lead to cobalt
  • 19:30 mutante: removed gerrit IPs from cobalt interfaces
  • 19:29 mutante: disabled puppet on lead and cobalt
  • 19:21 mutante: re-enabling puppet on cobalt
  • 19:21 mutante: removed gerrit IP from lead's interface, v4 and v6
  • 19:09 mutante: rsyncing gerrit data one more time from lead to cobalt
  • 19:08 cmjohnson1: db1065 swapping failed disk slot 9 T147396
  • 19:07 ostriches: stopped puppet on lead
  • 19:07 mutante: stopping gerrit on lead
  • 19:02 mutante: cobalt, disabled puppet, removed service IP from interface
  • 17:28 mutante: rsyncing gerrit data from lead to cobalt
  • 16:53 jynus: testing img_metadata nuking for T145953 and T147015 (backups on neodymium)
  • 16:25 gehel: reimage maps-test2001 - T147194
  • 16:14 akosiaris: build python-irclib for jessie and upload it to apt.wikimedia.org jessie-wikimedia/main
  • 16:06 moritzm: updated hhvm package for jessie to 3.12.9
  • 14:41 moritzm: uploaded openssl 1.0.2j for jessie-wikimedia to carbon
  • 14:41 kart_: Update cxserver to fa2f715 (T147552)
  • 13:56 elukey: reimaging mw123[45] to Debian Jessie (last two api appservers)
  • 12:39 elukey: reimaging mw123[23] to Debian Jessie
  • 12:09 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1082 with its original weight - T145533 (duration: 00m 52s)
  • 11:17 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1082 with a bit less weight than usual to start with - T145533 (duration: 00m 55s)
  • 10:52 moritzm: reimaging mw1238, mw1239 to jessie
  • 10:28 elukey: reimaging mw123[01] to Debian Jessie
  • 10:27 elukey: mw122[89] back in live api server pool
  • 10:00 _joe_: updated conftool to 0.3.1 on all the cluster except caches, T147480
  • 09:48 _joe_: creating etcd100[1-6].eqiad.wmnet on ganeti, T147620
  • 09:32 moritzm: reimaging mw1220, mw1236, mw1237 to jessie
  • 09:28 moritzm: installing pillow/python-imaging security updates on Ubuntu systems
  • 09:20 gehel: reimaging maps-test2004 - T147194
  • 09:18 moritzm: installing php security updates on precise hosts
  • 08:53 gehel: reimaging maps-test2003 - T147194
  • 08:33 logmsgbot: oblivian@puppetmaster1001 conftool action : set/weight=20; selector: cluster=api_appserver,dc=eqiad,name=mw123.*
  • 08:31 logmsgbot: oblivian@puppetmaster1001 conftool action : set/weight=0; selector: cluster=api_appserver,dc=eqiad,name=mw123.*
  • 08:20 _joe_: restarting hhvm on a few api appservers, due to memory leaks (T146451)
  • 07:17 elukey: reimaging mw1228 and mw1229 (api appservers) to Debian Jessie
  • 06:32 moritzm: reimaging mw1216, mw1218, mw1219 to jessie
  • 06:31 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1082 to get its raid controller firmware upgraded - T145533 (duration: 00m 49s)
  • 06:10 _joe_: restarted hhvm, jobrunner on mw1161
  • 05:55 marostegui: Deploying schema change on S4 master commonswiki.revision table - T147113
  • 04:49 kart_: Update cxserver to 84fb704 (T147368)
  • 02:41 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Oct 7 02:41:54 UTC 2016 (duration 5m 43s)
  • 02:36 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.21) (duration: 16m 03s)
  • 00:54 Dereckson: https://olo.wikipedia.org has been successfully created (T146612).
  • 00:44 Dereckson: mwscript extensions/WikimediaMaintenance/filebackend/setZoneAccess.php olowiki --backend=local-multiwrite
  • 00:39 logmsgbot: dereckson@tin Synchronized wmf-config/interwiki.php: Interwiki cache update for pmid, HTTPS links and olo.wikipedia.org (duration: 00m 50s)
  • 00:25 logmsgbot: dereckson@tin Synchronized langlist: +olo (duration: 00m 49s)
  • 00:22 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Initial configuration for olo.wikipedia.org (T146612) (duration: 00m 50s)
  • 00:18 logmsgbot: dereckson@tin rebuilt wikiversions.php and synchronized wikiversions files: (no message)
  • 00:17 logmsgbot: dereckson@tin Synchronized dblists: Create olo.wikipedia.org (T146612) (duration: 00m 50s)

2016-10-06

  • 23:46 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.21/extensions/WikimediaMessages/i18n/wikimediainterwikisearchresults/: olo.wikipedia.org project name (duration: 00m 49s)
  • 23:44 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.21/extensions/WikimediaMessages/i18n/wikimediaprojectnames: olo.wikipedia.org project name (duration: 00m 51s)
  • 23:37 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.21/includes/Revision.php: Revision->insertOn: Set READ_LATEST flag (T138310) (duration: 00m 49s)
  • 23:34 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Logo for pt.wikimedia (T126832, 2/2, no-op for the moment) (duration: 00m 50s)
  • 23:31 logmsgbot: dereckson@tin Synchronized static/images/project-logos/: Logo for pt.wikimedia (T126832, 1/2) (duration: 00m 50s)
  • 23:23 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Enable RelatedArticles on Minerva skin for all but top 6 wikis (T144812) (duration: 00m 50s)
  • 23:11 logmsgbot: dereckson@tin Synchronized wmf-config/throttle.php: Clean expired throttle rules (Gerrit:313166) (duration: 00m 50s)
  • 23:11 ejegg: enabled donations, recurring and refund queue consumers
  • 23:02 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Enable footer v2 on Minerva for all wikis (T145442) (duration: 00m 50s)
  • 22:38 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.21
  • 22:31 ejegg: enabled donations queue consumer
  • 22:15 ejegg: disabled donations, refund, and recurring queue consumers
  • 22:14 ejegg: disabled paypal audit processor
  • 22:08 twentyafterfour: phd enabled on iridium
  • 22:07 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.21
  • 22:04 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.21/includes/libs/rdbms/loadbalancer/LoadBalancer.php: Ignore reuseConnection() errors after LoadBalancer/LBFactory destruction (T147520) (duration: 00m 50s)
  • 22:01 ejegg: enabled paypal nightly audit parser
  • 22:01 logmsgbot: ori@tin Synchronized wmf-config/abusefilter.php: If794eb2a: AbuseFilter: Use new parser from I4aea5f00 on Labs (duration: 00m 49s)
  • 21:53 ejegg: updated tools from bde60d5 to 6e36fd5
  • 21:53 logmsgbot: thcipriani@tin Synchronized wmf-config/Wikibase-production.php: SWAT: Add config for units on Wikidata (T117032) PART II (duration: 00m 50s)
  • 21:51 logmsgbot: thcipriani@tin Synchronized wmf-config/unitConversionConfig.json: SWAT: Add config for units on Wikidata (T117032) PART I (duration: 00m 48s)
  • 21:38 ejegg: updated payments-wiki from 27ffd8c to 1fbe171
  • 21:22 mutante: restarting zuul on gallium
  • 21:12 ostriches: lead: enabling & running puppet again, should bring things back up
  • 21:02 akosiaris: rebooting lead one more time
  • 20:17 akosiaris: rebooting lead once more
  • 19:47 ostriches: lead: rebooting, because what have we got to lose
  • 19:23 twentyafterfour: disabled phd and puppet on iridium
  • 18:31 twentyafterfour: stopped phd on iridium to relieve some load on gerrit
  • 18:24 ostriches: lead: restarting apache to force error page to show for now
  • 18:21 ostriches: lead: disabled puppet for now, gerrit's sick
  • 18:04 ostriches: gerrit: kicking gerrit and apache, something is unhappy...
  • 17:11 ema: power-cycling cp2017
  • 16:54 akosiaris: uploaded to apt.wikimedia.org precise-wikimedia/main: php5_5.3.10-1ubuntu3.25+wmf1
  • 16:31 ema: power-cycling cp2022
  • 15:48 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Oct 6 15:48:38 UTC 2016 (duration 7m 12s)
  • 15:41 logmsgbot: dereckson@tin scap sync-l10n completed (1.28.0-wmf.21) (duration: 15m 46s)
  • 15:41 _joe_: upgrading conftool to 0.3.1 on all mw*, wtp* servers, T147480 T145518
  • 15:25 ema: powercycle cp3045
  • 15:05 logmsgbot: dereckson@tin scap sync-l10n completed (1.28.0-wmf.20) (duration: 15m 59s)
  • 15:03 ema: cp3034 hanging during boot, power-cycled
  • 14:41 jynus: restarting db1069:3133 mysql instance
  • 14:40 _joe_: uploaded conftool 0.3.1 to apt.w.o, T147480
  • 14:33 ema: cache_upload: rolling reboots for kernel upgrades
  • 14:26 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.21/extensions/WikimediaMessages/i18n/wikimedia: Wikimedia messages for new 'engineer' group for ruwiki (T144599) (duration: 00m 49s)
  • 14:25 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.20/extensions/WikimediaMessages/i18n/wikimedia: Wikimedia messages for new 'engineer' group for ruwiki (T144599) (duration: 00m 49s)
  • 14:23 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: New 'engineer' group for ruwiki (T144599) (duration: 00m 52s)
  • 13:59 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Configure Visual Editor namespaces on sv.wikipedia (gerrit:309808 and gerrit:314558, T144688) (duration: 00m 50s)
  • 13:54 mobrovac: citoid deploying 4d97774
  • 13:50 elukey: added mw122[67] back to the api appservers live pool
  • 13:50 ema: cache_text: rolling reboots for kernel upgrades
  • 13:38 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Disable Upload Wizard blacklist issues on Commons (T146417) (duration: 00m 49s)
  • 13:23 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.21/extensions/EventBus: Send a resource_change event on page_image property change (T145569) (duration: 00m 48s)
  • 13:16 logmsgbot: dereckson@tin Synchronized wmf-config/CirrusSearch-common.php: Initialize subphrases autocomplete on wikisources, mw.org and wikitech (T146208, 3/3) (duration: 00m 49s)
  • 13:14 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Initialize subphrases autocomplete on wikisources, mw.org and wikitech (T146208, 2/3) (duration: 00m 49s)
  • 13:11 logmsgbot: dereckson@tin Synchronized tests/cirrusTest.php: Initialize subphrases autocomplete on wikisources, mw.org and wikitech (T146208, 1/3, no-op in prod part) (duration: 00m 50s)
  • 12:50 ema: cache_misc: rolling reboots for kernel upgrades
  • 12:36 elukey: reimaging mw122[67] to Debian Jessie
  • 12:33 elukey: adding mw122[45] back to the live api appservers pool (note: mw1224 was pooled => no before the reimage, but I don't see any blocker in adding it back to serve live traffic)
  • 11:57 mobrovac: restbase deploy end of fa4dc79
  • 11:50 moritzm: rebooting video scalers for kernel security update
  • 11:36 mobrovac: restbase deploy start of fa4dc79
  • 11:14 elukey: reimaging mw122[34] to Debian Jessie
  • 11:12 elukey: added mw122[23] back to the api appservers live pool
  • 10:50 mobrovac: change-prop deploying 403eec8
  • 10:27 ema: cache_maps: rolling reboots for kernel upgrades
  • 10:16 moritzm: reimaging mw1209, mw1210, mw1215 to jessie
  • 10:09 ema: power cycling cp2015, reboot failed
  • 10:02 elukey: reimaging mw122[23] to Debian jessie (api appservers)
  • 09:57 ema: cp1046 cp2015 depooled reboot for kernel upgrades
  • 09:55 moritzm: installing jackrabbit security updates on Ubuntu and Debian systems
  • 09:36 elukey: adding mw1208 and mw1221 back to the api appservers live pool
  • 09:22 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Restoring db1082 original weight: 500 (duration: 00m 52s)
  • 08:54 ema: jessie dist-upgrade on cp* cache hosts
  • 08:53 moritzm: reimaging mw1212-mw1214 to jessie
  • 08:27 twentyafterfour: Restarted apache on iridium to apply hotfix to phab calendar form. refs T147525
  • 08:25 moritzm: restarted hhvm on mw1213
  • 08:02 marostegui: Dropping tables in S3.testwiki - T57676
  • 07:43 moritzm: upgrading labtestvirt2001 to Linux 4.4
  • 07:40 marostegui: Dropping tables in S1.enwiki - T57676
  • 06:38 elukey: reimaging mw1208 and mw1221 to Debian Jessie (API appservers)
  • 06:36 moritzm: reimaging mw1187, mw1188, mw1211 to jessie (the latter is a scap proxy)
  • 03:14 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Oct 6 03:14:25 UTC 2016 (duration 7m 1s)
  • 03:07 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.21) (duration: 15m 38s)
  • 02:32 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.20) (duration: 11m 53s)
  • 00:15 twentyafterfour: phabricator update complete and service is restored
  • 00:15 bblack: cache_upload: rolling depooled frontend restarts for libvmod-netmapper upgrade
  • 00:11 twentyafterfour: scheduled phabricator update starting momentarily. service will be offline for (hopefully) less than 5 minutes
  • 00:08 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.21/extensions/Flow/: Make more types of exceptions loggable (Gerrit:314452, T135545, T138310) (duration: 01m 12s)
  • 00:03 Dereckson: Created Flow tables on labswiki (wikitech.wikimedia.org)
  • 00:02 bblack: cache_maps: rolling depooled frontend restarts for libvmod-netmapper upgrade

2016-10-05

  • 23:57 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Set Flow database for wikitech (T127792) (duration: 00m 50s)
  • 23:54 bblack: cache_misc: rolling depooled frontend restarts for libvmod-netmapper upgrade
  • 23:38 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Always set wgFlowDefaultWikiDb (Gerrit:314194 and Gerrit:314453) (duration: 00m 50s)
  • 23:22 bblack: rebooting radon for kernel update (ns0.wikimedia.org)
  • 23:13 logmsgbot: dereckson@tin Synchronized dblists/commonsuploads.dblist: Disable local upload on bat-smg.wikipedia (T142632) (duration: 00m 49s)
  • 23:09 logmsgbot: dereckson@tin Synchronized wmf-config/CirrusSearch-common.php: Cirrus: Support document versioning (T144039) (duration: 00m 50s)
  • 22:49 bblack: rebooting primary LVS hosts for kernel updates
  • 22:43 ejegg: updated civicrm from 8bc4908 to 17fab4d
  • 22:35 bblack: jessie dist-upgrade on primary LVS servers
  • 22:28 awight: update fundraising-tools from 5427a60 to bde60d5
  • 22:25 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.21/includes/libs/rdbms/loadbalancer/LoadBalancer.php: Add more information to reuseConnection() exceptions (duration: 00m 51s)
  • 22:15 bblack: rebooting secondary (inactive) LVS hosts for kernel updates
  • 22:00 ejegg: updated civicrm from d52c04d to 8bc4908
  • 21:59 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.21/extensions/TimedMediaHandler: Revert "Rewrite discovery of TimedText tracks" (duration: 00m 54s)
  • 21:30 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 to 1.28.0-wmf.20
  • 21:14 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.21
  • 21:09 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.21/includes/libs/rdbms: Make LoadMonitor use $serverIndexes in the cache key (T147359) PART II (duration: 00m 55s)
  • 21:08 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.21/maintenance/lag.php: Make LoadMonitor use $serverIndexes in the cache key (T147359) PART I (duration: 00m 50s)
  • 20:34 Krenair: ran package updates on wikitech-static vm
  • 19:56 Pchelolo: deploy RESTBase 810b6aa563
  • 19:52 bblack: jessie dist-upgrade on secondary LVS servers
  • 19:37 Pchelolo: deploy RESTBase 810b6aa563 canary on restbase1007
  • 19:15 gehel: rebooting maps1* for kernel upgrade
  • 19:05 logmsgbot: demon@tin Synchronized wmf-config/CommonSettings.php: remove dumb commented setting, dumb me (duration: 00m 49s)
  • 18:51 awight: update fundraising CRM from 5f53ef8 to d52c04d
  • 18:46 urandom: T146211: Performing rolling restart of RESTBase eqiad rack 'd' Cassandra instances, and marking SSTables unrepaired.
  • 18:39 logmsgbot: demon@tin Synchronized wmf-config/CommonSettings.php: disable gzip internally, T125938 (duration: 00m 50s)
  • 18:25 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.21/extensions/TimedMediaHandler/: fix fatal (duration: 00m 54s)
  • 18:18 XenoRyet: updated civicrm from 412d999 to 5f53ef8
  • 18:17 urandom: T146211: Performing rolling restart of RESTBase rack 'b' Cassandra instances, and marking SSTables unrepaired.
  • 17:58 urandom: T146211: Performing rolling restart of restbase1011.eqiad.wmnet Cassandra instances, and marking SSTables unrepaired.
  • 17:35 moritzm: installing chromium security updates on osmium
  • 17:32 urandom: T146211: Performing rolling restart of restbase1010.eqiad.wmnet Cassandra instances, and marking SSTables unrepaired.
  • 17:23 moritzm: installing libav security updates
  • 16:35 logmsgbot: legoktm@tin Synchronized wmf-config/InitialiseSettings.php: Don't grant editcontentmodel to all users yet (duration: 01m 01s)
  • 16:28 godog: upgrade mysqld_exporter to 0.9.0 on db2030 T147476
  • 15:54 urandom: T146211: Restarting Cassandra on restbase1007-c.eqiad.wmnet to mark parsoid.data-parsoid tables unrepaired
  • 15:48 urandom: T146211: Restarting Cassandra on restbase1007-b.eqiad.wmnet to mark parsoid.data-parsoid tables unrepaired
  • 15:44 paravoid: upgrading JunOS on cr2-knams
  • 15:39 urandom: T146211: Restarting Cassandra on restbase1007-a.eqiad.wmnet to mark parsoid.data-parsoid tables unrepaired
  • 15:38 moritzm: restarted hhvm on mw1274, was stuck
  • 15:35 godog: reimage lithium with bigger disks T143307
  • 15:18 paravoid: upgrading JunOS on cr1-esams
  • 15:04 godog: add lpxelinux.0 to volatile/tftpboot on puppet.eqiad.wmnet
  • 14:33 logmsgbot: gehel@puppetmaster1001 conftool action : set/pooled=yes; selector: dc=codfw,cluster=wdqs,service=wdqs
  • 14:32 logmsgbot: gehel@puppetmaster1001 conftool action : set/pooled=yes; selector: dc=eqiad,cluster=wdqs,service=wdqs
  • 14:27 gehel: restarting pybal on lvs1003 - T132457
  • 14:25 cmjohnson1: db1055 replacing disk slot 0
  • 14:22 gehel: restarting pybal on lvs1006 - T132457
  • 14:18 gehel: restarting pybal on lvs1003 - T132457
  • 14:17 bblack: rebooting baham (ns1.wikimedia.org)
  • 14:17 gehel: restarting pybal on lvs1012 - T132457
  • 14:15 gehel: restarting pybal on lvs1009 - T132457
  • 14:11 gehel: restarting pybal on lvs1006 - T132457
  • 14:02 bblack: rebooting eeden (ns2.wikimedia.org)
  • 13:58 paravoid: upgrading JunOS on cr1-ulsfo
  • 13:47 elukey: adding mw120[67] back to the api appservers live pool after reimage
  • 13:46 gehel: deploying new LVS configuration for WDQS service - T132457
  • 13:42 moritzm: upgrading neodymium to Linux 4.4
  • 13:32 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: wmf-config/db-codfw.php Remove db1019 entries as it is going to be decommissioned - T146265 (duration: 00m 49s)
  • 13:32 paravoid: upgrading JunOS on cr2-ulsfo (attempt 2)
  • 13:17 bblack: upgrading kernel packages on cp* cache hosts (no reboots yet)
  • 13:07 paravoid: upgrading JunOS on cr2-ulsfo
  • 12:50 marostegui: dropping views jamwiki_p.abuse_filter_history and adywiki_p.abuse_filter_history - T147413
  • 12:14 logmsgbot: reedy@tin Synchronized php-1.28.0-wmf.21/extensions/UserMerge: Fix fatal when using special page (duration: 00m 50s)
  • 12:10 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: scb2001.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=apertium'])
  • 11:44 elukey: reimaging mw120[67] to Debian Jessie
  • 11:23 moritzm: reimaging mw1184-mw1186 to jessie
  • 10:41 moritzm: reimaging mw1181-mw1183 to jessie
  • 10:29 elukey: adding mw120[01] back to the mw api live pool after reimage
  • 10:18 kart_: Update cxserver to 0b2c3fa (T144588)
  • 10:12 godog: reimage bast3001 with /srv partition scheme
  • 10:05 akosiaris: restart pybal on lvs1003, lvs2003 T147288
  • 09:43 akosiaris: restart pybal on lvs1006, lvs1009, lvs1012, lvs2006 T147288
  • 09:40 akosiaris: pool all scb hosts for apertium service
  • 09:39 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: scb2001.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=apertium'])
  • 09:39 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: scb2002.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=apertium'])
  • 09:39 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: scb1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=apertium'])
  • 09:39 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=apertium'])
  • 09:29 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=cxserver'])
  • 09:22 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=cxserver'])
  • 09:22 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: scb1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=cxserver'])
  • 09:21 akosiaris: enable puppet on scb1002. T147288
  • 09:11 ema: repooling varnish-be-rand on cp2014 and cp1073 T147209
  • 08:42 moritzm: installing PHP security updates on Ubuntu systems
  • 08:28 moritzm: reimaging mw1172,mw1179, mw1180 to jessie
  • 08:20 logmsgbot: akosiaris@puppetmaster1001 conftool action : set/pooled=no; selector: scb1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=cxserver'])
  • 08:12 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Increase weight for db1082 from 100 to 300 (duration: 00m 52s)
  • 08:11 akosiaris: T147288 disable puppet on scb1001, scb1002, scb2001, scb2002
  • 08:10 akosiaris: disable puppet on scb1001, scb1002, scb2001, scb2002
  • 07:57 elukey: reimaging mw120[01] to Debian Jessie (mw1201 is a scap proxy)
  • 07:08 moritzm: reimaging mw1176-mw1178 to jessie
  • 03:20 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Oct 5 03:20:16 UTC 2016 (duration 7m 7s)
  • 03:13 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.21) (duration: 19m 19s)
  • 02:37 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.20) (duration: 13m 47s)
  • 00:05 logmsgbot: maxsem@tin Synchronized wmf-config/mobile.php: https://gerrit.wikimedia.org/r/#/c/314206/1 (duration: 00m 49s)
  • 00:01 logmsgbot: maxsem@tin Synchronized wmf-config/: (no message) (duration: 00m 52s)
  • 00:00 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/313158/2 (duration: 01m 57s)

2016-10-04

  • 23:39 logmsgbot: maxsem@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/313157/2 (duration: 01m 38s)
  • 23:31 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/313156/2 (duration: 00m 50s)
  • 23:29 logmsgbot: maxsem@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/313156/2 (duration: 00m 57s)
  • 23:24 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/313155/2 (duration: 00m 49s)
  • 23:20 logmsgbot: maxsem@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/313155/2 (duration: 00m 49s)
  • 23:13 logmsgbot: maxsem@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/313154/2 (duration: 00m 50s)
  • 22:41 XenoRyet: roll back civicrm from 5f53ef8 to 412d999
  • 22:29 XenoRyet: updated civicrm from 412d999 to 5f53ef8
  • 22:08 ejegg: updated payments-wiki from e6027d5 to 27ffd8c
  • 21:44 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.28.0-wmf.21
  • 21:37 bblack: cache_upload: rolling frontend restarts for https://gerrit.wikimedia.org/r/#/c/313847/ (sequential depooled, ~30s per host)
  • 21:36 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: testwiki to 1.28.0-wmf.21
  • 21:34 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.21/includes/libs/rdbms/loadmonitor/LoadMonitor.php: getCacheKey() (T147359) (duration: 00m 53s)
  • 21:05 ejegg: enabled recurring donation consumer
  • 20:59 ejegg: updated civicrm from 7502c0b to 412d999
  • 20:47 ejegg: updated civicrm from 51b790b to 7502c0b
  • 20:46 ejegg: disabled recurring donation queue consumer
  • 20:38 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: testwiki back to 1.28.0-wmf.20
  • 20:22 logmsgbot: thcipriani@tin Finished scap: testwiki to 1.28.0-wmf.21 and rebuild l10n cache (duration: 55m 51s)
  • 20:03 Pchelolo: RESTBase deploy 810b6aa563 to staging
  • 19:26 logmsgbot: thcipriani@tin Started scap: testwiki to 1.28.0-wmf.21 and rebuild l10n cache
  • 19:09 XenoRyet: update civicrm from b45b155 to 51b790b
  • 18:40 RoanKattouw: Running extension/Echo/removeInvalidNotification.php on testwiki, test2wiki and mediawikiwiki (T147138)
  • 18:33 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Flow opt in: Temporarily disable all, MW.org is redundant (Gerrit:314042) (duration: 00m 50s)
  • 18:25 thcipriani: cutting branch 1.28.0-wmf.21 of mediawiki and extensions
  • 18:12 XenoRyet: update civicrm from e2b5bbf to b45b155
  • 18:10 awight: fundraising campaigns reenabled
  • 17:57 yurik: deployed tilerator (disabled on maps-test*) https://gerrit.wikimedia.org/r/#/c/314030/
  • 17:51 awight: disabled fundraising campaigns
  • 17:44 logmsgbot: krinkle@tin Synchronized docroot/noc/db.php: (no message) (duration: 00m 48s)
  • 17:38 yurik: deployed kartotherian https://gerrit.wikimedia.org/r/#/c/314018/ -- Possible issue https://phabricator.wikimedia.org/T147334
  • 17:19 yurik: deploying kartotherian & tilerator updates
  • 16:55 logmsgbot: krinkle@tin Synchronized docroot/noc: (no message) (duration: 01m 01s)
  • 16:32 mutante: new wiki language Livvi-Karelian -> olo.wikipedia.org has been added to DNS (T146612)
  • 16:30 moritzm: upgrading labvirt1014 to Linux 4.4
  • 16:30 mutante: authdns commands from T97051#1994679 to add olo.wp for T146612
  • 16:29 mutante: authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones
  • 15:18 elukey: adding mw120[45] back to the api live pool after reimage
  • 15:00 volans: created marostegui account in Racktables
  • 14:50 godog: eqiad-prod: ms-be1022 to weight 1000 T136631
  • 14:11 cmjohnson1: ms-be1002 replacing failed disk slot 11
  • 14:07 cmjohnson1: db1055 swapped disk 0
  • 13:40 hoo: Updated Wikidata's property suggester with data from Monday's json dump and applied the T132839 workarounds
  • 13:33 marostegui: Remove db1019 from prometheus also adding it to spare as it is going to be decommissioned
  • 13:33 godog: upgrade grafana to 3.1.1 on labmon1001 - T146354
  • 13:16 logmsgbot: hashar@tin Synchronized rpc/RunJobs.php: trick mw into generating a raw exception report (duration: 00m 47s)
  • 13:08 elukey: reimage mw120[45] to Jessie
  • 13:04 hashar: Purged namespace 0 pages for arbcom_nlwiki (T147186) via: mwscript purgeList.php --wiki=arbcom_nlwiki --namespace=0 --verbose
  • 13:04 logmsgbot: hashar@tin Synchronized wmf-config/InitialiseSettings.php: Enable subpages for main namespace in arbcom_nlwiki T147186 (duration: 00m 49s)
  • 12:56 logmsgbot: hashar@tin Synchronized wmf-config/throttle.php: [throttle] Increase account creation limits for an event in Perpignan on 201 T147293 (duration: 00m 50s)
  • 11:14 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Removing db1019 entry as it is going to be decommissioned - T146265 (duration: 00m 51s)
  • 11:14 elukey: adding mw120[23] back to the live api servers pool
  • 10:23 elukey: installed memcached 1.4.28-1.1+wmf1 on mc2009 as part of a performance test - T129963
  • 10:07 elukey: reimaging mw120[23] to Jessie
  • 09:56 elukey: adding mw119[89] to the live api server pool (volans provides magic)
  • 09:03 hashar: Regenerating configuration of all Jenkins job due to https://gerrit.wikimedia.org/r/#/c/313306/
  • 08:44 elukey: reimaging mw119[89] to jessie
  • 07:09 elukey: rebooting eventlog1001 for kernel upgrades
  • 07:04 elukey: executed salt -C 'G@cluster:jobrunner and G@site:eqiad' cmd.run 'find /var/log/hhvm/ -type f -user root -exec chown www-data:www-data {} \;' (also in codfw) to reduce cronspam
  • 02:41 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Oct 4 02:41:38 UTC 2016 (duration 4m 55s)
  • 02:36 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.20) (duration: 18m 19s)
  • 01:09 Dereckson: echo 'https://noc.wikimedia.org/' | mwscript purgeList.php
  • 00:26 logmsgbot: dereckson@tin Synchronized docroot/noc/index.html: Remove dead pybal link on noc. (Gerrit:313162) (duration: 00m 48s)
  • 00:08 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.20/extensions/Flow/includes/BoardMover.php: SWAT: BoardMover: do not try to save a null edit (T138310) (duration: 00m 49s)
  • 00:04 gwicke: Started run of exportRestrictions script on terbium (T135278); this is running in screen as user gwicke. It is not expected to generate noticeable load.

2016-10-03

  • 23:45 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Wikidata descriptions on Japanese and Spanish Wikipedias (T145786) (duration: 00m 49s)
  • 23:36 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.20/resources/lib/oojs-ui: SWAT: Update OOjs UI to v0.17.10 (duration: 00m 48s)
  • 23:34 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.20/vendor: SWAT: Update OOjs UI to v0.17.10 (duration: 01m 33s)
  • 23:24 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable wmgEchoFooterNotice (duration: 00m 49s)
  • 23:15 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Reduce number of replicas for titlesuggest indices (T147192) (duration: 00m 51s)
  • 22:26 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1091 with regular weight (duration: 00m 51s)
  • 22:08 jynus: running schema change (innodb conversion) on phabricator db hosts T146673
  • 21:54 cscott: OCG deploy temporarily disabled PDF render on en.wiktionary.org to combat DoS.
  • 21:50 cscott: updated OCG to version 0bf27e3 (T147211, T144120)
  • 21:50 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1091 with low weight after maintenance (duration: 00m 50s)
  • 21:47 cscott: starting OCG deploy
  • 21:01 jynus: disabling puppet on labsdb1002 and shutting it down for decommission
  • 20:57 bearND: deployed mobileapps 17bc059
  • 20:53 bearND: starting mobileapps deploy
  • 20:32 yurik: deployed graphoid update - https://gerrit.wikimedia.org/r/#/c/313887/
  • 20:28 yurik: about to deploy graphoid update - https://gerrit.wikimedia.org/r/#/c/313887/
  • 20:11 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1091 (duration: 00m 48s)
  • 19:48 cscott: cleared OCG queue again, while I work on a blacklist patch for the OCG frontend
  • 19:33 ejegg: updated payments-wiki settings
  • 19:28 XenoRyet: updated payments-wiki from cc27f83 to e6027d5
  • 18:56 cscott: cleared rapidly-growing OCG queue w/ mw-ocg-service/scripts/clear-queue.js to cope with someone trying to render all of enwiktionary to PDF.
  • 18:47 logmsgbot: catrope@tin Synchronized php-1.28.0-wmf.20/includes/exception/MWExceptionHandler.php: Restore prior render() logic (T147122) (duration: 00m 48s)
  • 18:39 logmsgbot: catrope@tin Synchronized wmf-config/CommonSettings.php: Set $wgDefaultExternalStore for wikitech before Flow settings (T127792) (duration: 01m 04s)
  • 18:36 logmsgbot: catrope@tin Synchronized php-1.28.0-wmf.20/includes/exception/MWExceptionHandler.php: Restore delegation to MWException::report (T147098) (duration: 00m 48s)
  • 18:34 logmsgbot: catrope@tin Synchronized wmf-config/CommonSettings.php: Use === for $wgDBname comparisons (duration: 01m 53s)
  • 18:24 logmsgbot: catrope@tin Synchronized wmf-config/InitialiseSettings.php: Enable Flow beta feature on elwiki (T144384) (duration: 00m 49s)
  • 18:15 logmsgbot: catrope@tin Synchronized wmf-config/InitialiseSettings.php: Enable PageAssessments on enwiki (T146679) (duration: 00m 49s)
  • 17:17 ejegg: updated SmashPig from efe8720 to fa0267b
  • 17:16 gehel: deploying latest wdqs updater and gui
  • 16:19 ejegg: updated payments-wiki settings to stop sending completed donations to ActiveMQ
  • 15:15 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Increase weight for db1084 after its maintenance to its original value: 500 - T147113 (duration: 00m 48s)
  • 14:58 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Increase weight for db1084 after its maintenance - T147113 (duration: 00m 48s)
  • 14:53 chasemp: adding volans (RCoccioli) to phab security, confirmed staff account association and membership in ops acl already, confirmed w/ riccardo he is missing, and there is a long standing agreement all members of ops should be in #security
  • 14:37 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1084 after its maintenance - T147113 (duration: 00m 48s)
  • 14:32 zeljkof: ending EU SWAT
  • 14:28 logmsgbot: zfilipin@tin Synchronized robots.txt: SWAT: Fix an invalid empty line in the global robots.txt (T146908) (duration: 00m 47s)
  • 14:24 logmsgbot: zfilipin@tin Synchronized static/images/project-logos/olowiki-2x.png: SWAT: Add 1.5 and 2x logos for olowiki (T146745) (duration: 00m 48s)
  • 14:23 logmsgbot: zfilipin@tin Synchronized static/images/project-logos/olowiki-1.5x.png: SWAT: Add 1.5 and 2x logos for olowiki (T146745) (duration: 00m 48s)
  • 14:20 hashar: T146271 mwscript purgeList.php --wiki=testwikidatawiki --namespace=121 --verbose
  • 14:19 hashar: Purged wikidata wiki property talk page, they now allow subpages (T146271). Ran: mwscript purgeList.php --wiki=wikidatawiki --namespace=121 --verbose
  • 14:15 logmsgbot: zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable subpages in 121 namespace in wikidata (T146271) (duration: 00m 49s)
  • 14:02 zeljkof: extending EU SWAT
  • 13:55 logmsgbot: zfilipin@tin Synchronized static/images/project-logos/olowiki.png: SWAT: Upload 1x logo for olowiki (T146745) (duration: 00m 48s)
  • 13:51 logmsgbot: zfilipin@tin Synchronized static/images/project-logos/hewiki.png: SWAT: Fix hewiki logos (T145017) (duration: 00m 47s)
  • 13:38 godog: reenable puppet on scb1* ores/celery spamming is over
  • 13:31 logmsgbot: zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Change protection level autoreview in arwiki (T146575) (duration: 00m 48s)
  • 13:23 dcausse: elasticsearch@eqiad: reducing replica count from 5 to 2 for jawiki_titlesuggest and eswiki_titlesuggest
  • 13:21 logmsgbot: zfilipin@tin Synchronized wmf-config/throttle.php: SWAT: [throttle] Rule for Winona State University (T146600) [throttle] Ada Lovelace Day Edit-a-thon (T146654) (duration: 00m 49s)
  • 13:19 dcausse: elasticsearch@eqiad: reducing replica count from 5 to 2 for ruwiki_titlesuggest
  • 13:15 dcausse: elasticsearch@eqiad: reducing replica count from 5 to 2 for zhwiki_titlesuggest
  • 13:13 gehel: reimage of maps-test2001 - T147194
  • 13:11 dcausse: elasticsearch@eqiad: reducing replica count from 5 to 2 for frwiki_titlesuggest
  • 13:09 gehel: shutting down services on maps-test* servers prior to reimage - T147194
  • 13:02 zeljkof: starting EU SWAT
  • 13:00 dcausse: elasticsearch@eqiad: reducing replica count from 5 to 3 for enwiki_titlesuggest
  • 12:52 dcausse: elasticsearch@eqiad: reducing replica count from 5 to 3 for dewiki_titlesuggest
  • 12:23 marostegui: Deploying alter table in S4 - T147113
  • 12:14 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1084 for maintenance - T147113 (duration: 00m 48s)
  • 11:51 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Increase db1081 weight to its original value after finishing maintenance - T147113 (duration: 00m 48s)
  • 11:21 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Increase db1081 weight after finishing its maintenance - T147113 (duration: 00m 48s)
  • 10:33 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Increase db1081 weight after finishing its maintenance - T147113 (duration: 00m 48s)
  • 10:21 akosiaris: restarting slapd on dubnium.wikimedia.org T143302
  • 10:16 akosiaris: restarting slapd on seaborgium.wikimedia.org T143302
  • 10:13 akosiaris: restarting slapd on serpens.wikimedia.org T143302
  • 10:11 akosiaris: restarting slapd on pollux.wikimedia.org T143302
  • 09:54 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1081 after finishing its maintenance - T147113 (duration: 00m 49s)
  • 09:48 elukey: lowered build log retention from 90 to 60 days for the puppet compiler (https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/)
  • 09:32 akosiaris: T147173 clean exim queues on mx1001 from backscatter spam. Seems to be originating from mx.{east,west}.cox.net, blocked them for now
  • 09:28 marostegui: dbstore2001 going to be reimaged as jessie
  • 09:27 gehel: rolling restart of elasticsearch codfw cluster for kernel upgrade - T146123
  • 09:14 akosiaris: T147173 clean exim queues on mx1001 from backscatter spam
  • 09:08 akosiaris: clean exim queues on mx1001 from backscatter spam
  • 08:46 elukey: rebooted compiler02.puppet3-diffs.eqiad.wmflabs (not reachable by Jenkins, pingable from bastions but no ssh available)
  • 08:04 _joe_: powercycling mw1207
  • 07:56 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1081 for maintenance - T147113 (duration: 00m 50s)
  • 07:24 volans: emptying /var/log/debug on dubnium because of disk full (the same data is on syslog) T147173
  • 06:30 marostegui: altering S3,S4,S5,S6,S7 user_groups tables in sanitarium to avoid tokudb bug - T146121
  • 02:28 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.20) (duration: 13m 16s)

2016-10-02

  • 07:29 gehel: silencing wdqs response time alerts, it is flapping, related to traffic - T147130
  • 04:58 cwd|afk: updated smash pig from 4b36376 to efe8720
  • 02:27 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.20) (duration: 13m 07s)

2016-10-01

  • 11:03 Amir1: ladsgroup@terbium:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki Gautehuus Neuraxıs
  • 02:27 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.20) (duration: 12m 51s)

2016-09-30

  • 22:06 Krinkle: Re-run mwscript deleteEqualMessages.php on all wikis it was previously run on (T45917)
  • 18:49 ejegg: updated SmashPig from 8ff1950 to 4b36376
  • 17:57 ejegg: updated civicrm from 18e59ab to e2b5bbf
  • 13:33 yuvipanda: restart grafana-server on labmon1001
  • 02:33 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Sep 30 02:33:12 UTC 2016 (duration 4m 49s)
  • 02:28 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.20) (duration: 13m 31s)
  • 00:29 matt_flaschen: Manually updated the DB to fix already-broken cases caused by since-fixed T138310
  • 00:25 ejegg: updated civicrm from 637659e to 18e59ab
  • 00:12 ejegg: updated SmashPig from 3811f0f to 8ff1950
  • 00:12 ejegg: updated SmashPig // FIXME: var map can't put one thing in two places
  • 00:07 ejegg: re-enabled donations queue consumer

2016-09-29

  • 23:38 ejegg: updated CiviCRM from 6768613 to 637659e
  • 23:25 ejegg: disabled donations queue consumer
  • 23:06 ejegg: enabled mirroring completed donations queue from payments-wiki
  • 23:03 ejegg: updated SmashPig from 2169b71 to 3811f0f
  • 21:14 ejegg: updated SmashPig from 077ffcc to 2169b71
  • 18:25 ejegg: disabled CiviCRM dedupe jobs
  • 18:24 bd808: https://tools.wmflabs.org/sal/ missing some entries for 2016-09-29; consider https://wikitech.wikimedia.org/wiki/Server_Admin_Log canonical
  • 18:21 cwd: rolled forward PaymentListeners again
  • 17:54 cwd: updated smashpig from 0d88fea to 077ffcc
  • 17:25 cwd: rolled back PaymentListeners
  • 17:05 cwd: updated PaymentListeners from b4d77a9 to 21647c8
  • 16:59 cwd: updated smashpig from 40c4a7c to 0d88fea
  • 16:51 ejegg: enabled adyen job runner
  • 16:48 cwd: updated smashpig from 3458f93 to 40c4a7c
  • 16:34 elukey: executed 'sudo salt -C 'G@cluster:imagescaler and G@site:eqiad' cmd.run 'find /var/log/hhvm/ -type f -user root -exec chown www-data:www-data {} \;' to reduce cronspam
  • 16:32 elukey: executed 'sudo salt -C 'G@cluster:imagescaler and G@site:codfw' cmd.run 'find /var/log/hhvm/ -type f -user root -exec chown www-data:www-data {} \;' to reduce cronspam
  • 16:32 urandom: T133395: restbase staging: starting bootstrap of restbase-test2001-b.codfw.wmnet (test of decomm/bootstrap under time-windowed compaction)
  • 15:18 urandom: T133395: restbase staging: decommissioning restbase-test2001-b.codfw.wmnet (test of decomm/bootstrap under time-windowed compaction)
  • 13:29 cwd|afk: disabled Adyen job runner
  • 10:28 hashar: Upgrading Jenkins plugins with zeljkof :]
  • 09:32 robh: received notification of ulsfo.1.23.pdu flapping power status via United Layer icinga, yet checking the router shows no power interruption for cr1-ulsfo. Seems to be a monitoring false alarm (on United Layer's end, not ours)
  • 08:38 logmsgbot: reedy@tin Synchronized wmf-config/mobile-labs.php: Remove transfers of non existent $wmg variables (duration: 00m 48s)
  • 02:32 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Sep 29 02:32:47 UTC 2016 (duration 4m 46s)
  • 02:28 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.20) (duration: 12m 42s)
  • 02:14 eileen: civicrm upgraded from 8637123 to 6768613
  • 01:04 cwd: rolled smashpig back because JSON_UNESCAPED_UNICODE is unavailable in php5.3 and don't want consumers to explode
  • 00:56 cwd: updated SmashPig from 3458f93 to 3d0a76b

2016-09-28

  • 23:51 ejegg: updated SmashPig from d258927 to 3458f93
  • 22:06 cwd: PaymentListeners rolled back
  • 21:54 cwd: PaymentListeners updated from b4d77a9 to 21647c8
  • 21:33 Krenair: Fixed labs 205.21.68.10.in-addr.arpa. entry to remove another broken contintcloud name, unbreaking beta scap
  • 21:25 XenoRyet: update SmashPig from 372cd40 to d258927
  • 20:51 logmsgbot: apergos is awesome and made the bot work again by restarting it
  • 20:50 apergos: restarted logmsgbot on neon
  • 20:47 bd808: logmsgbot seems to be down: "error: [Errno 111] Connection refused" from scap sync-file
  • 20:46 bd808: scap sync-file wmf-config/throttle.php "IP cap lift for eswiki on 2016-09-30 (T146788)"
  • 19:48 awight: update fundraising CRM from 6b2bd98 to 8637123
  • 19:22 XenoRyet: reverted SmashPig from 4b930ad to 372cd40
  • 17:22 jynus: restarting db1069.s3 (stagnant replication)
  • 16:42 awight: update CRM from 88000ac to 6b2bd98
  • 08:15 twentyafterfour: twentyafterfour@iridium:/srv/phab/phabricator$ sudo bin/search index --type PhabricatorProject --force
  • 03:34 eileen: tools upgrade from b0be0f9 to 5427a60
  • 01:20 eileen: updating CiviCRM from 26e5214 to 88000ac
  • 01:05 eileen: civicrm upgrade from d30a5e4 to 26e5214

2016-09-27

  • 05:32 _joe_: rebooting ms-be1002, stuck on a failed disk

2016-09-26

  • 19:37 awight: enabling awight_test5 banner at 1% of nlwiki
  • 18:09 ejegg: rolled back paypal IPN listener to b4d77a9
  • 17:59 ejegg: updated standalone paypal IPN listener from b4d77a9 to 21647c8
  • 17:47 ejegg: rolled back paypal IPN listener to b4d77a9
  • 17:39 ejegg: updated standalone paypal IPN listener from b4d77a9 to 21647c8
  • 13:52 marostegui: phabricator is back in write mode - search is degraded. we are regenerating the indexes
  • 13:52 chasemp: iridium phab ./bin/search index --all
  • 03:39 cwdent_: disabled civicrm dedupe contacts job

2016-09-25

2016-09-24

  • 19:30 ema: hhvm 1283-1290 rolling restart
  • 12:21 godog: apply temporary cleanup of old (+20m) thumbor temporary files - T146262
  • 10:47 _joe_: systemctl restart thumbor-instances.service on thumbor1001 freed 3 GB of space
  • 02:45 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Sep 24 02:44:59 UTC 2016 (duration 5m 57s)
  • 02:39 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.20) (duration: 16m 49s)

2016-09-23

  • 22:05 matt_flaschen: Deployed patch for T146425
  • 21:42 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.20/extensions/CirrusSearch/includes/ElasticsearchIntermediary.php: Additional logging to track down autocomplete timing regression (duration: 00m 50s)
  • 20:52 gehel: cleaning up leftover system unit files on wdqs1*
  • 18:41 gehel: killing stuck tilerator notification processes on maps1001 - T145534
  • 17:57 mutante: mira restarted cron
  • 17:53 ejegg: updated SmashPig from 8ac1160 to 372cd40
  • 17:46 logmsgbot: thcipriani@tin Synchronized README: Test sync for new mira (duration: 01m 27s)
  • 17:43 mutante: mira - changing UID of l10nupdate to 10002, chown'ing files (1001 -> 10002)
  • 17:35 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.20/extensions/CirrusSearch/includes/ElasticsearchIntermediary.php: Add timing marks to narrow down autocomplete timing regression (duration: 00m 50s)
  • 17:31 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.20/extensions/CirrusSearch/includes/CompletionSuggester.php: Add timing marks to narrow down autocomplete timing regression (duration: 18m 43s)
  • 17:04 mutante: stat1002 - before it was hanging and then fixed due to https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Administration#Fixing_HDFS_mount_at_.2Fmnt.2Fhdfs
  • 17:03 mutante: stat1002 - starting nagios-nrpe-server
  • 14:55 jynus: deployed dns update (removing db1010) T129395
  • 12:20 moritzm: rearmed keyholder on mira
  • 12:03 _joe_: rolling restart of mw1280-90, high cpu usage due to memory leaks.
  • 10:16 moritzm: reimaging mira to jessie (again, previously installer config still pointed to trusty)
  • 10:05 Amir1: ladsgroup@terbium:~$ mwscript extensions/ORES/maintenance/PopulateDatabase.php --wiki=wikidatawiki (T146461) and for 'trwiki', 'plwiki', 'fawiki', 'nlwiki', 'ruwiki', 'ptwiki'
  • 10:00 Amir1: ladsgroup@terbium:~$ mwscript extensions/ORES/maintenance/PopulateDatabase.php --wiki=enwiki
  • 09:58 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.20/extensions/ORES/includes/Cache.php: No int typehinting (causes jobs to crash) T146461 (duration: 00m 42s)
  • 09:58 moritzm: rearmed keyholder on mira
  • 09:48 jynus: disabling alerts and shutting down db1010 in preparation for decommissioning T129395
  • 09:08 moritzm: reimaging mira to jessie
  • 09:06 elukey: reboot eventlog2001.codfw.wmnet for kernel upgrades
  • 08:52 elukey: upgrading varnishkafka to 1.0.12-1 in cache:misc
  • 08:44 ema: depooled nginx restart on cp4003 and cp1045 for libssl upgrade
  • 08:30 elukey: upgrading varnishkafka to 1.0.12-1 in cache:maps
  • 07:34 elukey: executed 'find /var/log/hhvm/ -type f -user root -exec chown www-data:www-data {} \;' for all the api and appservers to remove/prevent cronspam (root:adm files also related to new reimaged hosts, Rsyslog needs to be configured before hhvm) - T132324
  • 07:02 moritzm: rebooting francium for kernel security update
  • 04:03 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.20/includes/deferred: 5af1b93 (duration: 00m 48s)
  • 04:02 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.20/includes/libs/rdbms: 5af1b93 (duration: 00m 51s)
  • 02:46 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Sep 23 02:46:04 UTC 2016 (duration 6m 10s)
  • 02:39 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.20) (duration: 17m 04s)
  • 02:13 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.20/extensions/SecurePoll/: https://gerrit.wikimedia.org/r/#/c/312450/1 (duration: 00m 51s)
  • 02:10 mutante: mw1206, mw1224 - restarted hhvm and apache
  • 01:49 bblack: depooled mw1224 service apache2
  • 00:38 Krenair: mw1224 apache stuck, not restarting for now in case someone wants to investigate later. possibly T89912?
  • 00:17 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/312339 (duration: 00m 48s)
  • 00:16 logmsgbot: krenair@tin Synchronized wmf-config/mobile.php: https://gerrit.wikimedia.org/r/312339 (duration: 00m 47s)
  • 00:08 logmsgbot: krenair@tin Synchronized php-1.28.0-wmf.20/extensions/FlaggedRevs/business/RevisionReviewForm.php: https://gerrit.wikimedia.org/r/#/c/312423/ (duration: 00m 48s)

2016-09-22

  • 23:46 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/310483 (duration: 00m 48s)
  • 23:19 logmsgbot: krenair@tin Synchronized php-1.28.0-wmf.20/resources/src/mediawiki.less/mediawiki.ui/mixins.less: https://gerrit.wikimedia.org/r/#/c/312340/ (duration: 00m 48s)
  • 22:49 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.20/includes/libs/rdbms/loadbalancer/LoadBalancer.php: a73a7ef (duration: 01m 04s)
  • 22:18 mutante: added slaporte and zhousquared to wmf LDAP group (T146227)
  • 21:24 hasharAway: Nodepool is all back and operational. Reduced amount of queries to the OpenStack API by more than 10%
  • 21:14 yurik: deployed tilerator https://gerrit.wikimedia.org/r/#/c/312329/
  • 21:05 hasharAway: stopped nodepooled and restarted it with 0.1.1-wmf5
  • 21:04 mutante: upgraded nodepool to 0.1.1-wmf5 on labnodepool1001
  • 21:04 logmsgbot: krinkle@tin Synchronized php-1.28.0-wmf.20/resources/src/mediawiki/mediawiki.js: T146099 (duration: 00m 48s)
  • 21:02 mutante: imported nodepool_0.1.1-wmf5_amd64 into jessie-wikimedia (T145142)
  • 20:52 urandom: T133395: RESTBase Staging: starting dumps (3, eqiad)
  • 20:47 urandom: T133395: RESTBase Staging: altering table to set TWCS on wikipedia parsoid.html table
  • 20:20 urandom: T133395: RESTBase Staging: Restarting Cassandra to pick up TWCS jar in classpath
  • 20:08 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.20
  • 20:00 thcipriani: rolling out wmf.20 to all wikis
  • 19:23 SMalyshev: Deploying new version of WDQS GUI
  • 19:09 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.20
  • 19:01 thcipriani: wmf.20 to group1 will watch until 20 UTC and move forward to all wikis
  • 18:56 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.18/extensions/CentralNotice: SWAT: Update extensions/CentralNotice submodule (T144952) (duration: 00m 50s)
  • 18:50 yuvipanda: enabling puppet on labcontrol1001, run on labtestcontrol2001 seems ok
  • 18:47 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.20/extensions/CentralNotice: SWAT: Update extensions/CentralNotice submodule (T144952) (duration: 00m 52s)
  • 18:46 yuvipanda: disable puppet on labcontrol1001 for https://gerrit.wikimedia.org/r/#/c/312301/
  • 18:38 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.20/includes/libs/rdbms/database/Database.php: 844cfd5 & 014a420 (duration: 00m 49s)
  • 18:31 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.19/extensions/TimedMediaHandler/MwEmbedModules: SWAT: Update ogv.js to 1.2.0 (T145983) (duration: 00m 48s)
  • 18:28 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.18/extensions/TimedMediaHandler/MwEmbedModules: SWAT: Update ogv.js to 1.2.0 (T145983) (duration: 00m 51s)
  • 17:24 moritzm: rebooting ms-be1016, high load caused by XFS bug
  • 17:09 moritzm: rolling reboot of trusty swift backend servers in eqiad completed
  • 16:40 elukey: forced logrotation for /etc/logrotate.d/upstart on labvirt1014 to investigate cronspam
  • 16:17 godog: offline sdd on ms-be1004 via megacli T144499
  • 15:29 mobrovac: restbase deploy end of d96fbc1
  • 15:10 mobrovac: restbase deploy start of d96fbc1
  • 15:02 bblack: upgrading openssl on cp*
  • 13:22 moritzm: resume rolling reboot of trusty swift backend servers in eqiad for kernel security update
  • 13:02 moritzm: uploaded openssl 1.0.2i for jessie-wikimedia to carbon
  • 12:31 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.20/extensions/Popups: Merge mw.popups.experiment into mw.popups.core T146035 (duration: 00m 49s)
  • 12:27 mobrovac: restbase deploy end of d5538ad
  • 12:25 elukey: installing varnishkafka 1.0.12 on cache:upload ulsfo and eqiad
  • 12:24 akosiaris: uploaded to apt.wikimedia.org jessie-wikimedia: apertium-es-ro_0.7.3~r57551-2+wmf1
  • 12:13 hashar: Early SWAT for mobile team ( https://gerrit.wikimedia.org/r/#/c/311977/ )
  • 12:11 mobrovac: restbase deploy start of d5538ad
  • 11:34 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1082 with some light weight (duration: 00m 52s)
  • 10:24 moritzm: rolling reboot of trusty swift backend servers in eqiad for kernel security update
  • 09:59 moritzm: rebooting subra/suhail for kernel security update
  • 09:50 hashar: updated jobrunner code to a0e82166 (tweak errors reporting in logs) | Does not include 51014242 "Batch stats to statsd" (poke addshore )
  • 09:19 gehel: upgrade / restart of elasticsearch eqiad cluster done T145404 / T146123
  • 09:02 elukey: installing varnishkafka 1.0.12 on cache:upload codfw
  • 08:49 marostegui: Deploying schema change on S7 master - T141951
  • 08:43 elukey: installing varnishkafka 1.0.12 on cache:upload esams
  • 08:40 elukey: installed varnishkafka 1.0.12 on cp1099
  • 08:35 elukey: restarted varnishkafka on cp1099 (log abandoned )
  • 08:19 hashar: Cleanup jobrunner list of minions in redis ( "deploy:jobrunner/jobrunner:minions" )
  • 08:09 hashar: Resyncing all jobrunner deployment installations since only 41/68 minions have completed fetch/checkout
  • 08:01 elukey: rolling restart of the whole Analytics Hadoop cluster for kernel upgrades (analytics* hosts)
  • 07:58 elukey: uploaded varnishkafka 1.0.12-1 to reprepro
  • 07:52 elukey: rebooted stat100[23] for kernel upgrades
  • 07:40 moritzm: rolling restart of trusty swift frontend servers in codfw for kernel security update
  • 07:33 elukey: rebooting stat1004 for kernel upgrades
  • 06:45 elukey: Puppet disabled on analytics1027 to stop periodic Java daemons (prep step for Hadoop cluster reboots)
  • 03:18 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Sep 22 03:18:48 UTC 2016 (duration 6m 48s)
  • 03:12 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.20) (duration: 17m 08s)
  • 02:38 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 16m 39s)
  • 02:03 eileen: all back on
  • 01:50 eileen: turned off jenkins jobs to run dbupdate: dedupe(s) thank-you & donate import
  • 01:40 eileen: update civicrm from 5393a13 to d30a5e4
  • 01:35 twentyafterfour: reboot successful, iridium is back online
  • 01:28 twentyafterfour: Rebooting iridium to apply kernel update
  • 00:39 awight: update paymentswiki from d572ee9 to cc27f83f31ecc609d4400050e73905b7364f1d42; mirror unsubscribe queue

2016-09-21

  • 23:32 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Blacklist minerva from showing Related Articles in the footer (T144912, currently no-op) (duration: 00m 47s)
  • 23:31 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Blacklist minerva from showing Related Articles in the footer (T144912, currently no-op) (duration: 00m 49s)
  • 22:57 mobrovac: change-prop deploying ea8cdf8
  • 22:35 logmsgbot: awight@tin Synchronized wmf-config/InitialiseSettings.php: Add CentralNotice debug log bucket for T144952 (duration: 00m 48s)
  • 22:33 logmsgbot: awight@tin Synchronized php-1.28.0-wmf.20/extensions/CentralNotice: Correct CentralNotice logging for T144952 (duration: 00m 51s)
  • 22:31 logmsgbot: awight@tin Synchronized php-1.28.0-wmf.18/extensions/CentralNotice: Correct CentralNotice logging for T144952 (duration: 00m 51s)
  • 22:08 cwd: updated SmashPig from f308ba4 to 8ac1160
  • 20:39 bearND: deployed mobileapps bf6943b
  • 20:36 yurik: deployed kartotherian geoshape lines support - https://gerrit.wikimedia.org/r/#/c/312097/
  • 20:35 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.28.0-wmf.20
  • 20:31 bearND: starting mobileapps deploy
  • 20:24 logmsgbot: thcipriani@tin Finished scap: testwiki to php-1.28.0-wmf.20 and rebuild l10n cache (duration: 52m 01s)
  • 20:17 arlolra: updated Parsoid to version a802de0
  • 20:05 arlolra: starting Parsoid deploy
  • 19:32 logmsgbot: thcipriani@tin Started scap: testwiki to php-1.28.0-wmf.20 and rebuild l10n cache
  • 19:24 logmsgbot: krinkle@tin Synchronized php-1.28.0-wmf.18/resources/src/mediawiki/mediawiki.js: T146099 (duration: 01m 41s)
  • 17:41 bblack: bits.wikimedia.org hostname removed from DNS (if related real complaints/problems occur, revert https://gerrit.wikimedia.org/r/305533 )
  • 16:46 Krenair: running P3833 script against designate to clean up existing T120797 mess
  • 16:46 mobrovac: restbase deploy end of a75510d
  • 16:30 mobrovac: restbase deploy start of a75510d
  • 15:03 moritzm: installing wireshark security updates
  • 13:59 akosiaris: disabled puppet on neon, puppet migration in progress
  • 13:48 hashar: European SWAT completed
  • 13:45 logmsgbot: hashar@tin Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 46s)
  • 13:44 logmsgbot: hashar@tin Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 46s)
  • 13:43 logmsgbot: hashar@tin Synchronized wmf-config/InitialiseSettings-labs.php: (no message) (duration: 00m 48s)
  • 13:39 logmsgbot: hashar@tin Synchronized wmf-config/mobile.php: For phuedx or is that for yurik? (duration: 00m 47s)
  • 13:37 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.18/extensions/Kartographer: For yurik or phuedx? :D (duration: 00m 48s)
  • 13:37 gehel: adding planet_osm_lines and roads indexes on maps*
  • 13:34 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.19/extensions/Wikidata: (no message) (duration: 02m 22s)
  • 13:31 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.19/extensions/Kartographer/: (no message) (duration: 00m 50s)
  • 13:17 logmsgbot: hashar@tin Synchronized wmf-config: (no message) (duration: 00m 49s)
  • 13:13 logmsgbot: hashar@tin Synchronized wmf-config: New wikitext editor: Enable the Beta Feature in Beta Cluster (duration: 00m 50s)
  • 13:10 logmsgbot: hashar@tin Synchronized wmf-config: New wikitext editor: Enable the Beta Feature in Beta Cluster (duration: 00m 51s)
  • 12:30 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1094 after the ALTER table - T141951 (duration: 00m 47s)
  • 12:09 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1094 for an ALTER table - T141951 (duration: 00m 47s)
  • 11:45 moritzm: rolling restart of trusty swift backend servers in codfw for kernel security update
  • 11:25 mobrovac: restbase deploy end of ca55669
  • 11:16 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1086 after the ALTER table - T141951 (duration: 00m 47s)
  • 11:07 mobrovac: restbase deploy start of ca55669
  • 11:04 marostegui: Rebuilding tables in db1082 (non pooled) - T137191
  • 11:03 elukey: adding mw1197 back to serving live traffic after the reimage
  • 10:51 elukey: restarted varnishkafka on cp1048 (VSLQ_Dispatch: Varnish Log abandoned or overrun.)
  • 10:45 elukey: adding mw1196 back to serving live traffic after the reimage
  • 10:06 moritzm: rebooting lithium for kernel security update
  • 09:39 moritzm: reimaging mw1173-mw1175 to jessie
  • 09:26 marostegui: Stopping mysql at db1019 for a few days as it will be decommissioned - T146265
  • 09:19 gehel: powercycling elastic1027 - T145404
  • 09:09 godog: reimage bast3001.wikimedia.org with separate /srv
  • 08:31 marostegui: schema change on S7 - T141951
  • 08:30 elukey: reimaging mw1196-7 to jessie
  • 08:29 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Depooling db1086 for an alter table - T141951 (duration: 00m 49s)
  • 07:19 elukey: Moved some hhvm logs (/var/log/hhvm) from root:adm to www-data:www-data on mw127[678] to remove cronspam (T132324)
  • 07:18 moritzm: reimaging mw1170-mw1172 to jessie
  • 06:59 marostegui: dropping tables in S1,S3,S4 - T54924
  • 06:21 elukey: removing aqs100[123] from live traffic - aqs.svc.eqiad.wmnet - T144497
  • 06:03 awight|afk: update SmashPig from 6651835 to f308ba4
  • 03:58 logmsgbot: krinkle@tin Synchronized php-1.28.0-wmf.18/resources/src/mediawiki/mediawiki.js: I221cd6c2b (duration: 00m 46s)
  • 03:56 logmsgbot: krinkle@tin Synchronized php-1.28.0-wmf.18/resources/src/mediawiki/mediawiki.requestIdleCallback.js: I221cd6c2b (duration: 00m 48s)
  • 03:54 logmsgbot: krinkle@tin Synchronized php-1.28.0-wmf.19/resources/src/mediawiki/mediawiki.requestIdleCallback.js: I221cd6c2b (duration: 00m 47s)
  • 03:31 awight: update SmashPig from 4530dc9 to 6651835
  • 03:13 awight: update SmashPig from f1f5509 to 4530dc9
  • 02:46 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Sep 21 02:46:22 UTC 2016 (duration 7m 11s)
  • 02:45 mutante: thumbor1002 moved nginx access logs to /srv for more space on /
  • 02:39 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 16m 28s)
  • 02:11 mutante: thumbor1001/1002 - moved logs from /var/log/thumbor to /srv/thumborlogs to free some space, the actual issue is in /tmp though. lots of systemd-private-* dirs with large sizes. like https://bugzilla.redhat.com/show_bug.cgi?id=1183684 ?
  • 01:58 awight: update SmashPig from 285d8ce to f1f5509
  • 01:51 mutante: thumbor servers ran out of disk space
  • 01:15 awight: updated SmashPig to 285d8ce
  • 00:45 eileen: jobs enabled again - update failed to run due to trigger - which didn't affect staging maybe not on - will edit & try again

2016-09-20

  • 23:49 eileen: disabled Dedupe CiviCRM contacts
  • 23:47 eileen: disabled Project Dedupe Major gifts contacts
  • 23:47 eileen: disabled CiviCRM contacts (high numbers)
  • 23:46 eileen: disabled Thank you mail send
  • 23:46 eileen: disabled Donations queue consume
  • 23:45 eileen: CiviCRM update from de1df9e to 5393a13
  • 23:35 logmsgbot: catrope@tin Synchronized php-1.28.0-wmf.18/resources/src/mediawiki/mediawiki.js: Always use requestIdleCallback polyfill for batchEval (T146099) (duration: 00m 46s)
  • 23:22 logmsgbot: catrope@tin Synchronized php-1.28.0-wmf.19/extensions/Echo/: SWAT (duration: 00m 55s)
  • 23:20 logmsgbot: catrope@tin Synchronized php-1.28.0-wmf.19/extensions/TimedMediaHandler: SWAT (duration: 00m 50s)
  • 23:17 logmsgbot: catrope@tin Synchronized php-1.28.0-wmf.18/extensions/TimedMediaHandler: SWAT (duration: 00m 50s)
  • 22:26 mobrovac: change-prop deploying 4417255
  • 21:41 mutante: powercycled mw1294 (down, frozen console)
  • 21:26 thcipriani: starting branch cut for wmf.20
  • 21:17 chasemp: rsync initial transfer of others on labstore1001 to labstore1004
  • 21:08 awight: update SmashPig from db68be9 to 285d8ce
  • 20:26 Pchelolo: restbase deploy ca41acd3f
  • 20:20 Pchelolo: restbase deploy ca41acd3f canary on restbase1007
  • 20:13 Pchelolo: restbase deploy ca41acd3f to staging
  • 20:07 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.18/extensions/ProofreadPage/modules/page/ext.proofreadpage.page.edit.js: Initializes the zoom widget after page loading (duration: 00m 47s)
  • 19:51 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.18/extensions/ProofreadPage/modules/page/ext.proofreadpage.page.edit.js: Makes sure that the zoom widget is initialized before zooming in/out - https://gerrit.wikimedia.org/r/#/c/311765/ (duration: 00m 48s)
  • 17:47 Pchelolo: update RESTBase to 4829630f
  • 17:27 Pchelolo: update RESTBase to 4829630f canary on restbase1007
  • 17:01 elukey: adding aqs1006 to live traffic - aqs.svc.eqiad.wmnet - T144497
  • 16:58 elukey: adding aqs1005 to live traffic - aqs.svc.eqiad.wmnet - T144497
  • 16:48 gehel: increase recovery bandwidth on elasticsearch eqiad to match codfw - T145404
  • 16:32 elukey: restarting cassandra on aqs100[56] (started the work earlier on today, stopped due to T146130)
  • 14:31 dcausse: restarting relforge100[12].eqiad.wmnet servers for kernel upgrade and java settings change
  • 14:23 moritzm: installing tomcat security updates on Ubuntu servers
  • 13:41 jynus: disabling puppet on labtestweb2001
  • 13:40 ottomata: merged --until flag change in check_graphite script (this could affect all graphite based alerts)
  • 13:37 zeljkof: executed script: mwscript maintenance/updateArticleCount.php --wiki=wikidatawiki --update
  • 13:34 zeljkof: EU SWAT finished
  • 13:28 gehel: restarting for elasticsearch and kernel upgrade - eqiad cluster - T145404 / T146123
  • 13:05 logmsgbot: zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Change $wgArticleCountMethod in Wikidata from default (link) to any (T144687) (duration: 00m 47s)
  • 11:41 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: (no message) (duration: 00m 46s)
  • 11:38 moritzm: upgrading ganeti2001 to Linux 4.4 (ganeti2006 has been promoted to new master node)
  • 11:05 moritzm: upgrading ganeti2006 to Linux 4.4
  • 10:49 moritzm: upgrading ganeti2005 to Linux 4.4
  • 10:44 moritzm: reimaging app servers mw1240-mw1242 and API servers mw1194/mw1195 to jessie
  • 10:43 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: (no message) (duration: 00m 48s)
  • 10:36 moritzm: upgrading ganeti2004 to Linux 4.4
  • 10:28 akosiaris: force mw2232 to use palladium for report handler testing
  • 10:25 jynus: deploying schema change on s1 hosts T139090
  • 10:15 moritzm: upgrading ganeti2003 to Linux 4.4
  • 10:12 mobrovac: change-prop deploying e1ef51e
  • 09:57 moritzm: upgrading ganeti2002 to Linux 4.4
  • 09:13 moritzm: reimaging API servers mw1192/mw1193 to jessie
  • 08:56 moritzm: reimaging mw1243-mw1245 to jessie
  • 07:36 elukey: restart cassandra on aqs100[456] for T130861 - only aqs1004 is taking live traffic
  • 02:47 eileen: CiviCRM update from c9157ba to de1df9e
  • 02:40 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Sep 20 02:40:17 UTC 2016 (duration 6m 52s)
  • 02:33 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 10m 35s)
  • 00:43 mutante: wtp2019 - down again, powercycled, probably damaged RAM

2016-09-19

  • 23:22 Pchelolo: restart restbase in staging
  • 23:01 awight: update orphan rectifier config for T145848
  • 22:58 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: Restore testwiki to 1.28.0-wmf.18
  • 22:53 dapatrick: Deployed patch for T144573 to wmf18 and wmf19
  • 22:48 logmsgbot: thcipriani@tin Finished scap: testwiki to php-1.28.0-wmf.19 and rebuild l10n cache (duration: 52m 03s)
  • 22:31 bblack: cache_upload: pooling cp1099 (storage experiment - T145661)
  • 22:02 awight: update paymentswiki from 4cd1877 to d572ee9
  • 22:00 bblack: cp1099: depooling varnish backends for storage size experimentation
  • 21:56 logmsgbot: thcipriani@tin Started scap: testwiki to php-1.28.0-wmf.19 and rebuild l10n cache
  • 21:41 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: Revert group0 wikis to 1.28.0-wmf.19
  • 21:30 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group0 wikis to 1.28.0-wmf.19
  • 20:40 hoo: Removed today's Wikidata json dumps: All shards succeeded, but final dump composition apparently failed.
  • 20:14 chasemp: reboot labstore1004
  • 20:08 Pchelolo: restbase deploy 4829630f staging
  • 18:49 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: SWAT: Throttle for RCL (T145838) (duration: 00m 47s)
  • 18:43 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: ORES default threshold to high for wikidatawiki (T144784) (duration: 00m 47s)
  • 18:32 jynus: emergency/unscheduled restart of mariadb @ labsdb1003 - close to OOM, unusable
  • 18:14 thcipriani: ran on terbium: mwscript extensions/ShortUrl/populateShortUrlTable.php --wiki=bdwikimedia
  • 17:47 awight: reenable banner history queue consumer
  • 17:46 awight: update civicrm from 1df2596 to c9157ba
  • 17:39 ejegg: updated payments-wiki from 392d675 to 4cd1877
  • 17:36 Krenair: Reset wikitech/horizon 2fa for Greg per request
  • 15:50 akosiaris: T107306 uploaded to apt.wikimedia.org jessie-wikimedia: apertium-nno-nob_1.1.0~r66076-1+wmf1
  • 15:50 akosiaris: T107306 uploaded to apt.wikimedia.org jessie-wikimedia: apertium-br-fr_0.5.0~r61325-1+wmf1
  • 14:40 chasemp: testing nfs export performance on labstore1004/1005 cluster
  • 14:35 yurik: depl graphoid https://gerrit.wikimedia.org/r/#/c/311374/
  • 14:28 moritzm: uploaded apache 2.4.10-10+deb8u7+wmf1 for jessie-wikimedia to carbon
  • 14:21 elukey: adding aqs1004 to live traffic - aqs.svc.eqiad.wmnet - T144497
  • 14:10 moritzm: reimaging mw1246-mw1248 to jessie
  • 14:10 hashar: European SWAT is complete.
  • 14:05 zeljkof: EU SWAT finished
  • 14:05 logmsgbot: zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add WT namespace alias to NS_PROJECT in mywiktionary (T140998) (duration: 00m 47s)
  • 14:01 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.19/extensions/Graph: Fixed wikiraw: protocol bug T146010 (duration: 00m 47s)
  • 14:00 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.18/extensions/Graph: Fixed wikiraw: protocol bug T146010 (duration: 00m 48s)
  • 13:58 logmsgbot: zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable WikidataPageBanner on itwikivoyage (T145328) (duration: 00m 48s)
  • 13:55 hashar: Europe SWAT extended as we still have some patches to process
  • 13:47 logmsgbot: zfilipin@tin Synchronized wmf-config/throttle.php: SWAT: Throttling rule for RCL (T145838) [throttle] Allow the same number accounts Throttle for RCL (duration: 00m 47s)
  • 13:18 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.19/includes/api/ApiQueryBacklinksprop.php: API: Force straight join for prop=linkshere|transcludedin|fileusage T145079 (duration: 00m 47s)
  • 13:18 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.18/includes/api/ApiQueryBacklinksprop.php: API: Force straight join for prop=linkshere|transcludedin|fileusage T145079 (duration: 00m 50s)
  • 12:59 moritzm: installing wget updates from jessie 8.6 point update
  • 12:51 elukey: adding mw1191 back to serving traffic after reimage
  • 11:43 mobrovac: restbase cassandra truncating local_group_wikipedia_T_feed_aggregated.data
  • 11:32 jynus: rebooting again db1061 for upgrade
  • 11:13 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Temporarily depool db1062 and repool db1034, in order to be able to ALTER a large table. T141951 (duration: 00m 48s)
  • 10:49 moritzm: reimaging mw1249, mw1250, mw1258 to jessie
  • 10:23 jynus: powercycle db1061, unresponsive since ~1am
  • 09:20 moritzm: invalidated squid cache on carbon
  • 08:49 akosiaris: increase /var/lib/puppet to 50GB on puppetmaster1002, puppetmaster2001, puppetmaster2002
  • 08:48 logmsgbot: addshore@tin Synchronized wmf-config/CommonSettings.php: {{gerrit|311118}} NOOP Some inline comments added (duration: 00m 58s)
  • 08:04 marostegui: renaming tables in S1, S3 and S4 in eqiad before dropping them T54924
  • 07:50 elukey: reimaging mw1191.eqiad.wmnet to jessie
  • 07:49 moritzm: installing updates for file/libmagic from jessie 8.6 point update
  • 07:42 moritzm: reimaging mw1255-mw1257 to jessie
  • 02:28 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Sep 19 02:28:46 UTC 2016 (duration 5m 4s)
  • 02:23 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 10m 39s)

2016-09-18

  • 20:37 bblack: restart upload backend: cp3036
  • 20:22 bblack: restart upload backend: cp3039
  • 19:58 bblack: restart upload backend: cp1074 (stats indicate LRU_Fail imminent)
  • 19:28 bblack: restart upload backend: cp1064 (already in LRU_Fail, caught early)
  • 19:28 bblack: restart up
  • 18:06 bblack: restart upload varnish backend: cp1050 (already in LRU_Fail)
  • 17:58 bblack: restart upload varnish backend: cp2008
  • 17:42 bblack: restart upload varnish backend: cp1071
  • 17:32 bblack: restart upload varnish backend: cp2020
  • 17:06 bblack: restart upload varnish backend: cp2026
  • 16:35 bblack: restarting upload varnish backend: cp2011
  • 15:43 bblack: restarting upload varnish backend: cp2017
  • 15:13 bblack: restarting upload varnish backend: cp2005
  • 14:52 bblack: restarting upload varnish backend: cp1049
  • 14:42 bblack: restarting upload varnish backend: cp2022
  • 14:17 bblack: restarting varnish backend on cp1073 (503 LRU_Fail pattern, has been up a few days...)
  • 13:29 bblack: disabling puppet on cp1074, to experiment with vhtcpd regex filter
  • 11:17 ema: repooling varnish-be in codfw
  • 11:00 ema: varnish-backend restart on cp3037
  • 10:58 ema: varnish-backend restart on cp3044
  • 10:54 ema: repooling varnish on cp1050
  • 10:53 ema: repooling varnish on cp1062
  • 10:52 ema: repooling varnish on cp1064
  • 10:50 ema: repooling varnish on cp1071
  • 10:50 ema: repooling varnish on cp1072
  • 10:49 ema: repooling varnish on cp1073
  • 10:49 ema: repooling varnish on cp1074
  • 09:47 _joe_: varnish-backend-restart on cp1063
  • 09:30 _joe_: varnish-backend-restart on cp1048
  • 06:09 ejegg|afk: updated civicrm from 5ba6976 to 1df2596
  • 02:32 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Sep 18 02:32:49 UTC 2016 (duration 5m 58s)
  • 02:26 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 10m 18s)

2016-09-17

  • 16:40 _joe_: rolling restart of HHVM on part of the API cluster in eqiad, T133674
  • 08:15 _joe_: enlarged puppet partition on puppetmaster1001, rendered full by reports
  • 07:06 p858snake: set +z on -operations, allows messages sent by +b or +q users (normally blocked) to be seen by users that are currently op'ed
  • 06:55 p858snake: see T145924 or email to ops list for more info
  • 06:54 p858snake: silenced (+q) icinga-wm in operations channel, due to channel spam from low disk space on puppetm1001
  • 02:34 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Sep 17 02:34:33 UTC 2016 (duration 7m 3s)
  • 02:27 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 11m 40s)

2016-09-16

  • 23:09 mutante: titanium - shutdown -h now
  • 21:23 logmsgbot: aaron@tin Synchronized wmf-config/InitialiseSettings.php: Set some database logging groups to log (duration: 00m 47s)
  • 20:34 logmsgbot: reedy@tin Synchronized wmf-config/: Load CN via extension registration. Only load jsonconfig once (duration: 00m 56s)
  • 20:28 logmsgbot: reedy@tin Synchronized wmf-config/extension-list: Couple more to extension.json (duration: 00m 47s)
  • 19:32 mutante: fermium disabled puppet again
  • 19:31 mutante: fermium starting mailman qrunner (T144933)
  • 19:26 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.19/includes/jobqueue/JobQueueGroup.php: 01254b0 (duration: 00m 47s)
  • 18:11 mutante: fermium - re-enabled puppet (after merging gerrit 310746)
  • 17:55 mutante: gallium rm /etc/sudoers.d/jenkins-slave (to go with gerrit 311161)
  • 17:03 Pchelolo: deploy changeprop to apply gerrit 311153 config change
  • 16:16 yuvipanda: puppet developers are you reading this? just checking...
  • 15:18 akosiaris: enable puppet on puppetmaster1001 again
  • 15:03 akosiaris: disabling puppet on puppetmaster1001
  • 15:00 hashar: gallium: dpkg --purge php5-mysql (mysql got removed)
  • 14:46 gehel: disabling shard allocation check on relforge to test shard allocation issues
  • 13:56 elukey: mw1189 back serving traffic after reimage
  • 13:41 logmsgbot: akosiaris@tin Synchronized wmf-config/db-eqiad.php: (no message) (duration: 00m 46s)
  • 13:35 hashar: gallium: removing MySQL which is no longer defined in puppet and running puppet. Did: apt-get remove mysql-common mysql-server mysql-server-core-5.5
  • 13:19 logmsgbot: akosiaris@tin Synchronized wmf-config/db-eqiad.php: (no message) (duration: 00m 48s)
  • 12:50 logmsgbot: hashar@tin rebuilt wikiversions.php and synchronized wikiversions files: All wikis back to 1.28.0-wmf.18 :( T145819
  • 12:40 hashar: Going to rollback all Wikis back to 1.28.0-wmf.18 . Despite much investigation, a bunch of jobs are broken due to T145819 which includes Special:CreateAccount :(
  • 12:37 elukey: mw1190 back serving traffic after the reimage
  • 12:24 gehel: rolling restart of codfw elasticsearch cluster completed - T145404
  • 12:20 moritzm: installing security updates for mysql 5.5 (one off systems running mysql as packaged by Ubuntu/Debian and not running wmf-mariadb10)
  • 12:15 moritzm: installing python-imaging security updates on precise
  • 11:24 akosiaris: silence icinga-wm for a while
  • 11:22 akosiaris: restarted puppetmaster on all puppetmasters
  • 11:22 akosiaris: stop puppetmaster on all puppetmasters, resizing /var/lib/puppet
  • 10:21 marostegui: renaming tables before dropping them in codfw S1,S3,S4 - T54924
  • 09:34 elukey: reimage mw1189-90 to Jessie (trying Riccardo's script!)
  • 09:02 moritzm: reimaging mw1252-mw1254 to jessie
  • 08:59 moritzm: installing tomcat7 security updates
  • 08:52 moritzm: installing tomcat8 security updates
  • 08:43 moritzm: installing libidn security updates in eqiad
  • 07:36 elukey: forced logrotation with debug of /etc/logrotate.d/graphite-web on graphite1001 to find cronspam source
  • 03:21 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Sep 16 03:21:54 UTC 2016 (duration 5m 18s)
  • 03:16 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.19) (duration: 18m 44s)
  • 02:41 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 18m 25s)
  • 01:28 mutante: mw1294 - down and frozen, powercycled
  • 00:53 awight: remove "pending" from AMQ old message consumer
  • 00:50 awight: update paymentswiki config to disable legacy orphan rectifier

2016-09-15

  • 23:38 legoktm: legoktm@terbium:~$ foreachwiki extensions/WikimediaMaintenance/createExtensionTables.php babel # T145366
  • 23:24 awight: rolled back paymentswiki to 392d675
  • 23:22 awight: updating paymentswiki to possibly broken 609c1e5
  • 23:00 awight: update paymentswiki from 996ca30 to 392d675 (reverted DI submodule update)
  • 22:55 awight: Reenabled donations and fredge consumers
  • 22:50 awight: update fundraising CRM from f381bd1 to 5ba6976
  • 22:30 awight: rollback paymentswiki from 609c1e5 to 996ca30
  • 22:28 awight: update paymentswiki from 996ca30 to 609c1e5
  • 22:19 ejegg: updated SmashPig from af19422 to db68be9
  • 22:10 ejegg: updated SmashPig from 12a7b78 to af19422
  • 21:41 ejegg: updated SmashPig from e11af57 to 12a7b78
  • 20:55 hashar: All wikis are on 1.28.0-wmf.19; wikidatawiki / testwikidatawiki stick to .18 for now.
  • 20:35 gehel: increasing number of shards per node for dewiki_content index to 2 on elasticsearch codfw
  • 20:28 logmsgbot: hashar@tin rebuilt wikiversions.php and synchronized wikiversions files: All wiki to .19. Keep testwikidata and wikidata at .18 (commits: 38603f0 770d336)
  • 20:14 yuvipanda: remove self from github wikimedia org, was getting spammed for each new repo creation
  • 20:13 ema: varnish-be esams cache_upload: rolling depool and restart
  • 20:11 ejegg: updated SmashPig from e11af57 to 12a7b78
  • 20:08 gehel: increasing number of shards per node for enwiki_content index to 2 on elasticsearch codfw
  • 20:01 yuvipanda: restart puppetmaster on labcontrol1001 to pick up hiera changes
  • 19:59 ema: depool and restart varnish-be on cp1048
  • 19:56 ema: depool and restart varnish-be on cp1064
  • 19:52 logmsgbot: hashar@tin rebuilt wikiversions.php and synchronized wikiversions files: (no message)
  • 19:50 ema: depool and restart varnish-be on cp1062
  • 19:45 ema: depool and restart varnish-be on cp1073
  • 19:33 ema: depool and restart varnish-be on cp1050
  • 19:20 ema: depool and restart varnish-be on cp1063
  • 19:09 logmsgbot: hashar@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.19
  • 19:08 ema: depool and restart varnish-be on cp1072
  • 19:08 awight: disabled fredge, donations, and banner-history consumers
  • 19:07 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.18/extensions/CentralAuth/maintenance/fixStuckGlobalRename.php: To unblock renames stuck on mediawiki.org T145596 (duration: 00m 47s)
  • 19:06 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.19/extensions/CentralAuth/maintenance/fixStuckGlobalRename.php: To unblock renames stuck on mediawiki.org T145596 (duration: 00m 47s)
  • 19:05 logmsgbot: thcipriani@tin Finished scap: SWAT: Add missing close button title message (T145774) and Revert "Remove jquery.arrowSteps module" (T144974) (duration: 28m 08s)
  • 18:59 ema: depool and restart varnish-be on cp1099
  • 18:43 ejegg: updated smashpig consume pending job with new queue name
  • 18:37 logmsgbot: thcipriani@tin Started scap: SWAT: Add missing close button title message (T145774) and Revert "Remove jquery.arrowSteps module" (T144974)
  • 18:33 mutante: titanium - stop salt, stop puppet, revoke puppet cert, delete salt key
  • 18:23 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.19/extensions/Kartographer: SWAT: Fix map popup CSS (T145716) (duration: 00m 56s)
  • 18:23 ema: depool and restart varnish-be on cp1074
  • 17:56 awight: Purging GC messages from pending with timestamp < '2016-09-15 13:44:55'
  • 17:44 ema: depool and restart varnish-be on cp1049
  • 17:18 Pchelolo: change-prop deploy 310877
  • 16:51 Pchelolo: change-prop deploy gerrit 310873
  • 16:49 akosiaris: uploaded to apt.wikimedia.org precise-wikimedia: zuul_2.5.0-8-gcbc7f62-wmf3precise1
  • 16:49 akosiaris: uploaded to apt.wikimedia.org jessie-wikimedia: zuul_2.5.0-8-gcbc7f62-wmf3jessie1
  • 15:55 moritzm: uploaded trebuchet-trigger 0.5.6-1~jessie1 to carbon (no change rebuild for jessie)
  • 15:45 kart_: Update cxserver to a1949e9
  • 14:59 _joe_: starting a noop run on all nodes to puppetmaster2001 to test puppetdb
  • 14:57 elukey: deployed new-aqs-cluster branch (--rev new-aqs-cluster) to aqs100[456] (new AQS cluster not serving live traffic)
  • 14:30 _joe_: removing old reports from the puppet directory
  • 14:03 godog: empty big log file on thumbor1001 /var/log/thumbor/thumbor.log
  • 13:31 zeljkof: EU SWAT done!
  • 13:28 logmsgbot: zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Update outdated comment for Wikibase (duration: 00m 48s)
  • 13:25 logmsgbot: zfilipin@tin Synchronized wmf-config/CommonSettings.php: SWAT: Remove $wgTranslateEC (duration: 00m 48s)
  • 12:43 logmsgbot: addshore@tin Finished scap: Update RevisionSlider i18n (duration: 30m 26s)
  • 12:13 logmsgbot: addshore@tin Started scap: Update RevisionSlider i18n
  • 11:11 hashar: CI is catching up. It is starved processing a long series of dependent changes in Gerrit
  • 11:08 hashar: CI / Jenkins is starved. Investigating
  • 10:09 moritzm: reimaging mw1251 to jessie
  • 09:35 moritzm: reimaging mw1250 to jessie
  • 08:11 marostegui: altering tables in S7 - eqiad hosts - T141951
  • 07:01 moritzm: installing libidn security updates
  • 06:54 marostegui: renaming tables before dropping them - T145487
  • 06:29 moritzm: installing chromium security updates on osmium
  • 06:24 _joe_: turning off nitrogen for memory reduction, reimage
  • 03:23 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Sep 15 03:23:49 UTC 2016 (duration 6m 44s)
  • 03:17 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.19) (duration: 18m 25s)
  • 02:39 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 17m 49s)
  • 02:04 mutante: ms-be1022 - down per icinga, but also mgmt is not reachable

2016-09-14

  • 23:44 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.19/extensions/Kartographer/tests/phpunit/KartographerTest.php: Always serve all the data on preview (T145615, 2/2, no-op part) (duration: 00m 50s)
  • 23:43 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.19/extensions/Kartographer/includes/Tag/TagHandler.php: Always serve all the data on preview (T145615, 1/2) (duration: 00m 47s)
  • 23:34 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Add logging channel for NewUserMessage (T131957) (duration: 00m 47s)
  • 22:36 logmsgbot: krinkle@tin Synchronized php-1.28.0-wmf.19/includes/resourceloader/ResourceLoaderWikiModule.php: T145673 (duration: 00m 47s)
  • 21:18 Pchelolo: RESTBase update to fd43f3a58
  • 21:13 Pchelolo: RESTBase update to fd43f3a58 canary on restbase1007
  • 20:57 Pchelolo: RESTBase update to fd43f3a58 staging
  • 20:28 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.19/extensions/CentralAuth/includes/LocalRenameJob/LocalRenameJob.php: Fix LocalRenameJob transaction owner to match JobRunner T143328 T145596 (duration: 00m 48s)
  • 20:23 yuvipanda: manually raise max_connections on labtestcontrol2001, see T145679 for ticket
  • 20:13 Pchelolo: revert RESTBase in staging to d10d759d42
  • 20:11 arlolra: updated Parsoid to version aed15dda
  • 20:05 Pchelolo: update RESTBase to 5ae9a506 - staging
  • 20:03 arlolra: starting Parsoid deploy
  • 19:25 logmsgbot: hashar@tin rebuilt wikiversions.php and synchronized wikiversions files: Revert group1. Hebrew wiki has templates on the wrong side / CSS is off
  • 19:14 logmsgbot: hashar@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.19
  • 19:09 Pchelolo: revert RESTBase to d10d759
  • 19:08 awight: update paymentswiki config to 9919bad
  • 19:07 mutante: titanium - puppet node clean
  • 18:59 Pchelolo: RESTBase deploy d39580f14
  • 18:57 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.18/extensions/Kartographer/modules/box/Map.js: SWAT: Map should take viewport width/height instead of body width/height (T145521) (duration: 00m 47s)
  • 18:53 Pchelolo: RESTBase deploy d39580f14 canary on restbase1007
  • 18:50 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.18/extensions/ZeroBanner/modules: SWAT: Display edit icon and page actions (duration: 00m 47s)
  • 18:45 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.19/extensions/Kartographer/modules/box/Map.js: SWAT: Map should take viewport width/height instead of body width/height (T145521) (duration: 00m 47s)
  • 18:23 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.19/includes/pager/ReverseChronologicalPager.php: SWAT: Partially reverting I8e684f06 to restore some legacy behavior (T145597) (duration: 00m 48s)
  • 18:17 urandom: T133805: Re-enabling Puppet, forcing run, and restarting Cassandra to restore 8M region size on restbase1013-a.eqiad.wmnet
  • 17:57 AaronSchulz: Deleted big pages per https://meta.wikimedia.org/w/index.php?title=Steward_requests/Miscellaneous&oldid=15908701#Deleting_a_pages_with_a_.3E5000_revisions_in_ruwiki
  • 17:53 mutante: meitnerium - oops, an unrelated rsyncd is supposed to be running on this, puppet re-created files
  • 17:50 mutante: meitnerium - stop rsyncd, remove config fragments
  • 17:20 mobrovac: change-prop deploying 19e2d51
  • 17:19 volans: reimage mw2198 as it failed before
  • 16:32 jynus: stopping mysql and shutting down db1082
  • 16:05 elukey: restarting cassandra on aqs100[23] T130861
  • 16:01 jynus: starting mysql on db1082
  • 15:57 elukey: restarting cassandra on aqs1001 T130861
  • 15:54 akosiaris: updated cr1-eqiad,cr2-eqiad puppet rules
  • 15:50 hoo: Ran T132839-Workarounds.sh from my home in terbium (see T132839)
  • 15:49 urandom: T130861: Performing rolling Cassandra restart, restbase staging
  • 15:45 urandom: T130861: Restarting Cassandra, xenon.eqiad.wmnet
  • 15:41 urandom: T130861: Forcing puppet run in restbase staging
  • 14:38 mobrovac: change-prop deploying ddc091e
  • 14:30 gehel: increasing delayed allocation to 10m on elasticsearch codfw to speed up cluster restart - T145404
  • 14:26 gehel: upgrading elasticsearch codfw to elasticsearch 2.3.5 - T145404
  • 13:33 gehel: upgrading logstash to elasticsearch 2.3.5 - T145404
  • 13:20 marostegui: renaming tables in s3 codfw - T132837
  • 13:11 logmsgbot: hashar@tin Synchronized portals: Bumping portals to master T128546 (duration: 00m 47s)
  • 13:10 logmsgbot: hashar@tin Synchronized portals/prod/wikipedia.org/assets: Bumping portals to master T128546 (duration: 00m 48s)
  • 13:07 akosiaris: stop ircecho (icinga-wm) temporarily on neon
  • 12:12 akosiaris: stop ircecho (icinga-wm) temporarily on neon
  • 12:05 _joe_: restarting apache on puppetmaster1001
  • 10:31 akosiaris: stopped temporarily ircecho (icinga-wm) on neon
  • 10:25 ema: varnish-be restarted on cp4005
  • 10:03 marostegui: alter localuser table in db2054 - T141951
  • 09:47 marostegui: Renaming tables before dropping them T54924
  • 08:43 marostegui: alter localuser table in db2047 - T141951
  • 07:35 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.19/includes/MediaWiki.php: Use cpPosTime cookie for same-domain redirects on DB change - https://gerrit.wikimedia.org/r/#/c/310494/ (duration: 00m 45s)
  • 07:34 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.19/includes/db/ChronologyProtector.php: Use cpPosTime cookie for same-domain redirects on DB change - https://gerrit.wikimedia.org/r/#/c/310494/ (duration: 00m 47s)
  • 07:32 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.19/includes/db/loadbalancer/LBFactory.php: Use cpPosTime cookie for same-domain redirects on DB change - https://gerrit.wikimedia.org/r/#/c/310494/ (duration: 00m 46s)
  • 03:22 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Sep 14 03:22:13 UTC 2016 (duration 7m 6s)
  • 03:15 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.19) (duration: 18m 08s)
  • 02:46 mutante: restarted grrrrit-wm
  • 02:44 mutante: gerrit back to normal
  • 02:42 mutante: gerrit restarting to apply config changes 256663 and 308885
  • 02:40 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 17m 56s)
  • 01:56 logmsgbot: aaron@tin Synchronized wmf-config: Set $wgAPIMaxLagThreshold => 3 and "max lag" => 6 (duration: 00m 51s)

2016-09-13

  • 23:32 logmsgbot: catrope@tin Synchronized php-1.28.0-wmf.18/resources/lib/moment/locale: T145382 (duration: 00m 47s)
  • 23:19 logmsgbot: catrope@tin Synchronized php-1.28.0-wmf.19/resources/lib/moment/locale: T145382 (duration: 00m 49s)
  • 22:44 awight: update paymentswiki config to da01ae9
  • 22:13 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.19/includes/MediaWiki.php: Avoid stupid warnings on url parsing (duration: 00m 47s)
  • 21:45 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.19/includes/DefaultSettings.php: for lego <3 (duration: 00m 47s)
  • 20:46 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.19/extensions/Echo: For Roan <3 (duration: 00m 54s)
  • 20:23 Krenair: restarted designate-api on labtestservices2001, now designate in labtest is working again
  • 20:21 Krenair: restart rabbitmq-server on labtestcontrol2001
  • 19:01 logmsgbot: demon@tin rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.19
  • 18:35 gehel: starting data import on wdqs200?
  • 18:24 logmsgbot: demon@tin Finished scap: testwiki to wmf.19 + l10n bootstrap (try 2) (duration: 47m 21s)
  • 18:08 gehel: moving to scap deployed configuration for wdqs - T144380
  • 17:58 mutante: wtp2019 - powercycled, back up without the error, services started
  • 17:55 mutante: wtp2019 Uncorrectable Memory Error
  • 17:55 mutante: wtp2019 - down since a couple days. console says "Alert! System fatal error during previous boot"
  • 17:53 volans: reimaging mw2198 that failed earlier today
  • 17:36 logmsgbot: demon@tin Started scap: testwiki to wmf.19 + l10n bootstrap (try 2)
  • 17:24 logmsgbot: demon@tin scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_4282891950" --threads=4 --lang en --quiet' returned non-zero exit status 1 (duration: 09m 23s)
  • 17:15 logmsgbot: demon@tin Started scap: testwiki to wmf.19 + l10n bootstrap
  • 16:59 jynus: beta dbs back in rw mode
  • 16:54 mobrovac: restbase deploy end of d10d759
  • 16:53 logmsgbot: demon@tin Synchronized multiversion/updateWikiversions: unbreak myself (duration: 00m 48s)
  • 16:41 mobrovac: restbase deploy start of d10d759
  • 16:25 mobrovac: change-prop deploying d701a69
  • 16:24 godog: dump hhvm backtrace on mw1162 and restart hhvm, apache gets connection refused
  • 16:17 urandom: T144826: Removing compaction rate limit, increasing compactor threads (from 10 to 20), and beginning scrub of local_group_wikipedia_T_parsoid_html.data (restbase2004-b.codfw.wmnet)
  • 16:06 jynus: power resetting db1082
  • 16:02 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1082 (duration: 01m 00s)
  • 15:46 gehel: test_* indices removed on relforge cluster, cluster restarted
  • 15:44 jynus: setting deployment-db1 and deployment-db1 mysqls in read only mode
  • 15:40 gehel: shutting down relforge cluster for indices cleanup
  • 15:18 marxarelli: starting 2-hour read-only maintenance window for beta cluster migration
  • 14:50 godog: drain and reboot restbase2004
  • 14:26 gehel: deleting test_* indices on relforge cluster
  • 14:24 akosiaris: T107306 uploaded to apt.wikimedia.org jessie-wikimedia: apertium-kaz-tat_0.2.1~r57554-1+wmf1
  • 14:03 akosiaris: T107306 uploaded to apt.wikimedia.org jessie-wikimedia: apertium-kaz_0.1.0~r61338-1+wmf1
  • 13:33 logmsgbot: hashar@tin Synchronized wmf-config/CommonSettings.php: Stop logging xff from 127.0.0.1 T129982 (duration: 00m 47s)
  • 13:28 hashar: Pulling "Stop logging xff from 127.0.0.1" patch on mw1299 and mw1161-mw1169 T129982
  • 13:21 hashar: Pulling "Stop logging xff from 127.0.0.1" patch on mw1300-1303 T129982
  • 13:19 logmsgbot: addshore@tin Synchronized php-1.28.0-wmf.18/extensions/UploadWizard/resources/uw.EventFlowLogger.js: SWAT: uw.EventFlowLogger: Fix NS_ERROR_NOT_AVAILABLE debug logging (gerrit:310180) (duration: 00m 49s)
  • 13:15 logmsgbot: hashar@tin Synchronized portals: Bumping portals to master (duration: 00m 50s)
  • 13:10 logmsgbot: addshore@tin Synchronized dblists/clldefault.dblist: SWAT: Deploy Compact Language Links out of beta for Tulu Wikipedia (duration: 00m 46s)
  • 13:08 logmsgbot: addshore@tin Synchronized wmf-config/flaggedrevs.php: SWAT: Fix illegal wgFlaggedRevsWhitelist for arwiki (duration: 00m 47s)
  • 13:04 logmsgbot: addshore@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RevisionSlider BetaFeature on all wikis (duration: 00m 49s)
  • 12:47 marostegui: alter localuser table in db2040 - T141951
  • 12:24 jynus: putting db1024 under maintenance (potential lag, etc.) to test solutions for T145079
  • 12:06 mobrovac: zotero translators deploying cad95af
  • 11:49 gehel: restarting elasticsearch on relforge1001 - OOM heap space
  • 11:18 volans: reimaging mw2198 and mw2199 to test the automation script T143536
  • 10:58 mobrovac: citoid deployed e79430f for T144597
  • 10:58 godog: finished rolling restart swift-proxy for thumbor change T139606
  • 10:48 elukey: zuul upgraded to zuul_2.5.0-8-gcbc7f62-wmf2jessie1 on scandium (T145057)
  • 10:45 elukey: uploaded 2.5.0-8-gcbc7f62-wmf2jessie1 to jessie-wikimedia/thirdparty (T145057)
  • 10:10 godog: enable shadow requests to thumbor for small wikis T139606
  • 09:39 marostegui: alter localuser table in dbstore2002 - T141951
  • 08:52 gehel: relforge is taking more time than expected to recover after upgrade, most probably related to ~3k indices that were created for test purpose
  • 08:46 gehel: relforge is taking more time than expected to recover after upgrade, most probably related to >10k indices that were created for test purpose
  • 08:33 marostegui: renaming tables in db1015 - T145487
  • 08:22 marostegui: alter localuser table in https://tendril.wikimedia.org/host/view/dbstore2001.codfw.wmnet/3306 - T141951
  • 08:04 gehel: upgrading elasticsearch & plugins to 2.3.5 on relforge - T145404
  • 07:52 elukey: wrong package name for my prev entry - remove kafkatee from stat1002 - not in puppet and causing cronspam (T132324)
  • 07:50 elukey: remove kafkacat from stat1002 - not in puppet and causing cronspam (T132324)
  • 07:07 moritzm: installing openjdk6 security updates
  • 02:46 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Sep 13 02:46:54 UTC 2016 (duration 7m 12s)
  • 02:39 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 17m 41s)
  • 01:58 logmsgbot: aaron@tin Synchronized wmf-config/CommonSettings.php: Lower $wgMaxUserDBWriteDuration to 3 (duration: 00m 47s)
  • 01:27 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.18/extensions/GlobalUsage: 1843a85 (duration: 00m 48s)
  • 01:24 bblack: cache_upload: reverting codfw to file storage
  • 01:15 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.18/extensions/GeoData: 2dedca3 (duration: 00m 49s)
  • 01:14 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.18/includes/jobqueue/utils/PurgeJobUtils.php: (no message) (duration: 00m 52s)
  • 00:50 bblack: cache_upload: reverting eqiad to file storage
  • 00:50 bblack: reverting eqiad to file storage
  • 00:17 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.18/extensions/GeoData: 2dedca3 (duration: 00m 48s)
  • 00:16 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.18/includes/jobqueue/utils/PurgeJobUtils.php: 0419831 (duration: 00m 47s)

2016-09-12

  • 23:40 bblack: switch cache_upload codfw to -sdeprecated_persistent...
  • 23:40 logmsgbot: dereckson@tin Synchronized wmf-config/throttle.php: Women in Science throttle rules (T145115 and T145253) (duration: 00m 47s)
  • 23:33 urandom: T144826: Restarting Cassandra on restbase2004-b.codfw.wmnet (scrub complete, re-joining cluster)
  • 23:14 bblack: switch cache_upload eqiad to -sdeprecated_persistent...
  • 21:54 eileen: from 3f01d93 to f381bd1
  • 21:17 arlolra: For completeness, "back" in my last log is a mistake. I scap deployed the wrong --rev, but that was ultimately the version we wanted deployed anyways, so no harm no foul. (T145460)
  • 20:40 arlolra: Parsoid back on f7c43009c
  • 20:32 arlolra: Parsoid deploy failed, rolling back
  • 20:28 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1015-b.eqiad.wmnet
  • 20:28 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1014-b.eqiad.wmnet
  • 20:28 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1009-b.eqiad.wmnet
  • 20:21 mobrovac: change-prop deploying 86a60b3
  • 20:13 arlolra: starting Parsoid deploy
  • 19:52 logmsgbot: demon@tin Synchronized multiversion/getMWVersion: for dumps <3 (duration: 00m 46s)
  • 19:44 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1013-c.eqiad.wmnet
  • 19:44 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1012-c.eqiad.wmnet
  • 19:44 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1008-c.eqiad.wmnet
  • 19:42 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1011-c.eqiad.wmnet
  • 19:42 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1010-c.eqiad.wmnet
  • 19:42 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1007-c.eqiad.wmnet
  • 19:13 urandom: T133805: Restarting Cassandra to apply G1 region size of 32M on restbase1013-a.eqiad.wmnet
  • 19:12 urandom: T133805: Disabling Puppet for GC experiment on restbase1013.eqiad.wmnet
  • 19:10 logmsgbot: thcipriani@tin Synchronized static/images/project-logos: SWAT: Fix HD logos for hewiki (T145017) (duration: 00m 48s)
  • 18:43 ejegg: updated civicrm from 9309163 to 9cbee66
  • 18:24 ori: Changing wikiversion for group2 wikis on mw1017 to debug regression (T145359)
  • 18:23 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.18/resources/Resources.php: Add missing dependency to 'mediawiki.Upload.BookletLayout' module (T145315) (duration: 00m 47s)
  • 18:13 gehel: rolled back wdqs to HEAD^1
  • 17:50 gehel: wdqs1001 put in maintenance, some issue with config file deployment
  • 17:46 gehel: deploying latest wikidata query service
  • 17:29 godog: roll-restart cassandra in eqiad with new CA and certs T143044
  • 16:30 ema: wiping/repooling cp4015
  • 15:41 godog: roll-restart cassandra in codfw with new CA and certs T143044
  • 15:15 godog: drain and restart cassandra instances on restbase2001 with new CA - T143044
  • 14:51 ema: depool cp4015, restart and repool cp4006's backend
  • 14:38 moritzm: powering down mw2017 for hardware maintenance
  • 14:38 mobrovac: change-prop deploying 5d5d39e
  • 14:36 urandom: T144826: Removing compaction rate limit, increasing compactor threads (from 10 to 20), and beginning scrub of local_group_wikipedia_T_parsoid_html.data (restbase2004-b.codfw.wmnet)
  • 14:10 mobrovac: change-prop deploying 404b07c to enable scap config deploys
  • 13:48 logmsgbot: hashar@tin Synchronized wmf-config: Remove upload7 references T129586 (duration: 00m 50s)
  • 13:32 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.18/maintenance/cleanupUploadStash.php: Revert "Clean up user handling in UploadStash" T145228 (duration: 00m 46s)
  • 13:31 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.18/includes/upload/UploadStash.php: Revert "Clean up user handling in UploadStash" T145228 (duration: 00m 46s)
  • 13:27 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.18/extensions/Kartographer: Fix mw.Uri crushing bug T145178 (duration: 00m 49s)
  • 13:21 ema: upgrade cp1099 to varnish 4 T131502
  • 13:17 logmsgbot: hashar@tin Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 48s)
  • 13:16 logmsgbot: hashar@tin Synchronized wmf-config/throttle.php: Add throttling rule for University of Canterbury T145327 (duration: 00m 46s)
  • 13:15 logmsgbot: hashar@tin Synchronized static/images/project-logos/: Add HD logos for hewiki T145017 (duration: 00m 50s)
  • 13:04 ema: upgrade cp1074 to varnish 4 T131502
  • 12:47 ema: upgrade cp1073 to varnish 4 T131502
  • 12:30 ema: upgrade cp1072 to varnish 4 T131502
  • 12:11 ema: upgrade cp1071 to varnish 4 T131502
  • 12:00 ema: upgrade cp1064 to varnish 4 T131502
  • 11:47 ema: upgrade cp1063 to varnish 4 T131502
  • 11:40 mobrovac: change-prop deploying 79b172a
  • 11:32 ema: upgrade cp1062 to varnish 4 T131502
  • 11:17 ema: upgrade cp1050 to varnish 4 T131502
  • 11:03 ema: upgrade cp1049 to varnish 4 T131502
  • 10:48 ema: upgrade cp1048 to varnish 4 T131502
  • 10:28 marostegui: renaming tables in db1015 - T132837
  • 10:19 moritzm: decommissioning mw2061-mw2074 (Bug: T144745)
  • 10:07 volans: reimage mw2198, mw2199 to Jessie (again) T143536
  • 10:04 marostegui: Testing schema change on db1039 - T141951
  • 10:02 jynus: deploying schema change on s4 hosts T139090
  • 09:47 ema: depool cp4006 (503 Could not get storage)
  • 07:25 moritzm: reimaging mw2077-mw2079, mw2017 to jessie
  • 07:16 moritzm: installing openjpeg security updates
  • 04:34 bblack: upgrade cp3049 to varnish 4 T131502
  • 04:20 bblack: upgrade cp3048 to varnish 4 T131502
  • 04:06 bblack: upgrade cp3047 to varnish 4 T131502
  • 03:51 bblack: upgrade cp3046 to varnish 4 T131502
  • 03:36 bblack: upgrade cp3045 to varnish 4 T131502
  • 03:21 bblack: upgrade cp3044 to varnish 4 T131502
  • 03:05 bblack: upgrade cp3039 to varnish 4 T131502
  • 02:49 bblack: upgrade cp3038 to varnish 4 T131502
  • 02:34 bblack: upgrade cp3037 to varnish 4 T131502
  • 02:29 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Sep 12 02:29:27 UTC 2016 (duration 5m 53s)
  • 02:23 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 10m 37s)
  • 02:06 ema: upgrade cp3036 to varnish 4 T131502
  • 01:28 ema: upgrade cp3035 to varnish 4 T131502
  • 00:51 ema: upgrade cp3034 to varnish 4 T131502

2016-09-11

  • 22:33 logmsgbot: aaron@tin Synchronized wmf-config/CommonSettings.php: Lower wgMaxUserDBWriteDuration to 4 (duration: 00m 47s)
  • 08:44 Amir1: ladsgroup@tin:~$ mwscript resetUserEmail.php --wiki=fawiki Sinasalek <email removed>
  • 02:30 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Sep 11 02:30:18 UTC 2016 (duration 5m 55s)
  • 02:24 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 10m 03s)

2016-09-10

  • 22:30 urandom: T144826: Restarting Cassandra on restbase2004-b.codfw.wmnet (scrub complete, re-joining cluster)
  • 12:36 urandom: T144826: Removing compaction rate limit, increasing compactor threads (from 10 to 20), and beginning scrub of local_group_wikipedia_T_parsoid_html.data (restbase2004-b.codfw.wmnet)
  • 02:46 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Sep 10 02:46:34 UTC 2016 (duration 6m 11s)
  • 02:40 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 17m 53s)

2016-09-09

  • 19:29 logmsgbot: demon@tin Synchronized wmf-config/wikitech.php: bizarro config loading (duration: 00m 46s)
  • 18:23 logmsgbot: demon@tin Synchronized wmf-config/: prune old ext messages files (duration: 00m 52s)
  • 16:54 logmsgbot: demon@tin Synchronized multiversion/: rm one more ugly file (duration: 01m 05s)
  • 16:53 logmsgbot: demon@tin Synchronized docroot/noc/conf/: Updating activeMWVersions data (duration: 00m 47s)
  • 16:29 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.18/extensions/JsonConfig/: Unbreak Zero namespace, Check globals in addition to attributes https://gerrit.wikimedia.org/r/309598 (duration: 00m 51s)
  • 16:10 legoktm: live hacking on mw1017
  • 15:15 urandom: T133805: Re-enabling Puppet, forcing run, and restarting Cassandra to restore 8M region size on restbase1013-a.eqiad.wmnet
  • 14:52 Jeff_Green: authdns-update for pay-lvs1001 & pay-lvs1002
  • 14:51 mobrovac: change-prop deployed 34b23e7
  • 13:25 elukey: analytics1032 back in service after disk swap
  • 12:45 elukey: running authdns-update on ns0.w.o to pick up the new domain pivot.wikimedia.org (T138262)
  • 12:27 elukey: reimaging mw213[789] and mw2075 to Jessie
  • 12:05 moritzm: reimaging mw2133-mw2136 to jessie
  • 10:19 moritzm: reimaging mw2080, mw2083-mw2085 to jessie
  • 10:04 volans: reimage mw2132 to Jessie
  • 10:03 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: dc=eqiad,cluster=maps,service=kartotherian
  • 09:25 gehel: restarting pybal on lvs1003
  • 09:05 gehel: deploying new LVS configuration for kartotherian.svc.eqiad.wmnet
  • 09:03 elukey: reimage mw2128->mw2131 to Jessie
  • 09:02 godog: reimage ms-be1022 - T140597
  • 08:55 godog: reset power on ms-be2019, cpu "soft lockup"
  • 08:31 moritzm: reimaging mw2124-mw2127 to jessie
  • 07:17 elukey: puppet disabled on analytics1032, Hadoop services stopped - T145170
  • 06:48 moritzm: reimaging mw2120-mw2123 to jessie
  • 05:22 jynus: deploying schema change on s5 hosts T139090
  • 03:32 logmsgbot: aaron@tin Synchronized wmf-config/InitialiseSettings.php: Avoid $wmfMasterDatacenter notices from noc files (duration: 00m 46s)
  • 03:31 logmsgbot: aaron@tin Synchronized docroot/noc/db.php: Avoid $wmfMasterDatacenter notices from noc files (duration: 00m 48s)
  • 02:45 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Sep 9 02:45:47 UTC 2016 (duration 6m 11s)
  • 02:39 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 17m 37s)
  • 01:38 logmsgbot: aaron@tin Synchronized wmf-config/filebackend-production.php: Bump description text expiry for files (duration: 00m 46s)
  • 01:07 logmsgbot: aaron@tin Synchronized tests/Defines.php: (no message) (duration: 00m 46s)
  • 01:02 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.18/extensions/SpamBlacklist: 56effa9 (duration: 00m 49s)
  • 00:06 logmsgbot: hoo@tin Synchronized php-1.28.0-wmf.18/extensions/Wikidata: Don't use multiple return values (T145138) (duration: 02m 24s)

2016-09-08

  • 23:53 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.18/extensions/CentralNotice: Bump production version to 4dbd3f9 (duration: 00m 51s)
  • 23:47 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.18/extensions/Popups/extension.json: ext.popups.core depends on mediawiki.storage (Gerrit:309469) (duration: 00m 46s)
  • 23:41 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.18/extensions/Kartographer/modules/box/Map.js: Switch to geojson for geoshapes srv (T144777) (duration: 00m 48s)
  • 23:36 awight: update fundraising crm from cf19366 to 9309163
  • 23:27 awight: rolling back fundraising crm from 946a3f1 to cf19366
  • 23:25 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.18/resources/src/mediawiki/page/rollback.js: RollbackAction: Allow 'from' to be an empty string (T141985, 2/2) (duration: 00m 46s)
  • 23:23 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.18/includes/actions/RollbackAction.php: RollbackAction: Allow 'from' to be an empty string (T141985, 1/2) (duration: 00m 46s)
  • 23:18 awight: update fundraising crm from cf19366 to 946a3f1
  • 23:02 yurik_: scaped kartotherian https://gerrit.wikimedia.org/r/#/c/309473/
  • 22:52 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.18/includes/jobqueue/jobs:
  • 21:56 awight: update SmashPig from 7f9eb74 to e11af57
  • 20:47 gehel: redeploy wdqs on wdqs2001.codfw.wmnet
  • 20:03 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.18/extensions/Echo/includes/SeenTime.php: Trying to stop some duplicate redis fetches (duration: 00m 52s)
  • 19:57 bblack: repooling normal traffic to cache_upload in ulsfo
  • 19:02 logmsgbot: demon@tin rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.18
  • 18:23 logmsgbot: demon@tin Synchronized multiversion/: So much junk to remove (duration: 01m 06s)
  • 17:58 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase200[1-9]-b.codfw.wmnet
  • 17:53 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1013-b.eqiad.wmnet
  • 17:53 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1012-b.eqiad.wmnet
  • 17:53 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1008-b.eqiad.wmnet
  • 17:46 chasemp: reboot labstore1004 & labstore1005
  • 17:45 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1011-b.eqiad.wmnet
  • 17:45 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1010-b.eqiad.wmnet
  • 17:45 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1007-b.eqiad.wmnet
  • 17:41 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1015-a.eqiad.wmnet
  • 17:41 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1014-a.eqiad.wmnet
  • 17:41 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1009-a.eqiad.wmnet
  • 17:11 gehel: reverting deploying new LVS configuration for kartotherian.svc.eqiad.wmnet - puppet error, let's analyse slowly...
  • 17:06 gehel: deploying new LVS configuration for kartotherian.svc.eqiad.wmnet
  • 17:04 bd808: Updated Striker to 7d7c8ee
  • 16:55 logmsgbot: demon@tin Synchronized multiversion/: removing more junk - getMWVersion (duration: 01m 07s)
  • 16:47 logmsgbot: demon@tin Finished scap: removing obsolete p symlink (duration: 04m 25s)
  • 16:43 logmsgbot: demon@tin Started scap: removing obsolete p symlink
  • 16:13 godog: roll-restart cassandra instances on restbase-test cluster T143044
  • 16:07 moritzm: uploaded linux-meta 1.10 to carbon (pointing to the new 4.4.19 kernel image)
  • 14:48 logmsgbot: addshore@tin Finished scap: SWAT: Update jquery.uls from upstream (duration: 50m 36s)
  • 14:41 gehel: deploying new DNS entries for kartotherian.svc.eqiad.wmnet
  • 14:36 godog: bounce restbase-test2001 cassandra-a instance T143044
  • 13:58 logmsgbot: addshore@tin Started scap: SWAT: Update jquery.uls from upstream
  • 13:43 moritzm: powering down mw2075-mw2079 for hardware maintenance (T142726)
  • 13:28 ema: upgrading cache_upload ulsfo to varnish 4, dns depooled T131502
  • 13:15 logmsgbot: addshore@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable mention status notifications everywhere (duration: 00m 47s)
  • 13:12 logmsgbot: addshore@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add massmessage-sender group to urwiki (duration: 00m 47s)
  • 13:08 logmsgbot: addshore@tin Synchronized wmf-config/extension-list: SWAT: RESTBaseUpdateJobs: Un-deploy the extension 3/3 (duration: 00m 47s)
  • 13:07 logmsgbot: addshore@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: RESTBaseUpdateJobs: Un-deploy the extension 2/3 (duration: 00m 46s)
  • 13:06 logmsgbot: addshore@tin Synchronized wmf-config/CommonSettings.php: SWAT: RESTBaseUpdateJobs: Un-deploy the extension 1/3 (duration: 00m 49s)
  • 12:52 gehel: redeploying wdqs on wdqs2002.codfw.wmnet - T144380
  • 12:44 moritzm: uploaded new linux package for jessie (based on 4.4.19 with bumped kernel ABI=2)
  • 11:35 moritzm: reimaging mw2161, mw2162, mw2081, mw2082 to jessie
  • 10:14 mobrovac: change-prop deploying a991e25
  • 09:46 godog: roll-reboot thumbor machines to apply memory cgroup enablement T144938
  • 08:52 gehel: initial data import on wdqs codfw cluster - T144380
  • 08:23 marostegui: Drop tables: ImageMetricsLoadingTime_10078363 and ImageMetricsCorsSupport_11686678 - T141407
  • 06:51 moritzm: reimaging mw2212-mw2214 to jessie
  • 06:44 elukey: reimaging mw2208->mw2211 to jessie
  • 03:33 ottomata: merging dns change to point archiva.wikimedia.org at new archiva node meitnerium
  • 03:24 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Sep 8 03:24:15 UTC 2016 (duration 7m 23s)
  • 03:16 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 18m 14s)
  • 02:40 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.17) (duration: 17m 56s)
  • 00:46 logmsgbot: krenair@tin Synchronized php-1.28.0-wmf.18/extensions/VisualEditor/extension.json: https://gerrit.wikimedia.org/r/#/c/309213/ (duration: 00m 46s)
  • 00:42 yurik: kartotherian synced T145042
  • 00:15 twentyafterfour: upgrade complete. Service restored and everything seems normal.
  • 00:14 twentyafterfour: phabricator upgrade is running database migrations now, taking longer than expected
  • 00:03 twentyafterfour: Phabricator upgrade starting momentarily. Service will be offline for a short time, most likely less than 5 minutes.

2016-09-07

  • 23:55 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Enable Translate on fr.wiktionary (T138972) (duration: 00m 47s)
  • 23:45 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.17/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: Revert "Turn on CirrusSearch bm25 A/B test" (T143588) (duration: 00m 46s)
  • 23:43 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.18/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: Revert "Turn on CirrusSearch bm25 A/B test" (T143588) (duration: 00m 46s)
  • 23:41 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.18/extensions/VisualEditor/modules/ve-mw/ui/pages/ve.ui.MWParameterPage.js: Fix parent constructor call (Gerrit:309180) (duration: 00m 46s)
  • 23:39 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.18/extensions/VisualEditor/lib/ve: Fix bad serialization of DOM elements in cloneElement (through Gerrit:309156) (duration: 00m 47s)
  • 23:38 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.17/extensions/VisualEditor/lib/ve: Fix bad serialization of DOM elements in cloneElement (through Gerrit:309155) (duration: 00m 47s)
  • 23:36 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.18/extensions/ORES/includes/Hooks.php: Get results when the score is not stored too (T144999) (duration: 00m 46s)
  • 23:24 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Correct dblist definition (T143345, 2/2) (duration: 00m 47s)
  • 23:23 logmsgbot: dereckson@tin Synchronized dblists/: Correct dblist definition (T143345, 1/2) (duration: 00m 49s)
  • 22:02 Amir1: ladsgroup@terbium:~$ mwscript extensions/ORES/maintenance/PurgeScoreCache.php --wiki=wikidatawiki --model damaging
  • 21:39 awight: update payments wiki from fafb6b4 to 996ca30
  • 21:36 Dereckson: Created tables for Translate extension on fr.wiktionary (T138972)
  • 21:35 awight: reprocessing 20160906 PayPal audit files, take 2
  • 21:21 awight: update fundraising-tools from b3ed7ab to b0be0f9
  • 21:04 urandom: T139961: Stopping RESTBase htmldumper in codfw
  • 21:03 awight: rollback fundraising-tools from b71c504 to b3ed7ab
  • 20:50 awight: Reprocessing 20160906 PayPal audit files
  • 20:11 mobrovac: restbase deploy end of 38d8c41
  • 20:07 mobrovac: restbase cassandra truncating local_group_wikipedia_T_feed_aggregated.data for T144990
  • 19:59 mobrovac: mobileapps deploying 2cd4f6a
  • 19:52 mobrovac: restbase deploy start of 38d8c41
  • 19:51 awight: update fundraising-tools from b3ed7ab to b71c504
  • 19:40 urandom: T139961: Actually starting RESTBase htmldumper processes in codfw (read testing)
  • 19:12 logmsgbot: demon@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.18
  • 19:01 urandom: T139961: Starting RESTBase htmldumper processes in codfw (read testing)
  • 18:47 thcipriani: Morning SWAT complete
  • 18:46 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.18/extensions/MobileFrontend/resources/mobile.notifications.overlay/NotificationsOverlay.js: SWAT: Count local unread notifications when mark-all-read is clicked (T141404) (duration: 00m 44s)
  • 18:44 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.18/extensions/Echo/modules/model/mw.echo.dm.ModelManager.js: SWAT: Add method to get local unread notifications in the manager (T141404) (duration: 00m 45s)
  • 18:37 gehel: deploying wdqs, fix for T144913
  • 18:35 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Extension:SandboxLink for tcywiki (T144925) (duration: 00m 47s)
  • 18:29 logmsgbot: thcipriani@tin Synchronized dblists/nowikidatadescriptiontaglines.dblist: SWAT: Remove wikidata descriptions from additional projects (duration: 00m 45s)
  • 18:24 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.18/extensions/UniversalLanguageSelector: SWAT: Revert "Update jquery.uls to a9dc11b" (T144871) (duration: 00m 47s)
  • 18:21 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.18/includes/db/loadbalancer/LoadBalancer.php: ddd35a6 (duration: 00m 45s)
  • 18:13 logmsgbot: thcipriani@tin Synchronized dblists/nowikidatadescriptiontaglines.dblist: SWAT: Revert "Enable Wikidata descriptions on all wikipedias" (duration: 00m 47s)
  • 17:23 logmsgbot: aaron@tin Synchronized wmf-config/redis.php: Avoid pointless ChronologyProtector duplicate key notices (duration: 00m 47s)
  • 16:42 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.18/includes/libs/objectcache/WANObjectCache.php: for aaron <3 (duration: 02m 50s)
  • 16:09 volans: restarted ircecho on neon after rotating the irc.log file
  • 16:03 cmjohnson1: db1020 swapping failed disk slot 5
  • 15:59 gehel: deploying wdqs, fix for T144913
  • 15:53 cmjohnson1: graphite1002 swapping failed disk slot 10
  • 15:43 madhuvishy: Rebooting host labstore1004
  • 15:36 logmsgbot: krenair@tin Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/309015/ (duration: 02m 50s)
  • 15:08 andrewbogott: re-imaging labnet1002 for T136718
  • 15:02 moritzm: shutting down mw2080-mw2085 for hardware maintenance (T142726)
  • 14:55 mobrovac: mobileapps deploying 8b929cfe
  • 14:49 urandom: T144826: Restarting Cassandra on restbase2004-c.codfw.wmnet (scrub complete, re-joining cluster)
  • 14:45 urandom: T144826: Removing compaction rate limit, increasing compactor threads from 10 to 20, and beginning scrub of local_group_globaldomain_T_mathoid_png.data (restbase2004-c.codfw.wmnet)
  • 14:45 urandom: T144826: Removing compaction rate limit, increasing compactor threads from 10 to 20, and beginning scrub of local_group_globaldomain_T_mathoid_png.data
  • 14:29 hoo: Deployed d4ad9dd of wikidata/query/deploy: UI improvements
  • 14:16 moritzm: shutting down mw2120-mw2139 for hardware maintenance (T142726)
  • 14:13 moritzm: reimaging mw2157-2160 to jessie
  • 13:58 elukey: reimaging mw2204->mw2207 to jessie
  • 13:57 moritzm: upgrading labvirt1014 to Linux 4.4
  • 13:55 mobrovac: restbase deploy end of 3852f72
  • 13:31 mobrovac: restbase start deploy of 3852f72
  • 13:15 logmsgbot: addshore@tin Synchronized php-1.28.0-wmf.18/extensions/RevisionSlider/modules/ext.RevisionSlider.DiffPage.js: SWAT: Revert "Do not nest mw-content-text element when reloading a diff" (Gerrit:308943) (duration: 00m 47s)
  • 13:10 logmsgbot: addshore@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable mention status notifications on mediawikiwiki and metawiki (duration: 00m 47s)
  • 13:06 logmsgbot: addshore@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RC patrol for fiwiki and some related changes (duration: 00m 47s)
  • 12:58 gehel: enabling row aware allocation on elasticsearch eqiad - T143571
  • 12:51 ema: depool upload in ulsfo
  • 12:38 moritzm: restarted mailman on fermium
  • 12:35 logmsgbot: filippo@palladium conftool action : set/pooled=yes; selector: ms-fe1001.eqiad.wmnet
  • 12:20 mobrovac: mobileapps deploying fc09d0d
  • 11:48 moritzm: reimaging mw2153-mw2156 to jessie
  • 11:04 logmsgbot: filippo@palladium conftool action : set/pooled=no; selector: ms-fe1001.eqiad.wmnet
  • 10:32 hashar: https://yarn.wikimedia.org/ for the lazies
  • 10:30 elukey: yarn.w.o is now available to all the users in the wmf ldap group (Basic Auth)
  • 10:02 godog: add mw:thumbor to read/write ACLs for thumbnail containers of a subset of wikis T139606
  • 09:05 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: pooled db1064 and removed db1019 which was replacing it - T144723 (duration: 00m 52s)
  • 08:06 moritzm: reimaging mw2200-mw2203 to jessie
  • 07:58 elukey: executed apt-get purge tmpreaper on gallium (T132324)
  • 07:27 elukey: reimaging mw2144->mw2147 to jessie
  • 07:00 gehel: increase cluster_concurrent_rebalance on elasticsearch codfw - T143571
  • 06:40 moritzm: reimaging mw2140-mw2143 to jessie
  • 05:38 gehel: enabling row aware allocation on elasticsearch codfw - T143571
  • 03:16 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.18) (duration: 17m 58s)
  • 02:40 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.17) (duration: 17m 37s)
  • 02:06 urandom: T144826: Restarting Cassandra restbase2004-b.codfw.wmnet (putting back into service)
  • 01:07 hoo: Updated Wikidata's property suggester with data from Monday's json dump and applied the T132839 workarounds
  • 01:04 Krenair: labtest ldap: created dc=codfw,ou=hosts,dc=wikimedia,dc=org
  • 00:13 Dereckson: Ran namespaceDupes maintenance script on frwiki

2016-09-06

  • 23:54 yurik: deployed kartotherian - adjusting geoshapes arg, and bumping deps - https://gerrit.wikimedia.org/r/#/c/308899/
  • 23:52 Dereckson: Ran namespaceDupes maintenance script on skwiki (0 pages to fix, 1 links to fix, fixed) (T143472)
  • 23:47 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Fix User namespace localisation for sk.wikipedia (T143472) (duration: 00m 47s)
  • 23:23 logmsgbot: dereckson@tin Synchronized dblists/nowikidatadescriptiontaglines.dblist: Update wikis where Wikidata descriptions is shown or not (Gerrit:307968 and Gerrit:307969, T143345) (duration: 00m 46s)
  • 23:14 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Do not use $wgExtensionFunctions to set globals (T143055) (duration: 00m 47s)
  • 22:58 arlolra: Parsoid restarted to pick up new wiki config after <maplink> deploy (T144062)
  • 22:52 arlolra: restarting Parsoid
  • 22:26 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/308890/3 (duration: 00m 47s)
  • 22:09 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.17/extensions/Kartographer: https://gerrit.wikimedia.org/r/#/c/308887/ (duration: 00m 48s)
  • 21:48 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.17/extensions/VisualEditor: https://gerrit.wikimedia.org/r/#/c/308300/ (duration: 00m 48s)
  • 21:47 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.18/extensions/VisualEditor: https://gerrit.wikimedia.org/r/#/c/308300/ (duration: 00m 49s)
  • 20:59 urandom: T133805: Restarting Cassandra to apply G1 region size of 16M on restbase1013-a.eqiad.wmnet
  • 20:56 urandom: T133805: Disabling Puppet for GC experiment on restbase1013.eqiad.wmnet
  • 20:53 logmsgbot: demon@tin Finished scap: scap 4 wikidata <3 (duration: 25m 53s)
  • 20:27 logmsgbot: demon@tin Started scap: scap 4 wikidata <3
  • 19:48 logmsgbot: demon@tin Synchronized multiversion/: rm useless file (duration: 01m 05s)
  • 19:43 logmsgbot: demon@tin Finished scap: group0 to wmf.18 (duration: 27m 33s)
  • 19:15 logmsgbot: demon@tin Started scap: group0 to wmf.18
  • 19:05 robh: cache misc updates for wmfusercontent complete
  • 18:58 Pchelolo: restbase deploy 4c239f2fa
  • 18:54 Pchelolo: restbase deploy 4c239f2fa canary on restbase1007
  • 18:49 Pchelolo: restbase deploy 4c239f2fa to staging
  • 18:24 logmsgbot: demon@tin Synchronized multiversion/checkoutMediaWiki: rm branch pointer junk (duration: 00m 45s)
  • 18:23 robh: running updates on cache_misc systems to update wmfusercontent certificate
  • 18:05 logmsgbot: demon@tin Synchronized wmf-config/InitialiseSettings.php: Enable VisualEditor by default for logged-out users on Indic-script wikipædias (duration: 00m 50s)
  • 17:47 logmsgbot: demon@tin Finished scap: 1.28.0-wmf.18 initial scap for l10n build (testwiki) (duration: 51m 52s)
  • 17:39 ejegg: enabled banner history queue consumer
  • 17:38 arlolra: updated Parsoid to version 7863e6ad (T142617)
  • 17:36 ejegg: updated CiviCRM from 7484c90 to cf19366
  • 17:34 ejegg: disabled banner history queue consumer
  • 17:26 arlolra: starting Parsoid deploy
  • 17:20 urandom: T144826: Ephemerally increasing compactor thread count from 10 to 20
  • 17:13 urandom: T144826: Lifting compaction throttle
  • 17:10 urandom: T144826: Starting online scrub
  • 17:02 urandom: T144826: Restarting Cassandra on restbase2004-a.codfw.wmnet
  • 16:55 logmsgbot: demon@tin Started scap: 1.28.0-wmf.18 initial scap for l10n build (testwiki)
  • 16:11 mobrovac: change-prop restarting to pick up https://gerrit.wikimedia.org/r/308230
  • 11:32 mobrovac: change-prop deploying e14892b
  • 11:19 marostegui: disabled puppet on db1064 - going to be reimaged - T144723
  • 10:50 jynus: shutting down db2001-2009
  • 09:45 marostegui: rsync running from db1064 to dbstore1001 (T144723)
  • 09:23 marostegui: Stopping mysql on db1064 for maintenance - T144723
  • 09:12 moritzm: shutting down mw2153-mw2162 for hardware maintenance (T142726)
  • 08:54 akosiaris: T144174 uploaded to apt.wikimedia.org jessie-wikimedia: apertium-srd-ita_0.9.0~r72554-1+wmf1
  • 08:20 akosiaris: T144174 uploaded to apt.wikimedia.org jessie-wikimedia: apertium-srd_0.9.0~r72792-1+wmf1
  • 08:20 akosiaris: T144174 uploaded to apt.wikimedia.org jessie-wikimedia: apertium-ita_0.9.0~r72553-1+wmf1
  • 07:41 moritzm: correction: shutting down mw2140-mw2147 and mw2200-mw2214 for hardware maintenance (T142726)
  • 07:40 moritzm: shutting down mw2140-mw2214 for hardware maintenance (T142726)
  • 07:22 elukey: Increasing MaxRequestWorkers on Eqiad Imagescalers - mw129[3-8] - from 30 to 100 (one at a time, checking metrics)
  • 07:21 moritzm: reimaging mw2170 to jessie
  • 06:54 moritzm: installing chromium security update on osmium
  • 02:45 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Sep 6 02:45:17 UTC 2016 (duration 6m 28s)
  • 02:38 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.17) (duration: 16m 46s)

2016-09-05

  • 23:27 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Set wgMathFileBackend to false for wikitech wikis (T126628) (duration: 00m 48s)
  • 23:23 logmsgbot: ema@palladium conftool action : set/pooled=yes; selector: cp2002.codfw.wmnet
  • 23:17 logmsgbot: ema@palladium conftool action : set/pooled=no; selector: cp2002.codfw.wmnet
  • 23:17 logmsgbot: dereckson@tin Synchronized dblists/closed.dblist: Close wikimania2015 (T139032) dblist update (duration: 00m 47s)
  • 23:17 ema: upgrading cp2002 to varnish 4 T131502
  • 23:14 logmsgbot: dereckson@tin Synchronized wmf-config/: Close wikimania2015 (T139032). So long and thanks for all the fish. (duration: 00m 51s)
  • 22:55 logmsgbot: ema@palladium conftool action : set/pooled=yes; selector: cp2005.codfw.wmnet
  • 22:50 logmsgbot: ema@palladium conftool action : set/pooled=no; selector: cp2005.codfw.wmnet
  • 22:49 ema: upgrading cp2005 to varnish 4 T131502
  • 22:28 logmsgbot: ema@palladium conftool action : set/pooled=yes; selector: cp2008.codfw.wmnet
  • 22:24 logmsgbot: ema@palladium conftool action : set/pooled=no; selector: cp2008.codfw.wmnet
  • 22:24 volans: restarting ircecho on neon to get back icinga-wm
  • 22:24 ema: upgrading cp2008 to varnish 4 T131502
  • 22:08 volans: stopped ircecho on neon to avoid the spam of recovery, monitoring icinga, I'll re-enable it in a bit
  • 22:04 logmsgbot: ema@palladium conftool action : set/pooled=yes; selector: cp2011.codfw.wmnet
  • 22:00 logmsgbot: ema@palladium conftool action : set/pooled=no; selector: cp2011.codfw.wmnet
  • 21:59 ema: upgrading cp2011 to varnish 4 T131502
  • 21:58 volans: restarting ircecho on neon to get back icinga-wm
  • 21:36 logmsgbot: ema@palladium conftool action : set/pooled=yes; selector: cp2014.codfw.wmnet
  • 21:28 logmsgbot: ema@palladium conftool action : set/pooled=no; selector: cp2014.codfw.wmnet
  • 21:28 ema: upgrading cp2014 to varnish 4 T131502
  • 20:56 bd808: Updated striker to b5fdbf9 (T144040, T144296)
  • 20:45 logmsgbot: ema@palladium conftool action : set/pooled=yes; selector: cp2017.codfw.wmnet
  • 20:41 logmsgbot: ema@palladium conftool action : set/pooled=no; selector: cp2017.codfw.wmnet
  • 20:41 ema: upgrading cp2017 to varnish 4 T131502
  • 19:49 logmsgbot: ema@palladium conftool action : set/pooled=yes; selector: cp2020.codfw.wmnet
  • 19:45 logmsgbot: ema@palladium conftool action : set/pooled=no; selector: cp2020.codfw.wmnet
  • 19:45 ema: upgrading cp2020 to varnish 4 T131502
  • 19:26 logmsgbot: ema@palladium conftool action : set/pooled=yes; selector: cp2024.codfw.wmnet
  • 19:22 logmsgbot: ema@palladium conftool action : set/pooled=no; selector: cp2024.codfw.wmnet
  • 19:22 ema: upgrading cp2024 to varnish 4 T131502
  • 18:53 logmsgbot: addshore@tin Synchronized php-1.28.0-wmf.17/extensions/UploadWizard/resources/mw.UploadWizardUpload.js: SWAT: mw.UploadWizardDetails, mw.UploadWizardUpload: Use amenableparser to handle templates in error messages Part 2/2 (duration: 00m 46s)
  • 18:52 logmsgbot: addshore@tin Synchronized php-1.28.0-wmf.17/extensions/UploadWizard/resources/mw.UploadWizardDetails.js: SWAT: mw.UploadWizardDetails, mw.UploadWizardUpload: Use amenableparser to handle templates in error messages Part 1/2 (duration: 00m 48s)
  • 18:51 logmsgbot: addshore@tin Synchronized php-1.28.0-wmf.17/resources/src/mediawiki/mediawiki.Upload.BookletLayout.js: SWAT: mw.Upload.BookletLayout: Use amenableparser to handle templates in error messages (duration: 00m 47s)
  • 18:50 logmsgbot: addshore@tin Synchronized php-1.28.0-wmf.17/resources/src/mediawiki/api/messages.js: SWAT: mw.api.messages: Allow passing extra parameters for the API call (duration: 00m 53s)
  • 18:31 ema: upgrading cp2026 to varnish 4 T131502
  • 17:43 ema: restarting pybal on lvs2002 T134893
  • 17:35 ema: upgrading cp2022 to varnish 4 T131502
  • 17:34 mobrovac: change-prop deploying 222fcf8
  • 14:42 elukey: upgrading apache httpd to the latest version on mw129[3-8] (eqiad image scalers)
  • 14:05 logmsgbot: marostegui@tin Synchronized wmf-config/db-eqiad.php: Changing db-eqiad config to depool db1064 - T144723 (duration: 00m 48s)
  • 14:00 elukey: upgrading mw1306/mw1299 to the latest version of Apache httpd
  • 13:47 mobrovac: change-prop restarting for https://gerrit.wikimedia.org/r/306308
  • 13:40 elukey: upgrading mw130[012345] to the latest version of Apache httpd (eqiad jobrunners, one at a time)
  • 13:23 moritzm: reimaging mw2087 to jessie
  • 13:12 logmsgbot: addshore@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable the RevisionSlider on test.wikidata.org (duration: 00m 48s)
  • 13:07 logmsgbot: addshore@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add source wikis in import page in sawiki (duration: 00m 53s)
  • 12:49 ema: repool cp4005 with varnish 3
  • 12:47 moritzm: depooling/rebooting/repooling sca1002 for upgrade to Linux 4.4 (T144492)
  • 12:41 ema: downgrading cp4005 to varnish 3 T131502
  • 11:40 elukey: Reimaging mw217[89] and mw219[6789] to Debian jessie
  • 10:37 moritzm: depooling/rebooting/repooling sca1001 for upgrade to Linux 4.4 (T144492)
  • 10:18 moritzm: reimaging mw2192-mw2195 to jessie
  • 09:24 elukey: reimaging mw21(8[89]|9[01]) to Debian Jessie
  • 09:14 elukey: reimaging mw218[4567] to Debian Jessie
  • 09:01 moritzm: reimaging mw2174-mw2177 to jessie
  • 07:12 moritzm: reimaging mw2169-mw2172 to jessie
  • 02:29 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Sep 5 02:29:51 UTC 2016 (duration 5m 42s)
  • 02:24 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.17) (duration: 10m 50s)

2016-09-04

  • 20:11 elukey: re-enabled ircecho on neon
  • 20:03 elukey: stopped ircecho on neon temporarily
  • 19:32 elukey: restarting apache2 on rhodium (attempt to fix it)
  • 19:14 hashar: Puppet has been failing since ~18:05 UTC. At least a couple of European ops are looking at it
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Sep 4 02:31:20 UTC 2016 (duration 6m 7s)
  • 02:25 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.17) (duration: 09m 57s)

2016-09-03

  • 14:24 ema: depool cp4005
  • 10:21 jynus: deploying schema change on s7 hosts T139090
  • 02:46 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Sep 3 02:46:12 UTC 2016 (duration 6m 22s)
  • 02:39 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.17) (duration: 17m 35s)

2016-09-02

  • 22:31 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.17/includes/page/WikiPage.php: No-op sync - 880c7f9 (duration: 00m 49s)
  • 16:50 thcipriani: jenkins restarted
  • 16:50 mark: rebooting bast3001
  • 16:49 ema: downgrading cp4015 to varnish 3 T131502
  • 16:40 thcipriani: restarting jenkins shortly for plugin upgrade
  • 16:10 ema: downgrading cp4014 to varnish 3 T131502
  • 15:33 ema: downgrading cp4013 to varnish 3 T131502
  • 14:27 ema: downgrading cp4007 to varnish 3 T131502
  • 14:14 mutante: gallium deleting jenkins/config-history files older than an hour
  • 13:48 moritzm: rebooting labnet1002 for kernel update
  • 12:09 ema: pooling ulsfo
  • 11:37 elukey: upgrading mw1283.eqiad.wmnet to the latest httpd version
  • 10:17 moritzm: reimaging mw2180-mw2183 to jessie
  • 10:10 elukey: reimaging mw216[5-8] to jessie (IPMI fixed)
  • 09:48 mark: Raised OSPF metrics on cr2-ulsfo<-->cr1-codfw link from 388 to 1000 in both directions
  • 09:13 elukey: upgrading httpd on mw127[6789] to 2.4.10-10+deb8u6+wmf2 (eqiad api canaries)
  • 09:03 ema: reboot cp4006 for kernel upgrade
  • 08:38 moritzm: reimaging mw2099, mw2117, mw2163, mw2164 to jessie
  • 08:34 elukey: upgrading httpd on mw1289 to 2.4.10-10+deb8u6+wmf2
  • 08:19 elukey: upgrading httpd on mw128[0124] to 2.4.10-10+deb8u6+wmf2
  • 08:04 elukey: upgrading httpd on mw1290 to 2.4.10-10+deb8u6+wmf2
  • 06:40 moritzm: reimage mw2101 to jessie
  • 06:36 moritzm: reimage mw2149-mw2151 to jessie
  • 04:35 mutante: tin - re-enabled puppet
  • 02:57 awight: Began bulk refund for T144489
  • 02:43 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Sep 2 02:43:50 UTC 2016 (duration 5m 15s)
  • 02:38 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.17) (duration: 16m 12s)
  • 01:06 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Customize wgMathDirectory for wikitech (T126628, 2/2) (duration: 00m 46s)
  • 01:05 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Customize wgMathDirectory for wikitech (T126628, 1/2) (duration: 00m 47s)
  • 00:22 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Enable Math on wikitech (T126338) (duration: 00m 47s)
  • 00:13 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: End lazy loading reference experiments (T144240) (duration: 00m 47s)
  • 00:10 logmsgbot: dereckson@tin Synchronized wmf-config/CirrusSearch-common.php: Disable phrase suggester for wikidata (T143260, 2/2) (duration: 00m 46s)
  • 00:09 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Disable phrase suggester for wikidata (T143260, 1/2) (duration: 00m 46s)
  • 00:03 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.17/extensions/ZeroBanner: Update router code (T143425) (duration: 00m 47s)
  • 00:02 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.17/extensions/MobileFrontend/resources/mobile.startup/Skin.js: Ensure lazy image placeholders without height can be loaded (T143768) (duration: 00m 46s)
  • 00:00 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.17/extensions/Flow/modules/flow/ui/widgets/editor/editors/mw.flow.ui.VisualEditorWidget.js: Flow Fixes related to Visual Editor (T138356 and T139972) (duration: 00m 45s)

2016-09-01

  • 23:57 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.17/extensions/VisualEditor/lib/ve: Update lib/ve submodule for Ib9bbaccfff9 (duration: 00m 47s)
  • 23:51 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.17/extensions/CirrusSearch/includes/Query/FullTextQueryStringQueryBuilder.php: Do not use the suggest reverse field if it's a non local search (Gerrit:307955) (duration: 00m 48s)
  • 23:39 mutante: tin removed mw2187 from /etc/dsh/group/scap-proxies
  • 23:38 mutante: tin stopping puppet
  • 23:32 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings-labs.php: Enable Wikidata descriptions taglines on labs (T143345, no-op in prod) (duration: 02m 52s)
  • 23:15 XenoRyet: updated payments-wiki from ef2f2f8 to fafb6b4
  • 23:06 MaxSem: That was for https://gerrit.wikimedia.org/r/#/c/308084/
  • 23:06 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 02m 53s)
  • 23:01 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.17/extensions/Kartographer: https://gerrit.wikimedia.org/r/#/c/308085/ (duration: 02m 54s)
  • 22:45 ejegg: enabled fredge queue consumer
  • 22:38 ejegg: updated civicrm from 1678e1f to 7484c90
  • 22:36 ejegg: disabled fredge queue consumer
  • 22:31 ejegg: updated paymentswiki settings from 393944f to 1845193
  • 22:11 awight: reenabling recurring Ingenico job and kicking one-off run
  • 22:03 awight: updating wmf_civicrm schema to 7240
  • 21:59 awight: update fundraising CRM from 0c6bf38 to 1678e1f
  • 21:25 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1013-a
  • 21:25 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1012-a
  • 21:24 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1008-a
  • 21:22 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1011-a
  • 21:22 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1010-a
  • 21:21 urandom: T143226: Perform major compaction on local_group_wikipedia_T_parsoid_html.data, restbase1007-a
  • 20:58 mutante: mw2187 - shut down
  • 20:35 urandom: T143226: Clearing repair status: eqiad, rack 'd' nodes
  • 20:35 urandom: T143226: Clearing repair status: eqiad, rack 'dd' nodes
  • 19:58 hashar: 1.28.0-wmf.17 rolled on group2 and apparently all fine
  • 19:54 gehel: reloading ferm rules on elasticsearch eqiad cluster
  • 19:44 urandom: T143226: Clearing repair status: eqiad, rack 'b' nodes
  • 19:39 urandom: T143226: Clearing repair status restbase1011-c.eqiad.wmnet
  • 19:19 logmsgbot: hashar@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.17
  • 19:14 mutante: tin temp. disabled puppet
  • 19:11 mutante: tin removing mw2167 thru mw2199 from dsh file manually, re-running puppet
  • 18:33 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Test PageAssessments on English Wikivoyage (T142056) (duration: 02m 48s)
  • 18:26 mutante: mw2187 - powercycled
  • 18:19 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: name=elastic1029.eqiad.wmnet
  • 18:19 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: REVERT because proxy down SWAT: Test PageAssessments on English Wikivoyage (T142056) (duration: 03m 15s)
  • 18:11 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Test PageAssessments on English Wikivoyage (T142056) (duration: 04m 54s)
  • 17:55 Jeff_Green: switching fundraising database reader from db1008 to frdb1001
  • 17:12 bd808: Updated striker to ac555bd; fixes T144064
  • 16:49 ema: cp4006 repooled after downgrade
  • 16:44 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: name=elastic1028.eqiad.wmnet
  • 16:24 ema: restarting pybal on lvs4002 T134893
  • 16:10 ema: downgrading cp4006 to varnish 3 T131502
  • 15:36 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: name=elastic102[12].eqiad.wmnet
  • 15:34 chasemp: reboot nova-compute on labvirt1013 as stuck (no logs, not applying any changes or taking any instruction)
  • 14:58 moritzm: powered down several hosts for hardware maintenance (T142726): mw2099, mw2102, mw2117, mw2163-mw2199
  • 14:42 mobrovac: restbase deploy end of 9cca320
  • 14:40 moritzm: powered down several hosts for hardware maintenance (T142726): mw2087, mw2149-mw2151
  • 14:39 elukey: upgrading httpd/apache to 2.4.10-10+deb8u6+wmf2 on mw128[56]
  • 14:27 mobrovac: restbase deploy start of 9cca320
  • 14:11 godog: wipe and reinitialize corrupted xfs on /dev/sdn1 on ms-be1016
  • 13:53 eileen: turned off queue for http://localhost:9000/job/GlobalCollect%20Recurring%20Donations/ in jenkins
  • 13:42 elukey: upgrading httpd/apache to 2.4.10-10+deb8u6+wmf2 on mw128[78]
  • 13:41 bblack: uploaded openssl-1.1.0-1+wmf1 to jessie-wikimedia/experimental
  • 13:39 elukey: not upgrading mw130[01] since I'd need more info before proceeding
  • 13:34 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.17/includes/page/WikiPage.php: T144484 (duration: 00m 49s)
  • 13:30 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.17/includes/page/WikiPage.php: T144484 (duration: 00m 35s)
  • 13:16 elukey: upgrading httpd/apache to 2.4.10-10+deb8u6+wmf2 on mw130[01].eqiad.wmnet
  • 13:01 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: name=elastic1047.eqiad.wmnet
  • 12:53 logmsgbot: ema@palladium conftool action : set/pooled=yes; selector: cp4005.ulsfo.wmnet (tags: ['dc=ulsfo', 'cluster=cache_upload', 'service=varnish-be'])
  • 12:12 gehel: rolling restart of ferm on elasticsearch eqiad cluster to account for moved servers - T143685
  • 10:50 moritzm: installing libidn security updates
  • 10:39 godog: reboot ms-be1016, stuck again
  • 10:24 logmsgbot: hashar@tin Synchronized docroot/noc/index.html: link to conftool and wikitech pages on https://noc.wikimedia.org/ (duration: 00m 47s)
  • 10:17 gehel: repooled elastic104[456] - T144450
  • 09:58 jynus: adding marostegui to wmf and ops on wikitech LDAP
  • 09:57 logmsgbot: gehel@palladium conftool action : set/pooled=inactive; selector: elastic1046.eqiad.wmnet
  • 09:57 logmsgbot: gehel@palladium conftool action : set/pooled=inactive; selector: elastic1045.eqiad.wmnet
  • 09:56 logmsgbot: gehel@palladium conftool action : set/pooled=inactive; selector: elastic1044.eqiad.wmnet
  • 09:51 moritzm: reimaging mw2200-2203 to jessie
  • 09:37 moritzm: reimaging mw2061-2064 to jessie
  • 09:35 elukey: reimaging mw2167 -> mw2170 to Jessie
  • 09:24 moritzm: reimaging mw2163-2166 to jessie
  • 09:06 moritzm: installing postgres security updates on labsdb1004/1006/1007
  • 09:04 godog: reboot ms-be1016, stuck and nothing on console
  • 07:45 moritzm: reimaging mw2116-2119 to jessie
  • 03:12 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Sep 1 03:12:34 UTC 2016 (duration 7m 14s)
  • 03:05 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.17) (duration: 17m 59s)
  • 02:30 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.16) (duration: 12m 21s)
  • 02:22 bblack: ulsfo depool
  • 01:00 Jeff_Green: reboot db1025 for kernel update
  • 00:29 eileen: CiviCRM upgrade from d067c47 to 0c6bf38
  • 00:13 eileen: fr campaigns disabled
  • 00:09 eileen: stop Donations q consumer job on jenkins


Archives