Server Admin Log

From Wikitech

2016-06-28

  • 23:10 logmsgbot: twentyafterfour@tin Started scap: sync new branch, testwiki to php-1.28.0-wmf.8 refs T137492
  • 23:10 Krenair: wikitech-static working now, poke me on IRC or file a #wikitech.wikimedia.org ticket if you find any issues
  • 23:10 twentyafterfour: syncing new branch 1.28.0-wmf.8 refs T137492
  • 23:04 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.7/extensions/EventBus/EventBus.php: SWAT: EventBus: Match the expected format of response log key (duration: 00m 31s)
  • 23:01 Krenair: Updating MW version on wikitech-static to 1.27 (LTS) - https://lists.wikimedia.org/pipermail/mediawiki-announce/2016-June/000191.html
  • 21:59 halfak: deploying ores beec291
  • 21:33 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.7
  • 21:31 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.7/extensions/AbuseFilter/: deploy https://gerrit.wikimedia.org/r/#/c/296464/ refs T138550 T136973 (duration: 00m 36s)
  • 21:24 twentyafterfour: deploying wmf.7 yet again, once CI finishes testing https://gerrit.wikimedia.org/r/#/c/296464/ refs T138550 T136973
  • 20:24 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: once again rolling back to wmf.6 refs T136973 T138550
  • 20:11 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.7
  • 20:09 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.7/extensions/AbuseFilter/: deploying https://gerrit.wikimedia.org/r/#/c/296440/ refs T138550, T136973 (duration: 02m 06s)
  • 20:09 twentyafterfour: deploying https://gerrit.wikimedia.org/r/#/c/296440/ to hopefully unblock wmf.7 deployments. refs T138550, T136973
  • 20:08 gehel: disabling puppet on wdqs100[12] to clean up after failed scap3 deployment
  • 19:33 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: Rolling back to wmf.6: save time regression is still present in wmf.7
  • 19:32 twentyafterfour: Rolling back to wmf.6: T138550 is still a problem
  • 19:24 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.7
  • 19:23 twentyafterfour: Deploying 1.28.0-wmf.7 to all wikis
  • 18:23 mutante: zosma - fresh install, sign puppet certs, initial puppet run
  • 16:16 gehel: starting rolling restart of elasticsearch codfw cluster (T138811)
  • 15:25 logmsgbot: thcipriani@tin Synchronized portals: SWAT: Bumping portals to master (T136874) (duration: 00m 29s)
  • 15:24 logmsgbot: thcipriani@tin Synchronized portals/prod/wikipedia.org/assets: SWAT: Bumping portals to master (T136874) (duration: 00m 24s)
  • 15:16 logmsgbot: thcipriani@tin Synchronized dblists/visualeditor-default.dblist: SWAT: Enable VisualEditor by default for all users of the French (T136993), English (T136992), and German (T136991) Wikivoyage (duration: 00m 24s)
  • 15:09 logmsgbot: thcipriani@tin Synchronized dblists/visualeditor-default.dblist: SWAT: Enable VisualEditor by default for all users of the Italian Wikivoyage (T136994) (duration: 00m 25s)
  • 14:52 gehel: powercycling elastic1004 (server not coming up during restart - T138811)
  • 13:47 godog: bounce carbon on graphite machines after applying https://gerrit.wikimedia.org/r/266567
  • 13:40 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: aqs1001.eqiad.wmnet
  • 12:50 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1023, 24, 33, 35, 39, 44, 52, 61, 64, 68, 63, 67, 72, 73 (duration: 02m 39s)
  • 12:00 gehel: powercycling elastic1002 (server not coming up during restart - T138811)
  • 11:43 gehel: powercycling elastic1001 (server not coming up during restart - T138811)
  • 11:21 gehel: rolling restart of elasticsearch eqiad
  • 10:44 moritzm: rolling reboot of mediawiki in codfw for kernel security update
  • 09:39 moritzm: powercycling mw1021, didn't come up after reboot
  • 09:32 elukey: restarted hhvm on mw1238, memory pressure ok but hhvm stuck (hhvm-dump-debug in /tmp/hhvm.14788.bt.)
  • 09:28 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: aqs1003.eqiad.wmnet
  • 09:25 moritzm: powercycling mw1019, didn't come up after reboot
  • 09:25 logmsgbot: reedy@tin Synchronized wmf-config/interwiki.php: Updated IW map (duration: 00m 49s)
  • 09:13 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: aqs1003.eqiad.wmnet
  • 08:57 moritzm: powercycling mw1018, didn't come up after reboot
  • 08:47 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Prepare old servers for decom by sending all queries to new servers (duration: 01m 39s)
  • 08:32 moritzm: rolling reboot of mediawiki canaries for kernel security update
  • 08:30 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: aqs1002.eqiad.wmnet
  • 08:17 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: aqs1002.eqiad.wmnet
  • 08:15 elukey: rebooting aqs100[23].eqiad for kernel upgrades
  • 02:54 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jun 28 02:54:56 UTC 2016 (duration 7m 16s)
  • 02:47 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 08m 59s)
  • 02:27 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 51s)
  • 00:26 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.7/includes/api/ApiMain.php: UsageException to try to catch T138585 issue (duration: 00m 27s)
  • 00:21 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Enable Wikibase descriptions on Catalan and Polish wikis (T135429) (duration: 00m 26s)
  • 00:09 logmsgbot: dereckson@tin Synchronized wmf-config/mobile.php: Introduce config variable to control tagline (T138738, 2/2) (duration: 00m 27s)
  • 00:08 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Introduce config variable to control tagline (T138738, 1/2) (duration: 00m 32s)
  • 00:07 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings-labs.php: Introduce config variable to control tagline (no-op) (duration: 00m 27s)
  • 00:05 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.6/extensions/MobileFrontend/: Introduce config variable to control tagline (T138738) (duration: 00m 29s)
  • 00:02 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.7/extensions/MobileFrontend/: Introduce config variable to control tagline (T138738) (duration: 00m 39s)

2016-06-27

  • 20:13 mdholloway: mobileapps deployed 30cc12e
  • 20:08 subbu: finished deploying parsoid sha dd8e644d
  • 20:04 subbu: synced new parsoid code; restarted parsoid on wtp1001 as a canary
  • 20:01 subbu: starting parsoid deploy
  • 17:23 gehel: deploying new logstash config for transition to elasticsearch 2.x (T138335)
  • 15:21 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Increase move rate limit for extendedmovers in enwiki to 16/60 (duration: 00m 28s)
  • 15:19 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: SWAT: Delete old throttle rules (duration: 00m 26s)
  • 15:16 gehel: banning elastic1001 to prepare its decommissioning (T138329)
  • 15:13 logmsgbot: thcipriani@tin Synchronized dblists/clldefault.dblist: SWAT: Deploy Compact Language Links as default (Stage 3) PART II (duration: 00m 23s)
  • 15:07 logmsgbot: thcipriani@tin Synchronized wmf-config: SWAT: Deploy Compact Language Links as default (Stage 3) (duration: 00m 40s)
  • 15:00 elukey: mw1136 powercycled - not responsive to ssh and root login
  • 14:49 logmsgbot: gehel@palladium conftool action : set/pooled=no; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch,name=elastic101[0-6].eqiad.wmnet
  • 14:39 logmsgbot: gehel@palladium conftool action : set/pooled=no; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch,name=elastic100[0-9].eqiad.wmnet
  • 14:37 logmsgbot: gehel@palladium conftool action : get/pooled; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch,name=elastic100[0-9]..eqiad.wmnet
  • 14:34 gehel: removing old elasticsearch servers in eqiad from LVS (elastic1001-1016 - T138329)
  • 10:10 moritzm: pooled mw1291 (jessie imagescaler)
  • 09:48 jynus: stopping and reimporting db2010 (m1)
  • 09:47 gehel: removing maps-test*.codfw.wmnet servers from LVS (T138092)
  • 09:19 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch-ssl,name=elastic104..eqiad.wmnet
  • 09:19 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch,name=elastic104..eqiad.wmnet
  • 09:18 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch-ssl,name=elastic103..eqiad.wmnet
  • 09:18 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch,name=elastic103..eqiad.wmnet
  • 09:10 logmsgbot: gehel@palladium conftool action : get/pooled; selector: elastic10??\.eqiad\.wmnet (tags: ['dc=eqiad', 'cluster=elasticsearch', 'service=elasticsearch'])
  • 09:07 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: elastic1032.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=elasticsearch', 'service=elasticsearch-ssl'])
  • 09:06 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: elastic1032.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=elasticsearch', 'service=elasticsearch'])
  • 09:00 gehel: adding new elasticsearch servers in eqiad to LVS
  • 08:54 godog: swift codfw-prod ms-be202[234] weight 2000
  • 07:15 elukey: puppet stopped on analytics1049 to remove it completely from the Hadoop cluster - broken disk
  • 02:51 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jun 27 02:51:41 UTC 2016 (duration 7m 5s)
  • 02:44 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 08m 09s)
  • 02:27 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 54s)

2016-06-26

  • 02:52 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jun 26 02:52:48 UTC 2016 (duration 6m 19s)
  • 02:46 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 08m 15s)
  • 02:28 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 48s)

2016-06-25

  • 09:37 mutante: install2001 killing ganglia aggregator processes, running puppet, for debugging
  • 02:51 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jun 25 02:51:43 UTC 2016 (duration 6m 26s)
  • 02:45 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 07m 58s)
  • 02:28 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 53s)
  • 01:07 chasemp: sign labstore1005 puppet certs and bootstrap the server
  • 00:53 chasemp: hand hack apache on labmon to make it work temporarily

2016-06-24

  • 18:41 logmsgbot: krenair@tin Synchronized dblists/mobilemainpagelegacy.dblist: https://gerrit.wikimedia.org/r/#/c/295958/4 - fix mobile main page rendering on a bunch of wikis, effectively putting them back to how they were a few days ago (duration: 00m 37s)
  • 17:19 mobrovac: change-prop deploying df88a75b
  • 17:05 _joe_: re-started changeprop after disabling the dependency module
  • 14:18 paravoid: shutting down ms-fe3002 due to on-site work
  • 14:05 logmsgbot: krinkle@tin Synchronized php-1.28.0-wmf.7/includes/OutputPage.php: T138586 hotfix (duration: 00m 47s)
  • 14:02 mobrovac: scb100x disabled puppet to clear changeprop queues
  • 13:22 gehel: re-enabling puppet on maps1002 (still in pre-configuration state, only default role)
  • 12:34 hashar: Random resource loader entries are apparently faulty causing issues with css and/or javascript T138586
  • 12:04 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: aqs1001.eqiad.wmnet
  • 12:03 elukey: rebooting aqs1001.eqiad.wmnet for kernel upgrades
  • 10:55 jynus: updated m1-slave dns to be db1001
  • 10:20 hashar: gallium: restarted apache2 , potentially stuck proxy
  • 10:18 moritzm: upgrade nodejs on scb systems in codfw and restart node-based services
  • 09:59 ema: nginx rolling restart to enable TFO on all tlsproxies (T108827)
  • 09:52 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1059 with low weight, increase weight of db1061, db1062 (duration: 00m 33s)
  • 09:48 moritzm: upgrade nodejs on restbase test systems (xenon/praseodymium/cerium/restbase-test) and restart restbase on those
  • 09:09 mobrovac: scb100x stopping puppet to stop change-prop and clear the queue
  • 08:29 moritzm: uploaded nodejs 4.4.6 for jessie-wikimedia to carbon
  • 07:10 elukey: memcached on mc1007 restarted with growth factor 1.05 (T129963)
  • 03:54 robh: data copy for labmon1001 verified complete with proper permissions, re-enabling and running puppet to start back up services
  • 03:19 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jun 24 03:19:55 UTC 2016 (duration 7m 4s)
  • 03:12 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 17m 24s)
  • 02:38 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 17m 08s)
  • 01:22 bblack: stream.wikimedia.org (RCStream) DNS moved to cache_misc termination. If anyone reports bugs with rcstream services, revert https://gerrit.wikimedia.org/r/295385

2016-06-23

  • 23:17 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/295600/ (duration: 00m 29s)
  • 23:15 logmsgbot: maxsem@tin Synchronized dblists/mobilemainpagelegacy.dblist: https://gerrit.wikimedia.org/r/#/c/295600/ (duration: 00m 28s)
  • 22:33 chasemp: reimage labstore1005 post io testing
  • 22:12 chasemp: powercycle labstore1005
  • 21:24 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group2 wikis to wmf.6
  • 21:11 chasemp: silence alerts for labstore1004 for setup
  • 20:31 ebernhardson: synced out latest logstash-plugins via trebuchet
  • 20:17 Dereckson: Run initSiteStats.php on cebwiki (T138533)
  • 20:04 logmsgbot: jzerebecki@tin Synchronized wmf-config/CommonSettings.php: Log PHP/HHVM errors in CLI mode to stderr, not stdout T138291 (duration: 00m 28s)
  • 20:03 robh: labmon1001 data restore at 100gb 50minutes in, 298gb total for restoration
  • 19:29 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.7
  • 19:24 greg-g: 19:21 < RoanKatto>  !log Synced patches for T137288 and T137593
  • 18:31 elukey: mw130[0134] - new jobrunners installed and pooled (happened automatically after the first puppet run)
  • 18:09 robh: labmon1001 powering down for reimage
  • 17:45 subbu: finished deploying parsoid sha 18022c96
  • 17:40 subbu: synced new code; restarted parsoid on wtp1001 as a canary
  • 17:37 subbu: starting parsoid deploy
  • 17:29 robh: labmon1001 copy changed back to local usb, errors on network transfer for ownership. resumed rsync with append flag to local usb disk.
  • 17:03 bblack: cache perf tuning marker: start rollout of tcp_no_metrics_save:0
  • 16:27 chasemp: remove old log files on ytterbium for T114395
  • 16:18 godog: swift: add ms-be202[234] weight 1000 - T136630
  • 15:31 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings-labs.php: SWAT: LABS: Enable geoshapes graph protocol (duration: 00m 29s)
  • 15:26 akosiaris: stop etherpad-lite, etherpad is down
  • 15:16 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy Compact Language Links as default (Stage 2) PART III (duration: 00m 24s)
  • 15:16 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Deploy Compact Language Links as default (Stage 2) PART II (duration: 00m 28s)
  • 15:15 logmsgbot: thcipriani@tin Synchronized dblists/clldefault.dblist: SWAT: Deploy Compact Language Links as default (Stage 2) PART I (duration: 00m 41s)
  • 15:11 robh: puppet disabled on labmon1001 along with all icinga alerting. data migration to usb in progress via root screen session
  • 15:05 robh: starting data backup of labmon1001, halting statsite/graphite/carbon-relay on system
  • 14:47 akosiaris: change the default message in etherpad to indicate problems
  • 14:47 mobrovac: change-prop deploying 05c72ed24ca
  • 14:45 akosiaris: debugging etherpad. Started the service with a blank db, looks like it's working
  • 14:38 akosiaris: stopping etherpad-lite on etherpad1001, disabling puppet
  • 14:32 jynus: restarting etherpad-lite.service
  • 13:53 hashar: Zuul/CI are slowly catching up. I had to drop a few changes that got force merged on the SmashPig repo.
  • 13:37 awight: update SmashPig from a435adeb130217bda8b95d3c5c6331ace8ad1228 to 917138e159f0341e3dfbb35818c3ce479927875b
  • 13:36 hashar: CI is slowed down due to surge of jobs and lack of instances to build them on ( T133911 ). Queue is 50 for Jessie and 25 for Trusty.
  • 13:30 jynus: db1059 backup and reimage
  • 13:28 awight: update SmashPig from c0cc2a1a6062ad8d114473ea1a444786a0d50833 to a435adeb130217bda8b95d3c5c6331ace8ad1228
  • 13:16 jynus: running scap pool on mw1301
  • 13:13 mobrovac: restarting zotero on sca, 6g mem
  • 13:13 jynus: running scap pool on mw1300
  • 13:11 mobrovac: citoid deploying 0129ab0b
  • 13:11 elukey: purged some puppet output logs on compiler02.puppet3-diffs.eqiad.wmflabs to free space (disk full)
  • 13:09 moritzm: depooled jessie image scaler (mw1291) again, works fine, to be permanently pooled on Monday
  • 12:49 moritzm: pooling new jessie image scaler mw1291 for short production smoke testing
  • 12:35 awight: update SmashPig from f7d65c54bed3ff9c478b0dbcaa1b2d27cc665ace to c0cc2a1a6062ad8d114473ea1a444786a0d50833
  • 12:18 awight: update SmashPig from 90757321a3bfa1045202e06e3dd1960a0043493a to f7d65c54bed3ff9c478b0dbcaa1b2d27cc665ace
  • 12:07 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1059; Repool db1061 & db1062; increase weight of db1068 (duration: 00m 39s)
  • 11:33 gehel: rolling restart of elasticsearch10(01|30|08|36|13|40) to activate new masters
  • 10:13 andrewbogott: restarting rabbitmq-server on labcontrol1001 (random debugging attempt for T138106)
  • 09:49 godog: reimage ms-be202[567] with incorrect raid settings
  • 09:11 jynus: syncing etherpadlite.store (m1) on db2010, which had 2 bad chunks
  • 08:39 mobrovac: change-prop restarting on scb to pick up ores rules https://gerrit.wikimedia.org/r/295576
  • 08:06 mobrovac: change-prop deploying 45db4f84827
  • 06:59 moritzm: installing spice security updates
  • 02:48 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jun 23 02:47:59 UTC 2016 (duration 6m 44s)
  • 02:41 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 07m 05s)
  • 02:26 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 11m 19s)

2016-06-22

  • 23:24 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/295560/ (duration: 00m 25s)
  • 23:23 logmsgbot: maxsem@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/295560/ (duration: 00m 24s)
  • 23:23 logmsgbot: maxsem@tin Synchronized dblists/mobilemainpagelegacy.dblist: https://gerrit.wikimedia.org/r/#/c/295560/ (duration: 00m 24s)
  • 23:14 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/294247/ (duration: 00m 24s)
  • 23:09 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/295558/ (duration: 00m 40s)
  • 22:25 ori: Ran hacked maintain-replicas.pl on labsdb100[13] for T135029
  • 21:06 bblack: cache perf: start deploy of -autocorking (probably last experiment I can squeeze in today)
  • 21:00 Dereckson: Run namespaceDupes.php on ptwikinews (T138230) and frwikinews (T138442)
  • 20:33 mdholloway: mobileapps: finished deploying 8046ee2
  • 20:26 yurik: deployed & restarted tilerator https://gerrit.wikimedia.org/r/#/c/295447/
  • 20:25 mdholloway: starting mobileapps deployment
  • 20:20 Reedy: created tmplog_begin_devices and tmplog_end_devices on testwiki.cn_template_log
  • 20:18 yurik: deployed & restarted kartotherian https://gerrit.wikimedia.org/r/#/c/295449/
  • 19:32 bblack: start rollout of first batch of cache sysctl stuff (un-mysterious + disable prequeue timestamps)
  • 19:29 jynus: archiving and dropping reviewdb on m1 shard
  • 19:06 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.7
  • 18:46 jynus: shutting down and reimaging db1001
  • 18:20 papaul: ms-be202[3-7] - signing puppet certs, salt-key, initial run
  • 17:23 akosiaris: restart apache on ununpentium for m1 migration. Hosts RT, just did it for good measure
  • 17:21 akosiaris: restarted bacula-director on helium
  • 17:15 jynus: killing puppet, rt, librenms user connections on db1001
  • 17:10 jynus: failovered m1-master from db1001 to db1016
  • 16:20 gehel: new elasticsearch servers elastic1032-1047 are configured and have joined the eqiad cluster
  • 15:26 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.6/extensions/OATHAuth: SWAT: Fixup qrcode-generating js, to stop race condition. (duration: 00m 33s)
  • 15:23 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Improve style (duration: 00m 33s)
  • 15:18 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.7/extensions/OATHAuth: SWAT: Fixup qrcode-generating js, to stop race condition. (duration: 00m 27s)
  • 15:13 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add www.wpc.ncep.noaa.gov to wgCopyUploadsDomains (duration: 00m 54s)
  • 15:01 elukey: rebooting bohrium.eqiad.wmnet (running piwik) for kernel upgrades
  • 14:32 jynus: checksumming m1 databases in preparation for failover
  • 14:29 tgr: running https://phabricator.wikimedia.org/diffusion/ECAU/browse/master/maintenance/checkLocalUser.php for some users T119736
  • 14:04 moritzm: rolling restart of hhvm/apache on app servers in eqiad for expat security update
  • 13:42 godog: add 500G to fluorine /a (almost full)
  • 13:31 gehel: configuring new elasticsearch servers elastic1038-1042 in eqiad
  • 13:03 hashar: Manually moved some missing build records. Restarting Jenkins
  • 12:49 hashar: T80385 Restarting Jenkins with builds dir set to "${JENKINS_HOME}/builds/${ITEM_FULL_NAME}" which is /var/lib/jenkins/builds/XXX
  • 12:35 gehel: starting reimage of mw1292
  • 12:34 _joe_: disabling puppet on mw1017, live-hacking it
  • 12:34 hashar: T80385 stopping Jenkins and migrating all build records to /var/lib/jenkins/builds
  • 12:06 gehel: configuring new elasticsearch servers elastic1033-1037 in eqiad
  • 10:46 godog: upload libphutil/arcanist 0~git20160620-0wmf1 to carbon
  • 10:32 elukey: mw1140 powercycle after freeze issues due to memory pressure (was not able to ssh to it)
  • 10:18 moritzm: rolling restart of restbase in eqiad to pick up firejail change in service::node
  • 09:46 moritzm: rolling restart of restbase in codfw to pick up firejail change in service::node
  • 09:43 legoktm: live-hacking on mw1017 to debug T115119
  • 09:19 jynus: stopping and reconfiguring mysql on dbstore1001
  • 07:59 moritzm: rolling restart of hhvm/apache on canary app servers in eqiad for expat security update
  • 07:30 jynus: stopping, backing up and reimaging db1061 and db1062
  • 07:06 moritzm: restarted hhvm on mw1131
  • 04:29 chasemp: fix salt key on labtestmetal2001
  • 03:12 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jun 22 03:12:33 UTC 2016 (duration 6m 44s)
  • 03:05 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 17m 49s)
  • 02:31 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 24s)

2016-06-21

  • 23:14 yurik: updated/restarted kartotherian & tilerator - https://gerrit.wikimedia.org/r/#/c/295440/ https://gerrit.wikimedia.org/r/#/c/295441/
  • 23:05 tgr: deleted localuser rows for Mahir256@orwikisource and A879071@enwiki for T119736
  • 22:19 bd808: Backfilled missing 2016-06-20 data to https://tools.wmflabs.org/sal/production?d=2016-06-20
  • 22:08 logmsgbot: ori@tin Synchronized static/images/mobile: I8f09e825: Optimize mobile static images (duration: 00m 34s)
  • 19:27 bd808: Restarted dead logstash process on logstash1001. Looks to have stopped itself due to the Elasticsearch OOM earlier
  • 19:18 logmsgbot: thcipriani@tin Purged l10n cache for 1.28.0-wmf.5
  • 19:17 bd808: Restarted ElasticSearch on logstash1001; dead from OOM
  • 19:14 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.28.0-wmf.7
  • 18:50 bblack: enabled tcp_notsent_lowat optimization on all caches (marking this time for investigation of perf graphs later) - https://gerrit.wikimedia.org/r/#/c/295376/
  • 17:16 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.7/extensions/Graph/lib/graph2.compiled.js: pre-train backport: Updated to latest graph2 lib (duration: 00m 31s)
  • 17:10 yurik_: deployed graphoid https://gerrit.wikimedia.org/r/#/c/295367/
  • 17:06 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: Temporary IP Cap Lift on es.wiki and commons (duration: 00m 24s)
  • 16:33 yurik_: deployed and restarted graphoid with scap3
  • 16:32 gehel: starting installation of new elasticsearch server elastic1032.eqiad.wmnet
  • 15:58 gehel: puppet run on tin to enable scap3 deployment for graphoid
  • 15:53 logmsgbot: catrope@tin Synchronized php-1.28.0-wmf.7/extensions/Echo/: (no message) (duration: 00m 33s)
  • 15:44 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy Compact Language Links as default (Stage 1) (duration: 00m 25s)
  • 15:42 logmsgbot: thcipriani@tin Synchronized wmf-config/db-eqiad.php: Repool db1068 with low weight; depool db1061 and db1062 (duration: 00m 30s)
  • 15:20 logmsgbot: hashar@tin Finished scap: testwiki to group0 (previously was labtestwiki which does not work) (duration: 51m 45s)
  • 14:47 moritzm: rolling restart of aqs service on aqs1001-aqs1006 to pick up new firejail settings
  • 14:28 logmsgbot: hashar@tin Started scap: testwiki to group0 (previously was labtestwiki which does not work)
  • 14:14 moritzm: correction: restbase1007 was already depooled for cassandra maintenance, thus only rebooting to 4.4
  • 14:12 moritzm: depooling restbase1007 for upgrade to Linux 4.4
  • 14:09 logmsgbot: hashar@tin scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="labtestwiki" --outdir="/tmp/scap_l10n_87423667" --threads=4 --lang en --quiet' returned non-zero exit status 255 (duration: 02m 58s)
  • 14:06 logmsgbot: hashar@tin Started scap: (no message)
  • 14:03 gehel: disabling alerting for maps100?\.eqiad\.wmnet during initial installation
  • 14:02 logmsgbot: hashar@tin scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="labtestwiki" --outdir="/tmp/scap_l10n_2087727834" --threads=4 --lang en --quiet' returned non-zero exit status 255 (duration: 06m 37s)
  • 13:55 logmsgbot: hashar@tin Started scap: testwiki to 1.28.0-wmf.7 (take three) T136973
  • 13:55 logmsgbot: hashar@tin scap aborted: testwiki to 1.28.0-wmf.7 (take two) T136973 (duration: 01m 35s)
  • 13:53 logmsgbot: hashar@tin Started scap: testwiki to 1.28.0-wmf.7 (take two) T136973
  • 13:53 logmsgbot: hashar@tin scap aborted: testwiki to 1.28.0-wmf.7 T136973 (duration: 04m 17s)
  • 13:48 logmsgbot: hashar@tin Started scap: testwiki to 1.28.0-wmf.7 T136973
  • 13:15 hashar: T136973 applied all security patches to 1.28.0-wmf.7
  • 13:11 RoanKattouw: Running extensions/Echo/maintenance/removeOrphanedEvents.php on all Echo-enabled wikis for T136425
  • 12:57 moritzm: rolling restart of hhvm/apache in codfw for expat security update
  • 12:49 RoanKattouw: Running extensions/Echo/maintenance/backfillReadBundles.php on all Echo-enabled wikis for T136368
  • 12:49 RoanKattouw: Running extensions/Echo/maintenance/backfillReadBundles.php on all Echo-enabled wikis
  • 12:36 hoo: Started a new JSON dump creation on snapshot1003 (after the last one was inconsistent, per T138291)
  • 12:35 gehel: lowering throttling limit for index recovery on codfw elasticsearch cluster
  • 12:33 hoo: Removed Wikidata json dumps from 20160620 (inconsistent, per T138291).
  • 12:30 hashar: T136973 started cut of branch wmf/1.28.0-wmf.7
  • 12:25 gehel: lowering throttling limit for index recovery on eqiad elasticsearch cluster
  • 11:06 jynus: reimaging db1068
  • 10:32 godog: reboot ms-be2003 for disk ordering - T137785
  • 10:22 moritzm: installing expat security updates on Ubuntu systems
  • 10:03 moritzm: installing wget security updates on Ubuntu systems
  • 09:43 gehel: lowering disk high watermark to rebalance elasticsearch eqiad cluster disk space
  • 09:25 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1068; repool db1070 and db1071 as api (duration: 00m 27s)
  • 09:22 moritzm: rolling reboot of logstash cluster to Linux 4.4
  • 07:41 elukey: restarted hhvm on mw1141 - hhvm was getting SEGV (dump in /tmp/hhvm.8735.bt.)
  • 07:39 elukey: restarted hhvm on mw1139 (hhvm-dump in /tmp/hhvm.20736.bt.)
  • 06:41 moritzm: restarted hhvm on mw1252
  • 02:10 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jun 21 02:10:55 UTC 2016 (duration 6m 36s)
  • 02:04 logmsgbot: l10nupdate@tin LocalisationUpdate failed (1.28.0-wmf.6) at 2016-06-21 02:04:19+00:00

2016-06-20

  • 23:22 Dereckson: `mwscript namespaceDupes.php ptwikinews --fix` (T138230). Some links and revisions are still to fix.
  • 23:16 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Fix pt.wikinews namespace issue (T138230) (duration: 00m 24s)
  • 23:13 logmsgbot: dereckson@tin Synchronized wmf-config/mobile.php: Remove old mobile workaround for Wikidata descriptions (T127250, T138085) (duration: 00m 33s)
  • 21:05 logmsgbot: aude@tin Synchronized php-1.28.0-wmf.6/extensions/Wikidata: Fix property suggester (duration: 01m 59s)
  • 19:50 chasemp: cleaning up /scratch NFS share as it ran out of inodes
  • 19:17 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes/api/ApiStashEdit.php: 82e14dc66f478fbdb9ca6eab1eeb4f9c68c99bd1 (duration: 00m 36s)
  • 18:09 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Pool db1071 with low weight after maintenance (duration: 00m 26s)
  • 17:32 bd808: https://tools.wmflabs.org/sal missing events between 2016-06-19T12:29 and 2016-06-20T17:26.
  • 17:26 gehel: deploying latest WDQS
  • 17:19 godog: upload libphutil / arcanist 0~git20160616-0wmf1 to jessie-wikimedia T137770
  • 17:18 mark: Rebooting pfw-codfw
  • 17:00 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: revert cll patch (duration: 00m 25s)
  • 15:44 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow sysops to add to/remove from confirmed on ca.wikinews (duration: 00m 25s)
  • 15:37 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable NewUserMessage on pl.wikipedia (duration: 00m 25s)
  • 15:31 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.6/extensions/CentralAuth: SWAT: queryAttached into cheap and expensive part (duration: 00m 31s)
  • 15:20 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Flow beta feature on frwikiquote (duration: 00m 28s)
  • 15:13 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy Compact Language Links as default (Stage 1) PART III (duration: 00m 30s)
  • 15:12 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Deploy Compact Language Links as default (Stage 1) PART II (duration: 00m 29s)
  • 15:12 logmsgbot: thcipriani@tin Synchronized dblists/cll-nondefault.dblist: SWAT: Deploy Compact Language Links as default (Stage 1) PART I (duration: 00m 29s)
  • 15:11 logmsgbot: jmm@palladium conftool action : select; selector: name=mw1099.eqiad.wmnet
  • 15:04 logmsgbot: thcipriani@tin Synchronized dblists/visualeditor-default.dblist: SWAT: Enable VisualEditor by default for all users of French Wikinews (duration: 00m 29s)
  • 13:27 elukey: restarted hhvm on mw1145 after temp. freeze due to memory pressure (hhvm debug in /tmp/hhvm.17794.bt.)
  • 13:27 paravoid: reactivating peerings with Telia Carrier/AS1299 (eqiad/codfw/ulsfo)
  • 13:06 Amir1: full deployment for 8e65182 in ores nodes
  • 13:04 Amir1: deploying 8e65182 to scb2001
  • 12:56 gehel: installing maps1001.eqiad.wmnet (secondary cluster, no traffic there yet) - T138092
  • 12:56 paravoid: deactivating peerings with Telia Carrier/AS1299 (eqiad/codfw/ulsfo)
  • 12:41 moritzm: rebooting ms1001 for update to Linux 4.4
  • 12:13 Amir1: started deploying ores in scb2001 bdc1e2bd
  • 11:36 godog: roll-restart swift on ms-be1* to apply https://gerrit.wikimedia.org/r/294691
  • 11:27 Amir1: for ores in scb nodes
  • 11:27 Amir1: rolling back ae71d842dfc0958e06922062dd09d49243332a6a
  • 11:13 _joe_: restarting uwsgi ores service
  • 10:58 Amir1: deploying bdc1e2b in ores nodes
  • 10:53 godog: roll-restart swift on ms-be2* to apply https://gerrit.wikimedia.org/r/294691
  • 10:44 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1071 completely (duration: 00m 25s)
  • 10:35 jynus: db1071 stop, backup and reimage
  • 10:31 mobrovac: restbase started mobile-sections dump for eswiki on restbase1009 for T136964
  • 10:05 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1073 at 100% weight; depool db1071 for reimaging (duration: 00m 27s)
  • 09:50 moritzm: rolling reboot of restbase2001/restbase2002 for upgrade to Linux 4.4
  • 08:57 Amir1: deploying 5dfe738 in ores nodes
  • 08:15 moritzm: installing libxslt security updates
  • 07:43 gehel: rebalancing shards on elasticsearch eqiad cluster
  • 06:47 _joe_: activating the jessie jobrunner, mw1299
  • 05:57 logmsgbot: ori@tin Synchronized wmf-config/CommonSettings.php: Id5804a80: Better cache headers for 'Powered by MediaWiki' badge (2/2) (duration: 00m 35s)
  • 05:56 logmsgbot: ori@tin Synchronized static/images: Id5804a80: Better cache headers for 'Powered by MediaWiki' badge (1/2) (duration: 00m 33s)
  • 02:29 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jun 20 02:29:01 UTC 2016 (duration 5m 44s)
  • 02:23 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 09m 54s)

2016-06-19

  • 12:29 elukey: restarted hhvm on mw1138 - trace in /tmp/hhvm.25048.bt, hhvm killed by OOM
  • 12:27 elukey: restarted hhvm on mw1114 - trace in /tmp/hhvm.11092.bt, hhvm killed by OOM
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jun 19 02:31:25 UTC 2016 (duration 5m 47s)
  • 02:25 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 50s)

2016-06-18

  • 02:32 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jun 18 02:32:26 UTC 2016 (duration 6m 18s)
  • 02:26 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 04s)

2016-06-17

  • 21:21 urandom: Reenabling puppet and resetting configuration on xenon.eqiad.wmnet : T137419
  • 20:39 urandom: Restarting Cassandra on xenon.eqiad.wmnet to apply -XX:+PreserveFramePointer : T137419
  • 20:35 urandom: Disabling puppet on xenon.eqiad.wmnet : T137419
  • 20:23 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.6/extensions/WikimediaEvents/: https://gerrit.wikimedia.org/r/#/c/294958/ (duration: 00m 33s)
  • 18:56 urandom: Restarting Cassandra on xenon.eqiad.wmnet with -XX:+PreserveFramePointer : T137419
  • 18:32 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1073 with low weight after reimage (duration: 00m 35s)
  • 16:29 moritzm: installing squid security updates on carbon
  • 15:59 urandom: Starting html dumps from xenon.eqiad.wmnet and cerium.eqiad.wmnet : T137419
  • 15:54 urandom: Restarting Cassandra on xenon.eqiad.wmnet to enable large pages : T137419
  • 14:55 mobrovac: scb disabling puppet for stopping change-prop to clear transclusion queues
  • 14:16 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Increase db1072 weight after repooling (duration: 00m 36s)
  • 12:57 jynus: stopping, backing up and reimaging db1073
  • 12:49 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1072 with low weight, depool db1073 (duration: 00m 27s)
  • 12:49 moritzm: rolling reboot of mw1157-mw1160 into new kernels
  • 12:27 moritzm: restarted hhvm on mw1133 and mw1135
  • 11:14 moritzm: stopping puppet on hosts using service::node (restbase, sca, scb, aqs) for step-by-step rollout of two puppet patches for firejail/service::node
  • 09:31 _joe_: powercycling mw1140, OOMd
  • 09:30 moritzm: rolling reboot of mw1153,mw1155,mw1156 into new kernels
  • 08:29 hashar: Restarting Jenkins on gallium. Web interface at least is deadlocked somehow
  • 07:23 jynus: backing up and reimaging db1072
  • 07:18 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1072 for maintenance (duration: 00m 31s)
  • 07:11 mobrovac: restbase started mobile-sections dump on restbase1009 for T136964
  • 07:02 mobrovac: change-prop restarting it to apply https://gerrit.wikimedia.org/r/294880
  • 06:40 moritzm: installing apache update on palladium
  • 06:16 akosiaris: _joe_ restarted zotero on sca1001
  • 06:16 akosiaris: restarted zotero on sca1002
  • 06:04 logmsgbot: root@palladium conftool action : set/weight=25; selector: cluster=api_appserver,name=mw127.*
  • 05:58 logmsgbot: root@palladium conftool action : set/pooled=yes:weight=20; selector: cluster=api_appserver,name=mw127.*
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jun 17 02:31:00 UTC 2016 (duration 6m 26s)
  • 02:24 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 09m 46s)

2016-06-16

  • 23:44 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.6/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: T137167: TextCat A/B test for Language Identification (duration: 00m 25s)
  • 23:24 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.6/extensions/WikimediaEvents/extension.json: T137167: TextCat A/B test for Language Identification (duration: 00m 24s)
  • 23:19 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.6/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: T137167: TextCat A/B test for Language Identification (duration: 00m 24s)
  • 23:16 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: T137167: search: Dependent config for textcat AB test. (duration: 00m 26s)
  • 23:11 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: T137888: Two permission changes at urwiki (duration: 00m 27s)
  • 23:07 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings-labs.php: T127250: Prepare Wikidata descriptions on mobile for production rollout (duration: 00m 27s)
  • 22:33 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.6/extensions/Kartographer: https://gerrit.wikimedia.org/r/294856 https://gerrit.wikimedia.org/r/294855 (duration: 00m 30s)
  • 22:24 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/294854/ (duration: 00m 26s)
  • 21:15 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.6/extensions/VisualEditor/ApiVisualEditor.php: Pass empty summary to parseAndStash() to avoid warnings T137995 (duration: 00m 39s)
  • 19:05 logmsgbot: hashar@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.6
  • 18:37 tgr: running invalidateUserSessions.php for T137799
  • 18:22 mobrovac: change-prop deploying bc87a1fecfa
  • 16:36 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Set all new slaves to medium weight (300) after warm up (duration: 00m 25s)
  • 15:37 jynus: deleted sqldata.s6 from labsdb1008 - space issues caused by queries creating temporary tables
  • 15:27 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.6/extensions/ORES/includes/Hooks.php: SWAT: Performance boost on hidenondamaging (duration: 00m 35s)
  • 15:23 moritzm: rolling reboot of restbase1008 - restbase1011 for upgrade to Linux 4.4
  • 15:21 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.6/extensions/ORES: SWAT: Skip when an edit is errored in PopulateDatabase.php (duration: 00m 30s)
  • 15:04 logmsgbot: root@palladium conftool action : set/pooled=yes; selector: name=mw1262.eqiad.wmnet
  • 14:31 twentyafterfour: re-enabled and ran puppet agent --test on iridium. Everything appears to be normal.
  • 13:04 mobrovac: scb1001 enabled puppet back
  • 12:57 gehel: rebalancing shards on elasticsearch eqiad cluster
  • 12:33 Amir1: manually restarted celery-ores-worker in scb1001
  • 12:32 moritzm: installing apache2 trusty update on graphite1001
  • 12:32 Amir1: manually restarted celery-ores-worker in scb1002
  • 12:10 moritzm: restarted hhvm on mw1137, got stuck
  • 10:44 moritzm: depooling mw1154 for kernel update/reboot
  • 10:14 mobrovac: scb1001 disabling puppet for a while to manually test changeprop with transclusion rules
  • 09:59 mobrovac: restbase deploy end of ebeaa46
  • 09:56 _joe_: powercycling mw1143, unresponsive on ssh, console
  • 09:48 mobrovac: restbase deploy start of ebeaa46
  • 09:18 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.6/extensions/MobileFrontend: MobileFrontend RL registration issue preventing Special:Nearby from working properly T137919 (duration: 00m 36s)
  • 08:41 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Pool db1085, increase weight of all new db servers (duration: 00m 29s)
  • 08:15 jynus: rebooting db1085 before putting it back into production
  • 02:34 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 15m 49s)
  • 00:57 twentyafterfour: puppet disabled on iridium because https://gerrit.wikimedia.org/r/#/c/294653/ needs to merge (hotfix in preamble.php which puppet will undo if it's allowed to run)
  • 00:43 twentyafterfour: phabricator upgrade/maintenance complete. Everything appears to be back up and running normally.
  • 00:41 twentyafterfour: taking phabricator offline momentarily for scheduled maintenance.
  • 00:24 robh: mw1147 rebooted and manually running scap pull
  • 00:21 robh: mw1147 seems to have died during scap, unresponsive from serial console, powercycled
  • 00:16 logmsgbot: mattflaschen@tin Synchronized php-1.28.0-wmf.6/extensions/Kartographer: Search for maplinks inside and outside of content. (duration: 01m 08s)

2016-06-15

  • 23:38 logmsgbot: mattflaschen@tin Synchronized php-1.28.0-wmf.6/extensions/Echo: Sync Echo fix for cross-wiki notifications: 62324e3 (duration: 00m 33s)
  • 21:32 logmsgbot: aaron@tin Synchronized wmf-config/filebackend-production.php: Set "sync" filebackend replication to measure latency effect (duration: 00m 25s)
  • 21:27 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes/libs/objectcache/WANObjectCache.php: faff8f1ef1bfefd1804a3f46e58566711faa3224 (duration: 00m 27s)
  • 21:16 dapatrick: Deployed patch for T137264 to wmf.5 and wmf.6
  • 20:17 logmsgbot: hashar@tin Synchronized wmf-config/throttle.php: Temporary IP Cap Lift on es.wiki T137917 (duration: 00m 30s)
  • 20:09 subbu: finished deploying parsoid sha 3445eceb
  • 20:05 bblack: cache frontend restarts complete
  • 20:04 subbu: synced new code; restarted parsoid on wtp1001 as a canary
  • 20:02 subbu: starting parsoid deploy
  • 19:25 bblack: rolling restart of global varnish frontends (salt -b 1: depool -> sleep 15 -> restart -> repool) - estimated ~35 mins to completion - T107236 (...._
  • 19:15 bblack: varnish frontend restart halted - v4 compat issue to address :P
  • 19:11 bblack: rolling restart of global varnish frontends (salt -b 1: depool -> sleep 15 -> restart -> repool) - estimated ~30 mins to completion - T107236
  • 19:05 logmsgbot: hashar@tin rebuilt wikiversions.php and synchronized wikiversions files: (no message)
  • 18:54 ori: Started MySQL on es2019 (T130702)
  • 16:32 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1023; pool db1085 (disabled), db1088, db1092 w/low weight (duration: 00m 25s)
  • 16:07 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix autopatrolled group for ko.wikipedia (duration: 00m 31s)
  • 16:00 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.6/resources/src/mediawiki.special/mediawiki.special.search.styles.css: SWAT: Explicitly specify the width of the search input on Special:Search (duration: 00m 25s)
  • 15:53 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add autopatrolled group in kowiki (duration: 00m 24s)
  • 15:33 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy ORES beta feature in wikidatawiki (duration: 00m 24s)
  • 15:23 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: kafka1002.eqiad.wmnet
  • 15:23 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/ORES: SWAT: Skip when an edit is errored in PopulateDatabase.php (duration: 00m 27s)
  • 15:17 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Send authentication events to logstash (duration: 00m 28s)
  • 15:15 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: kafka1002.eqiad.wmnet
  • 15:11 logmsgbot: thcipriani@tin Synchronized wmf-config/logging.php: SWAT: Fix logging config for authmanager metrics channel rename (duration: 00m 24s)
  • 15:10 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: kafka1001.eqiad.wmnet
  • 15:06 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: SWAT: Remove old throttle rules (duration: 00m 30s)
  • 15:00 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: kafka1001.eqiad.wmnet
  • 15:00 mobrovac: scb disabled puppet for stopped change-prop during kafka nodes upgrade
  • 15:00 elukey: rebooting Eqiad Event Bus for kernel upgrades (one node at a time)
  • 14:24 moritzm: installing php security updates on jessie systems
  • 13:55 moritzm: remove unused PHP packages from the recently provisioned jessie app servers (new installations are fixed in puppet to only install php5-cli, but the initial set needs to be fixed up manually)
  • 13:40 gehel: rolling back update of firejail on maps2001
  • 13:16 _joe_: stopped jobchron, jobrunner on mw1299, masked in systemd
  • 13:15 mobrovac: change-prop deployed 6ad337
  • 13:06 moritzm: installing libav security updates
  • 12:37 _joe_: rebooting mw1299
  • 12:06 gehel: upgrade of firejail on maps server stopped, pending a patch to service::node
  • 11:46 mobrovac: scb enabled puppet back
  • 11:44 gehel: upgrading firejail to 0.9.38 on maps servers
  • 11:32 mobrovac: scb disabled puppet for 5 min to keep change-prop down
  • 11:30 mobrovac: change-prop deploying 353b926
  • 11:29 jynus: stopping db1023 for cloning to new s6 hosts
  • 11:22 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Increase new enwiki dbs weight, depool db1023 for cloning (duration: 00m 27s)
  • 11:13 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1033, first pool of db1079, db1086, db1094 with low weight (duration: 00m 25s)
  • 11:11 moritzm: enabled firejail wrapper for imagemagick's convert (for image scalers and the Score extension)
  • 10:59 paravoid: rebooting install2001 again
  • 10:48 logmsgbot: jmm@tin Synchronized wmf-config/CommonSettings.php: firejail security hardening for image scalers (duration: 00m 26s)
  • 09:48 godog: bounce ms-be2003, xfs high load
  • 09:13 moritzm: repooled mw1154 (kernel still the same ATM)
  • 08:53 moritzm: depooling mw1154 (image scaler) for kernel update
  • 08:29 jynus: turning down db1033 for cloning to new s7 slaves
  • 08:15 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1033 for cloning (duration: 00m 38s)
  • 06:59 moritzm: installing apache trusty updates on eqiad app servers
  • 03:51 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes/parser/Parser.php: 4e6e1bc1f2de000f0fdd84dcf04f63a21127d24a (duration: 00m 30s)
  • 03:49 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes/parser/Parser.php: 23bac8905a9d60cdc0a068ca025644e091b9027f (duration: 00m 32s)
  • 03:10 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jun 15 03:10:57 UTC 2016 (duration 6m 55s)
  • 03:04 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 16m 29s)
  • 02:30 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 12m 24s)
  • 02:29 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.5/extensions/Scribunto/engines/LuaCommon/TitleLibrary.php: revert: ad-hoc debug of vary-revision in scribunto (duration: 00m 29s)
  • 02:22 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.5/extensions/Scribunto/engines/LuaCommon/TitleLibrary.php: ad-hoc debug of vary-revision in scribunto (duration: 00m 26s)
  • 01:51 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.5/resources/src/mediawiki.action/mediawiki.action.edit.stash.js: Idfad8407: Improve client-side edit stash change detection (duration: 00m 24s)
  • 01:31 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes/parser: 78de24a20c4662ea709e1f8af84bb5fae4aea2fa (duration: 00m 33s)
  • 01:30 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes/parser: 48652dfc27d1bbaab41b3a4d8f7d6be23e2da6b6 (duration: 00m 34s)

2016-06-14

  • 23:40 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.6/resources/src/mediawiki.action/mediawiki.action.edit.stash.js: Idfad8407c8e: Improve client-side edit stash change detection (duration: 00m 25s)
  • 23:30 logmsgbot: ori@tin Synchronized wmf-config/InitialiseSettings.php: Id800a9d35b: Set import sources for he.wikipedia (T137074) and If66f307a2e: Set import sources for pt.wikinews (T137633) (duration: 00m 27s)
  • 23:28 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.6/extensions/AntiSpoof: I2e407a3ac8: Revert "Make sure AntiSpoof mappings are mapping in the correct direction." (duration: 00m 27s)
  • 23:15 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.6/extensions/Echo: If07369cb1: Allow the primary link to set all bundled notifications as read (T136368) (duration: 00m 34s)
  • 23:09 logmsgbot: ori@tin Synchronized wmf-config/abusefilter.php: I4e5e4d227: Set $wgAbuseFilterConditionLimit = 2000 for commonswiki (T132048) (duration: 00m 28s)
  • 22:43 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes/deferred: 0d038de1414c0b4faed1cc9882151e68d86d3b2d (duration: 00m 25s)
  • 22:15 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes/deferred: 29863094805baed7a5fa493c99c87745ce041f49 (duration: 00m 27s)
  • 21:50 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/resources: 7898fd2fa969342a5cc30df6a5757f4642cd6118 (duration: 00m 28s)
  • 21:44 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes: 7898fd2fa969342a5cc30df6a5757f4642cd6118 (duration: 01m 12s)
  • 21:33 logmsgbot: gehel@palladium conftool action : set/pooled=no; selector: name=maps-test2.*
  • 21:28 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: name=maps-test2*
  • 21:28 gehel: sending traffic back to old maps servers (T137620)
  • 21:10 logmsgbot: gehel@palladium conftool action : set/pooled=no; selector: name=maps-test2*
  • 21:09 logmsgbot: gehel@palladium conftool action : set/pooled=no; selector: maps-test2001.codfw.wmnet (tags: ['dc=codfw', 'cluster=maps', 'service=kartotherian'])
  • 21:09 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: maps2004.codfw.wmnet (tags: ['dc=codfw', 'cluster=maps', 'service=kartotherian'])
  • 21:08 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: maps2003.codfw.wmnet (tags: ['dc=codfw', 'cluster=maps', 'service=kartotherian'])
  • 21:08 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: maps2002.codfw.wmnet (tags: ['dc=codfw', 'cluster=maps', 'service=kartotherian'])
  • 20:58 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: maps2001.codfw.wmnet (tags: ['dc=codfw', 'cluster=maps', 'service=kartotherian'])
  • 20:55 gehel: pooling maps2001 (new map server) - T137620
  • 20:50 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes: ca9068daffb49cc0cdfb84385a29aea34df155cd (duration: 01m 51s)
  • 20:46 gehel: adding new maps servers to LVS
  • 20:09 logmsgbot: demon@tin Finished scap: wikidata submodule update for wmf.6 (duration: 25m 51s)
  • 19:43 logmsgbot: demon@tin Started scap: wikidata submodule update for wmf.6
  • 19:30 logmsgbot: demon@tin Finished scap: group0 to 1.28.0-wmf.6 (duration: 26m 43s)
  • 19:03 logmsgbot: demon@tin Started scap: group0 to 1.28.0-wmf.6
  • 18:56 logmsgbot: demon@tin Purged l10n cache for 1.27.0-wmf.23
  • 18:54 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.4
  • 18:54 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.3
  • 18:54 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.2
  • 18:53 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.1
  • 17:22 Dereckson: Run initSiteStats.php for arcwiki and htwiki (T137827)
  • 16:48 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1040, pool for the first time db1081, db1084, db1091 (duration: 00m 34s)
  • 16:41 godog: reimage ms-fe3002 with jessie T117972
  • 15:57 yurik: deployed & restarted kartotherian (fixing spec.config tests)
  • 15:54 urandom: Restarting cassandra-metrics-collector on restbase1007 : T137304
  • 15:53 logmsgbot: thcipriani@tin Synchronized wmf-config: SWAT: Beta: Enable Compact Language Links for new users (duration: 00m 31s)
  • 15:41 logmsgbot: hashar@tin scap aborted: testwiki to php-1.28.0-wmf.6 and rebuild l10n cache (duration: 01m 31s)
  • 15:40 logmsgbot: hashar@tin Started scap: testwiki to php-1.28.0-wmf.6 and rebuild l10n cache
  • 15:35 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/CentralAuth/includes/CentralAuthHooks.php: SWAT: Account for changed login process (duration: 00m 26s)
  • 15:27 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add nonecho.dblist and echo.dblist PART III (duration: 00m 27s)
  • 15:26 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Add nonecho.dblist and echo.dblist PART II (duration: 00m 26s)
  • 15:26 godog: reimage ms-fe3001 with jessie T117972
  • 15:25 logmsgbot: thcipriani@tin Synchronized dblists: SWAT: Add nonecho.dblist and echo.dblist PART I (duration: 00m 28s)
  • 15:19 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add nonecho.dblist and echo.dblist PART III (duration: 00m 28s)
  • 15:18 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Add nonecho.dblist and echo.dblist PART II (duration: 00m 30s)
  • 15:18 logmsgbot: thcipriani@tin Synchronized dblists: SWAT: Add nonecho.dblist and echo.dblist PART I (duration: 00m 30s)
  • 15:09 yurik: deployed & restarted kartotherian
  • 15:07 logmsgbot: thcipriani@tin Synchronized dblists/visualeditor-default.dblist: SWAT: Enable VisualEditor by default on eleven Wikivoyages (duration: 01m 49s)
  • 13:47 hashar: T136971 Cutting MediaWiki branches 1.28.0-wmf.6
  • 13:40 moritzm: installing apache trusty updates on codfw app servers
  • 13:28 paravoid: rebooting install2001, T137647
  • 12:58 moritzm: installing apache trusty updates on canary app servers
  • 12:55 mobrovac: change-prop deployed f34fb06c99
  • 12:27 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1024, pool for the first time db1090 with low weight (duration: 00m 38s)
  • 11:30 mobrovac: scb disabling puppet for 10 mins or so to keep change-prop down
  • 11:15 akosiaris: T134242 rebooting alsafi.wikimedia.org hassaleh.codfw.wmnet kraz.wikimedia.org mx2001.wikimedia.org planet2001.codfw.wmnet pollux.wikimedia.org pybal-test2001.codfw.wmnet pybal-test2002.codfw.wmnet pybal-test2003.codfw.wmnet for qemu-kvm upgrade
  • 11:13 akosiaris: T134242 install qemu-system-common, qemu-system-x86 1:2.5+dfsg-4~bpo8+1 from jessie-backports on ganeti200{1,2,3,4,5,6}
  • 11:04 _joe_: pooling all the new codfw appservers that have been installed - mw2215-mw2240 (T135466)
  • 10:56 _joe_: pooling the new jessie appservers, mw1263-71
  • 10:52 logmsgbot: oblivian@palladium conftool action : set/weight=30; selector: cluster=appserver,dc=eqiad,name=mw12[67].*
  • 09:27 godog: roll-restart swift proxy in codfw and eqiad
  • 09:04 hashar: gallium: manually removing cron entry zuul_repack from user zuul. Causes cron spam due to zuul merger no longer being on gallium T137418
  • 08:59 jynus: stopping db1040 for cloning to new s4 hosts
  • 08:28 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1040 for cloning (duration: 00m 32s)
  • 08:23 _joe_: powercycling mw1154, unresponsive
  • 07:19 jynus: powercycling mw1156, could not regain control after OOM
  • 07:18 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1024, increase weight of db1082, db1087 and db1092 (duration: 10m 50s)
  • 07:05 _joe_: rolling reboot of mw2233-40
  • 06:47 _joe_: rebooting mw2228
  • 06:43 _joe_: rebooting mw2228
  • 06:29 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Pool db1052, db1080, db1083, db1089 (duration: 01m 31s)
  • 02:39 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jun 14 02:39:50 UTC 2016 (duration 5m 59s)
  • 02:33 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 12m 14s)

2016-06-13

  • 23:50 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Add ORES to whitelisted beta features (T130211) (duration: 00m 23s)
  • 23:42 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.5/extensions/ORES/includes/Hooks.php: Update links to beta features (duration: 00m 25s)
  • 23:33 ejegg: updated payments from 44102c59ac897c9acab470bf83369d233f9b736f to 2fc573cbb94e833c4144aa9dad79de8ec374bb09
  • 23:29 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Update cross-wiki upload configuration (Gerrit:293355) (duration: 00m 23s)
  • 23:10 logmsgbot: dereckson@tin Synchronized portals: (no message) (duration: 00m 24s)
  • 23:10 logmsgbot: dereckson@tin Synchronized portals/prod/wikipedia.org/assets: (no message) (duration: 00m 24s)
  • 22:51 logmsgbot: demon@tin Synchronized wmf-config/CommonSettings.php: Update extension distributor settings (duration: 00m 24s)
  • 22:42 yurik: switched to scap3 and deployed tilerator. Deployed kartotherian. Restarted.
  • 22:41 dapatrick: Deployed patches for T129738 to wmf5
  • 22:36 awight: update fundraising CRM revert from e684b7823e751558772a4de4ac23819bc601eb74 to bb9bf136dc0fa82d5d07ebeb33d696e54672b2d6
  • 22:11 awight: Updating fundraising CRM from b7b46740d701942507dca0a98a75f3f87b6b31b1 to e684b7823e751558772a4de4ac23819bc601eb74
  • 19:15 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/resources: ee2da9c2ae6fac93bf65d17b5ea48e5c47c87d47 (duration: 00m 35s)
  • 18:20 bblack: upgrading nginx (etc) on deployment-prep caches
  • 18:11 gehel: deploying latest GUI on WDQS
  • 17:58 urandom: Upgrade of restbase1007.eqiad.wmnet (https://people.wikimedia.org/~eevans/debian/cassandra_2.2.6-wmf1_all.deb) complete : T137474
  • 17:55 urandom: Restarting restbase1007-c.eqiad.wmnet : T137474
  • 17:52 urandom: Restarting restbase1007-b.eqiad.wmnet : T137474
  • 17:47 awight: Whitelist Special:PaypalExpressGatewayResult
  • 17:43 godog: enable proxy_http apache module on graphite1003 / graphite2002 and restart apache
  • 17:38 urandom: Restarting restbase1007-a.eqiad.wmnet : T137474
  • 17:37 urandom: Upgrading restbase1007.eqiad.wmnet w/ https://people.wikimedia.org/~eevans/debian/cassandra_2.2.6-wmf1_all.deb : T137474
  • 17:35 awight: update paymentswiki from 63fbe39fbc4d671fd2705ce9e42762b7c49564c2 to 44102c59ac897c9acab470bf83369d233f9b736f
  • 16:51 _joe_: powercycling mw1115
  • 16:49 logmsgbot: thcipriani@tin Finished scap: Update l10n cache for ores (duration: 32m 04s)
  • 16:17 logmsgbot: thcipriani@tin Started scap: Update l10n cache for ores
  • 15:59 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Enable ORES on fawiki PART II (duration: 00m 24s)
  • 15:58 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable ORES on fawiki PART I (duration: 00m 25s)
  • 15:58 logmsgbot: thcipriani@tin Synchronized wmf-config/extension-list: SWAT: Add ORES to extension-list (duration: 00m 25s)
  • 15:42 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add images.nypl.org to $wgCopyUploadsDomains for commons (duration: 00m 24s)
  • 15:37 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable VE in NS_PROJECT in cswiki (duration: 00m 25s)
  • 15:32 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/Echo: SWAT: Use localized weekdays on Special:Notifications (duration: 00m 32s)
  • 15:26 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable transwiki import for la.wiktionary (duration: 00m 26s)
  • 15:22 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Permission changes in zhwiki (duration: 00m 26s)
  • 15:18 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable VisualEditor by default for logged-out users on four Wikipedias too (duration: 00m 24s)
  • 15:10 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/MobileFrontend: Do Not strip srcset on API mobileview action PART II (duration: 00m 38s)
  • 15:09 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/MobileFrontend/includes/MobileContext.php: Do Not strip srcset on API mobileview action PART I (duration: 00m 49s)
  • 15:05 godog: reboot ms-be2012 to fix disk ordering T136395
  • 14:51 godog: truncate syslog.1 on ms-be2012
  • 14:26 bblack: upgrading cp* nginx (and other outstanding minor package updates)
  • 14:23 bblack: uploaded nginx-1.11.1-1+wmf2 to carbon
  • 13:55 dcausse: restarting logstash on logstash1001
  • 11:59 mobrovac: change-prop deployed 54f98b7
  • 11:31 _joe_: rolling reboot of the new appservers in codfw + scap pull
  • 09:55 _joe_: powercycling mw1138, oom, console non-responsive
  • 09:53 jynus: stopping db1052 and cloning it to db1080, db1083 and db1089
  • 09:43 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1052 for cloning (duration: 00m 26s)
  • 08:51 moritzm: removed /var/log/logstash/logstash.log.1 on logstash1001, depleted disk space on the root partition, fallout of T137400
  • 08:43 jynus: powercycling mw1155.eqiad.wmnet , unresponsive on ssh, serial console
  • 08:31 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Increase weight of db1082, db1087, db1092 (duration: 02m 36s)
  • 08:25 logmsgbot: oblivian@palladium conftool action : set/weight=30; selector: name=mw1261.eqiad.wmnet
  • 08:17 logmsgbot: oblivian@palladium conftool action : set/pooled=yes; selector: name=mw1261.eqiad.wmnet
  • 08:01 logmsgbot: oblivian@palladium conftool action : set/pooled=no:weight=20; selector: name=mw1262.eqiad.wmnet
  • 08:00 logmsgbot: oblivian@palladium conftool action : set/pooled=no:weight=20; selector: name=mw1261.eqiad.wmnet
  • 08:00 logmsgbot: oblivian@palladium conftool action : set/pooled=no; selector: name=mw1261.eqiad.wmnet
  • 06:32 logmsgbot: oblivian@palladium conftool action : set/pooled=yes; selector: name=mw126.*
  • 02:28 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 13m 02s)

2016-06-12

  • 02:28 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 12m 24s)

2016-06-11

  • 03:14 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.5/includes/parser/CacheTime.php: remove ad-hoc logging of updateCacheExpiry(0) traces (duration: 00m 23s)
  • 03:11 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.5/includes/parser/CacheTime.php: ad-hoc logging of updateCacheExpiry(0) traces (duration: 00m 25s)
  • 02:36 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jun 11 02:36:22 UTC 2016 (duration 6m 30s)
  • 02:29 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 11m 42s)
  • 01:01 mutante: rutherfordium ganeti lockup, gnt-instance console .. and it recovered

2016-06-10

  • 23:37 awight: Update PayPal Express Checkout configuration: add API certificate path
  • 21:58 logmsgbot: ori@tin Synchronized wmf-config/mobile.php: I3d8155d7e14: Remove old config hack that disabled $wgResponsiveImages on mobile (duration: 00m 24s)
  • 19:38 mutante: cp1043/cp1044 - revoke puppet cert, salt key
  • 19:30 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: Fix for ip lift cap for eswiki and Temporary IP Cap Lift for eswiki (duration: 00m 23s)
  • 19:19 logmsgbot: ori@tin Synchronized multiversion: Id432e25c: MWMultiVersion: allow wiki to be specified via the environment (duration: 00m 56s)
  • 17:12 elukey: Updated the puppet compiler with new hosts/facts
  • 16:32 mutante: cp1043,cp1044 shutdown -h, confirmed not in pybal/confctl
  • 16:27 mutante: cp1043/cp1044 - decom'ing, were already "Unused spare system" but running, scheduling downtime in icinga, shutting them down and removing from torrus config and puppet (T133614)
  • 14:17 urandom: Testing patched Cassandra (dpkg -i ...; service cassandra-{a,b} restart) on restbase-test200[1-2] : T137474
  • 14:06 urandom: Testing patched Cassandra (dpkg -i ...; service cassandra-a restart) on restbase-test2001 : T137474
  • 13:59 urandom: Testing patched Cassandra (dpkg -i ...; service cassandra-a restart) on praseodymium : T137474
  • 13:58 urandom: Testing patched Cassandra (dpkg -i ...; service cassandra-a restart) on cerium : T137474
  • 13:15 urandom: Starting html dump(s) in RESTBase staging : T137474
  • 13:13 urandom: Testing patched Cassandra (dpkg -i ...; service cassandra-a restart) on xenon : T137474
  • 11:07 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/includes/specials/SpecialUserLogin.php: deploy gerrit:293704 to fix AuthManager metrics (duration: 00m 32s)
  • 11:06 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/includes/specials/SpecialCreateAccount.php: deploy gerrit:293704 to fix AuthManager metrics (duration: 00m 52s)
  • 10:56 mobrovac: scb100x enabled puppet back
  • 10:05 mobrovac: scb100x disabling puppet and stopping change-prop to look at zookeeper znodes
  • 09:22 elukey: restarted uwsgi-ores on scb200[12] as deployment follow up
  • 08:27 elukey: restarted uwsgi-ores (after a deployment + puppet run) - service was down
  • 08:01 Amir1: deploying 38df031 into scb100[12] for ores service. Expecting some down time
  • 07:59 dcausse: refilling ttmserver index on all ttm enabled wikis
  • 06:42 moritzm: bounced hhvm on mw1264 (backtrace in /tmp/hhvm.2197.bt)
  • 06:28 papaul: mw2215-mw2238 - signing puppet certs, salt-key initial run
  • 05:54 mutante: re-enabling puppet on carbon
  • 04:48 moritzm: installing squid3 security updates on Ubuntu systems
  • 03:17 logmsgbot: aaron@tin Synchronized wmf-config/CommonSettings.php: Lower $wgAPIMaxLagThreshold to 5 (duration: 00m 36s)
  • 02:35 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jun 10 02:35:15 UTC 2016 (duration 6m 2s)
  • 02:29 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 11m 32s)
  • 01:44 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/includes/specialpage/LoginSignupSpecialPage.php: deploying gerrit:293668: fix AuthManager warning spam (duration: 00m 25s)
  • 01:43 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/includes/specials/SpecialUserLogin.php: deploying gerrit:293667: fix AuthManager dashboard (duration: 00m 33s)
  • 01:42 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/includes/specials/SpecialCreateAccount.php: deploying gerrit:293667: fix AuthManager dashboard (duration: 00m 25s)
  • 00:56 kaldari: ran mwscript maintenance/updateCollation.php --wiki=tawiki --force
  • 00:40 kaldari: ran mwscript maintenance/updateCollation.php --wiki=tawikibooks --force
  • 00:39 kaldari: ran mwscript maintenance/updateCollation.php --wiki=tawikinews --force
  • 00:36 mutante: git pull on strontium because i merged a non-change
  • 00:31 kaldari: ran mwscript maintenance/updateCollation.php --wiki=tawikiquote --force
  • 00:27 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings-labs.php: ores.wikimedia.org instead of ores.wmflabs.org (duration: 00m 25s)
  • 00:21 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.5/extensions/Wikidata/extensions/ArticlePlaceholder/includes/SearchHookHandler.php: Update Wikidata - Fix uncaught exception in ArticlePlaceholder (3/3) (duration: 00m 25s)
  • 00:20 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.5/extensions/Wikidata/vendor/composer/installed.json: Update Wikidata - Fix uncaught exception in ArticlePlaceholder (2/3, no-op) (duration: 00m 25s)
  • 00:19 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.5/extensions/Wikidata/composer.lock: Update Wikidata - Fix uncaught exception in ArticlePlaceholder (1/3, no-op) (duration: 00m 27s)
  • 00:12 kaldari: ran mwscript maintenance/updateCollation.php --wiki=tawikisource --force
  • 00:07 kaldari: ran mwscript maintenance/updateCollation.php --wiki=tawiktionary --force

2016-06-09

  • 23:57 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Set Tamil projects to use uca-ta collation II (T75453) (duration: 00m 25s)
  • 23:53 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Enable Flow beta feature on frwiki (T136684) (duration: 00m 27s)
  • 23:47 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Remove HiddenPrefs hack for turning off cross-wiki notifications (T135266) (duration: 00m 27s)
  • 23:31 logmsgbot: tgr@tin Synchronized wmf-config/InitialiseSettings.php: enable AuthManager on group2 wikis T135504 (duration: 00m 24s)
  • 23:29 logmsgbot: tgr@tin Synchronized wmf-config/CommonSettings.php: enable use of group1, group2 dblists in config (duration: 00m 23s)
  • 23:28 logmsgbot: tgr@tin Synchronized dblists/group2.dblist: add dblist for group2 (duration: 00m 22s)
  • 23:20 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/includes/specialpage/LoginSignupSpecialPage.php: deploying gerrit:293636 for AuthManager T135504 (duration: 00m 25s)
  • 23:19 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/extensions/MobileFrontend/resources/skins.minerva.special.userlogin.styles/userlogin.less: deploying gerrit:293638 for AuthManager T135504 (duration: 00m 25s)
  • 23:18 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/extensions/ConfirmEdit/FancyCaptcha/resources/ext.confirmEdit.fancyCaptcha.js: deploying gerrit:293637 for AuthManager T135504 (duration: 00m 24s)
  • 22:48 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.5
  • 22:35 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/includes/site/DBSiteStore.php: Revert "Map dummy language codes in sites" Part II (duration: 00m 31s)
  • 22:35 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/includes/ServiceWiring.php: Revert "Map dummy language codes in sites" Part I (duration: 00m 23s)
  • 21:41 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes: 904dd4ae088a8f67942c09b2b28178377955d6a6 (duration: 01m 18s)
  • 20:57 logmsgbot: hoo@tin Synchronized wmf-config/InitialiseSettings.php: Enable the ArticlePlaceholder on nnwiki (T130997) (duration: 00m 24s)
  • 20:53 logmsgbot: hoo@tin Synchronized wmf-config/InitialiseSettings.php: Enable the ArticlePlaceholder on lvwiki (T136100) (duration: 00m 26s)
  • 20:48 logmsgbot: hoo@tin Synchronized wmf-config/InitialiseSettings.php: Enable the ArticlePlaceholder on guwiki (T136517) (duration: 00m 24s)
  • 20:40 logmsgbot: hoo@tin Synchronized php-1.28.0-wmf.4/extensions/Wikidata: Update ArticlePlaceholder (duration: 01m 54s)
  • 20:36 logmsgbot: hoo@tin Synchronized php-1.28.0-wmf.5/extensions/Wikidata: Update ArticlePlaceholder (without unrelated T136598 fixes this time) (duration: 01m 51s)
  • 20:33 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes/user/User.php: c3b1f80a701d61dc57ccac0c8b1dc7daf03fa925 (duration: 00m 29s)
  • 19:59 urandom: Restarting Cassandra on xenon.eqiad.wmnet (removing patched test build; restoring state) : T137474
  • 19:53 logmsgbot: hoo@tin Synchronized php-1.28.0-wmf.5/extensions/Wikidata: revert, possible s5 master overload (duration: 01m 57s)
  • 19:47 logmsgbot: hoo@tin Synchronized php-1.28.0-wmf.5/extensions/Wikidata: Update ArticlePlaceholder (duration: 02m 04s)
  • 19:44 bearND: mobileapps deployed 71ff97c
  • 19:42 bearND: starting mobileapps deploy
  • 19:11 ejegg: updated cancel page settings on payments-wiki
  • 17:43 urandom: Restarting Cassandra on xenon.eqiad.wmnet (use exponentially decaying reservoirs for metrics histograms) : T126629
  • 17:19 mobrovac: change-prop deploying ecfda93f09d
  • 17:10 ejegg: updated payments-wiki from 3dcf58e3b4e1d02ad4f1874a3e87e55b7e169bfe to 053aaa259382c94aa59e4d0da7317fcafab635cd
  • 15:31 elukey: added topic override retention.bytes=536870912000 to Kafka webrequest_text (T136690)
  • 15:22 hashar: Cleaning git-daemon on gallium (was used by zuul-merger) T137418
  • 15:19 logmsgbot: aude@tin Synchronized wmf-config/InitialiseSettings.php: Add *.nara.gov to wgCopyUploadDomains (duration: 00m 40s)
  • 14:47 mobrovac: change-prop stopped on scb1002
  • 14:38 elukey: Tested temp setting retention.bytes=2G for Analytics kafka topic webrequest_misc
  • 14:37 hashar: Removing zuul-merger from gallium
  • 14:33 hashar: stopped / disabled zuul-merger on gallium T137418
  • 14:12 mobrovac: change-prop restarting on scb1001 for update
  • 14:07 urandom: Re-enabling puppet on xenon.eqiad.wmnet, forcing a run, and restarting Cassandra : T137419
  • 13:52 mobrovac: change-prop restarting on scb1002 for update
  • 13:45 mobrovac: change-prop deploying 2161403c
  • 13:26 urandom: Restarting Cassandra on xenon.eqiad.wmnet to apply 2G file cache : T137419
  • 12:51 urandom: Restarting Cassandra on xenon.eqiad.wmnet : T126629
  • 12:33 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/extensions/LdapAuthentication/LdapPrimaryAuthenticationProvider.php: deploy gerrit:293459 to fix wikitech API login / morebots (T137377) (duration: 00m 47s)
  • 12:19 tgr: deploying gerrit:293459 to fix morebots (T137377)
  • 12:11 urandom: Restarting Cassandra on xenon.eqiad.wmnet : T126629
  • 12:06 urandom: Temporarily disabling puppet on xenon.eqiad.wmnet to test settings : T126629
  • 11:33 Amir1: manually restarting ores-uwsgi and celery-ores-worker in scb100[12]
  • 10:51 urandom: Restarting Cassandra on {cerium,praseodymium}.eqiad.wmnet (RESTBase staging) : T126629
  • 09:16 gehel: lowering disk high watermark to rebalance disk usage on elasticsearch eqiad cluster
  • 09:05 Amir1: restarting uwsgi-ores celery-ores-worker in scb1001 and scb1002
  • 08:55 moritzm: installing libtasn security updates
  • 08:38 moritzm: rolling restart of app server canaries for libtasn security update
  • 07:22 moritzm: removed /var/log/logstash/logstash.1 on logstash1001, logspam (similar to what is described in https://github.com/logstash-plugins/logstash-output-elasticsearch/issues/144) depleted the space on the root partition
  • 02:55 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jun 9 02:55:20 UTC 2016 (duration 6m 19s)
  • 02:53 mutante: ms-be2012 ran out of disk due to huge syslog, deleted log, restarted rsyslogd
  • 02:49 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 11m 02s)
  • 02:26 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 11m 00s)
  • 00:03 twentyafterfour: Preparing to deploy phabricator update. Tagged release/2016-06-08/1

2016-06-08

  • 23:17 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.5/extensions/WikimediaEvents/: https://gerrit.wikimedia.org/r/#/c/293439/ (duration: 00m 23s)
  • 23:15 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/293438 (duration: 00m 25s)
  • 23:07 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.4/extensions/LiquidThreads/: https://gerrit.wikimedia.org/r/#/c/293247/ (duration: 00m 26s)
  • 23:05 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.5/extensions/LiquidThreads/: https://gerrit.wikimedia.org/r/#/c/293247/ (duration: 00m 26s)
  • 22:51 hoo: Re-started dumpwikidatattl on snapshot1003
  • 22:44 logmsgbot: tgr@tin Synchronized wmf-config/InitialiseSettings.php: enable AuthManager on group1 for reals T135504 (duration: 00m 25s)
  • 22:27 logmsgbot: tgr@tin Synchronized wmf-config/InitialiseSettings.php: enable AuthManager on group1 T135504 (duration: 00m 23s)
  • 22:21 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/extensions/OpenStackManager/: backport gerrit:293130 for AuthManager deploy T135504 (duration: 00m 28s)
  • 22:05 ottomata: starting kafka broker on kafka1012 after swapping disk and copying data directory
  • 22:01 logmsgbot: krinkle@tin Synchronized wmf-config/CommonSettings.php: Bump wgResourceLoaderStorageVersion (T134368) (duration: 00m 28s)
  • 21:12 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.5
  • 21:04 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/includes/specials/SpecialSearch.php: Add a visual clear to Special:Search input box and profile-tabs (duration: 00m 23s)
  • 20:57 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/Renameuser/RenameuserSQL.php: Use master DB when touching the user to signal rename end (duration: 00m 22s)
  • 20:50 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/includes/libs/objectcache/WANObjectCache.php: Avoid getWithSetCallback() warnings on unversioned key migration (duration: 00m 24s)
  • 20:21 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/Kartographer/styles/kartographer.less: Fixed <maplink> autostyling (duration: 00m 26s)
  • 20:18 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.4/extensions/Kartographer: late SWAT: Fix color extraction (duration: 00m 36s)
  • 19:30 mobrovac: change-prop deploying 08a1b1d
  • 19:27 hashar: gallium enabling puppet again now that zuul/jenkins are back
  • 19:18 hashar: Bringing back Jenkins and Zuul on gallium T137265
  • 18:59 logmsgbot: ori@palladium conftool action : set/pooled=yes; selector: name=scb1002.eqiad.wmnet
  • 18:57 yurik: switched kartotherian to scap3, deployed, restarted
  • 18:20 gehel: switching maps to scap3 deployment
  • 16:50 jynus: cloning /var/lib/jenkins from db1085 to contint1001
  • 16:46 ottomata: stopping kafka broker and puppet on kafka1012 to replace sdf
  • 16:37 ottomata: powercycling scb1002
  • 16:36 hashar: Disabled puppet on contint1001 to prevent it from bringing back Jenkins
  • 16:32 logmsgbot: otto@palladium conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 16:32 logmsgbot: otto@palladium conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 16:32 logmsgbot: otto@palladium conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mobileapps'])
  • 16:32 logmsgbot: otto@palladium conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=cxserver'])
  • 16:32 logmsgbot: otto@palladium conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=citoid'])
  • 16:32 logmsgbot: otto@palladium conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=graphoid'])
  • 16:24 ottomata: restarting hadoop-yarn-resourcemanager on analytics1002 to make analytics1001 active
  • 16:07 mobrovac: scb1002 enabling back puppet
  • 16:02 elukey: temporarily set a 10TB upper bound on the Kafka webrequest_text topic to free space (T136690)
  • 15:43 ottomata: restarting zk in codfw and eqiad 1 by 1 to apply maxClientCnxns=1024
  • 15:12 ottomata: restarting zookeeper 1 by 1 in eqiad
  • 15:03 _joe_: contint1001: systemctl mask zuul,zuul-merger
  • 14:57 elukey: rolling out the new Varnishkafka version in cache misc (didn't do it before since there was an outage ongoing)
  • 14:53 jynus: rebooting gallium with netboot for hardware maintenance
  • 14:44 mobrovac: scb1001 enabling and running puppet on scb1001
  • 13:44 jynus: running fsck.ext3 /dev/sda2 in read-write mode for gallium
  • 13:42 ottomata: powercycling scb2001 and scb2002
  • 13:30 akosiaris: disabling puppet on scb1001 & scb1002
  • 13:30 mobrovac: change-prop stopped on scb1002
  • 13:29 akosiaris: stopping changeprop on scb1001
  • 13:26 ottomata: powercycling scb1002
  • 13:18 ottomata: powercycling scb1001
  • 13:08 elukey: rolling out new varnishkafka package in cache misc
  • 12:09 jynus: mounted temporarily / partition from gallium sda on db1085:/mnt
  • 10:40 moritzm: uploaded jenkins 1.651.2 for jessie-wikimedia to carbon
  • 10:13 elukey: rolling out the new varnishkafka package to cache maps
  • 10:04 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes/deferred/LinksDeletionUpdate.php: fd44d649787ede78687b4cd2ef21e44a4c8b843b (duration: 00m 33s)
  • 08:28 hashar: stopping Jenkins / zuul / zuul-merger / puppet on gallium
  • 08:15 elukey: lowering down webrequest_text kafka topic retention time from 7 days to 4 days to free disk space (T136690)
  • 08:14 hashar: Jenkins has a bunch of dead executors for whatever reason, preventing jobs from running :(
  • 07:53 mobrovac: change-prop deploying 84d56e53a
  • 06:59 moritzm: enabling ferm on palladium (will lead to temporary puppet failures)
  • 02:58 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jun 8 02:58:28 UTC 2016 (duration 6m 31s)
  • 02:51 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 06m 49s)
  • 02:51 legoktm: / on gallium is currently read-only for some reason
  • 02:29 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 11m 11s)
  • 00:11 awight_: update fundraising-tools from b2425aef2154d6b689900f4848cca02880321230 to 28bc2da677caa795c58f906db76a1f8d612ac899

2016-06-07

  • 23:46 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes/deferred/LinksUpdate.php: 6d85caaa9bb5918cb2888fc82f2c7c346cf746a2 (duration: 00m 25s)
  • 23:36 SMalyshev: redeploying WDQS to update the Updater for T128947 fix
  • 23:35 logmsgbot: tgr@tin Synchronized wmf-config/InitialiseSettings.php: SWAT gerrit:292518 User rights configuration for meta. wmf-supportsafety group (duration: 00m 26s)
  • 23:20 logmsgbot: tgr@tin Finished scap: (no message) (duration: 24m 51s)
  • 23:02 awight: update paymentswiki from 28e10141454ef53085aed4c6619a34d3a4b43c58 to de11bfe2273d0bcaa0e713389b2d91e8b3567a1d; add PP cert
  • 22:56 tgr: scapping AuthManager backports + feature switch enabled on group0 T135504
  • 22:56 logmsgbot: tgr@tin Started scap: (no message)
  • 22:10 mutante: icinga config broken: Error: Could not find any host matching 'relforge1001'
  • 21:35 twentyafterfour: restarted apache on iridium to deploy D250
  • 20:02 andrewbogott: dist-upgrade on labvirt1010, in hopes of resolving a nova-compute lockup (possibly related to a kvm upgrade earlier today)
  • 20:00 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.28.0-wmf.5
  • 19:44 jynus: restarting es2017 due to a bunch of ACPI errors (probably memory-caused)
  • 19:35 logmsgbot: thcipriani@tin Finished scap: testwiki to php-1.28.0-wmf.5 and rebuild l10n cache (duration: 26m 40s)
  • 19:08 logmsgbot: thcipriani@tin Started scap: testwiki to php-1.28.0-wmf.5 and rebuild l10n cache
  • 18:30 andrewbogott: rebooting labvirt1011
  • 17:51 ottomata: restarting broker on kafka1020
  • 17:44 Dereckson: `mwscript initSiteStats.php --wiki kshwiki --update` on Terbium (T137234)
  • 17:33 mutante: furud - shutdown, decom, delete VM
  • 17:30 ejegg: updated payments-wiki from 3df3329f75fdbc679baf37bfd3955880091b3ae1 to 28e10141454ef53085aed4c6619a34d3a4b43c58
  • 17:06 logmsgbot: krinkle@tin Synchronized wmf-config/CommonSettings.php: clean-up
  • 17:05 ejegg: rolled back payments-wiki to 3df3329f75fdbc679baf37bfd3955880091b3ae1
  • 17:04 thcipriani: starting branch-cut for mediawiki and extensions for version 1.28.0-wmf.5
  • 17:04 ejegg: updated payments-wiki from 3df3329f75fdbc679baf37bfd3955880091b3ae1 to 413bd3ea92ac570c081532c71891c31391194984
  • 16:01 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Update audit hooks for AuthManager (duration: 00m 24s)
  • 15:53 logmsgbot: thcipriani@tin Synchronized wmf-config/wikitech.php: SWAT: Do not set wgAuth to LdapAuth when AuthManager is enabled (duration: 00m 23s)
  • 15:48 logmsgbot: thcipriani@tin Synchronized portals: SWAT: T135902 adding readme and license to wikipedia.org portal (duration: 00m 25s)
  • 15:48 logmsgbot: thcipriani@tin Synchronized portals/prod/wikipedia.org/assets: SWAT: T135902 adding readme and license to wikipedia.org portal (duration: 00m 25s)
  • 15:41 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: huwiki: Enable Popups A/B test for 50% of users (duration: 00m 24s)
  • 15:32 logmsgbot: thcipriani@tin Synchronized wmf-config: SWAT: Revert "Send wmf.4 search and ttmserver traffic to codfw" (duration: 00m 26s)
  • 15:24 logmsgbot: thcipriani@tin Synchronized wmf-config/PrivateSettings.php: SWAT: Use bot password for TNBot after touch wmf-config/PrivateSettings.php (duration: 00m 25s)
  • 15:16 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Use bot password for TNBot (duration: 00m 34s)
  • 15:15 logmsgbot: thcipriani@tin Synchronized private/PrivateSettings.php: SWAT: password update for Translation Notification Bot (duration: 00m 41s)
  • 14:47 elukey: installing varnishkafka 1.0.10-1 on cp1046 manually to test the new version.
  • 14:23 jynus: stopping mysql and the OS @ es2017 for hardware maintenance
  • 13:53 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Pool new s5 db hosts: db1082, db1087, db1092 with low weight (duration: 00m 23s)
  • 13:52 logmsgbot: jynus@tin Synchronized wmf-config/db-codfw.php: Add new coredb servers to alias configuration (duration: 00m 38s)
  • 13:49 jynus: about to pool new dewiki/wikidata servers T133398
  • 12:27 moritzm: rolling out gdk-pixbuf security updates
  • 12:23 moritzm: rolling restart of sca cluster for libxml2 security update
  • 11:27 moritzm: restarting apache2 on californium (hosting horizon dashboard) for libxml2 update
  • 11:23 moritzm: restarting apache2 on silver (hosting wikitech) for libxml2 update
  • 11:08 hashar: restarted apache2 on gallium for libxml2 update
  • 10:53 moritzm: restarting apache2 on iridium (hosting Phabricator) for libxml2 update
  • 10:18 moritzm: rolling restart of hhvm on eqiad appservers to pick up libxml2 update
  • 09:09 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1070 after maintenance (duration: 00m 27s)
  • 09:04 hashar: Upgrading Jenkins IRC plugin 2.25..2.27 and instant messaging plugin 1.34..1.35. The former should fix a deadlock when shutting down Jenkins
  • 09:00 moritzm: rolling restart of hhvm on codfw appservers to pick up libxml2 update
  • 08:53 moritzm: rolling restart of hhvm on appserver canaries to pick up libxml2 update
  • 08:28 moritzm: deploying libxml2 security updates on Ubuntu systems (Debian systems already upgraded last week)
  • 07:19 jynus: stopping and cloning db1070 to new s5 servers
  • 07:08 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1070 for cloning (duration: 00m 29s)
  • 02:30 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jun 7 02:30:57 UTC 2016 (duration 5m 32s)
  • 02:25 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 09m 36s)
  • 01:10 logmsgbot: aude@tin Synchronized php-1.28.0-wmf.4/extensions/Wikidata: Fix bug (T136093) in display of labels after edit (duration: 02m 03s)
  • 00:39 Krenair: (TXT record for SPF, actually)
  • 00:39 Krenair: Created MX and SPF records directly for wmflabs.org. for https://phabricator.wikimedia.org/T137160#2359786
  • 00:35 ejegg: updated settings on payments-wiki
  • 00:26 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.4/extensions/CentralAuth/includes/CentralAuthHooks.php: I79cbb1dc: Prefetch $wgCentralAuthLoginWiki DNS (T92864) (duration: 00m 29s)

2016-06-06

  • 23:41 logmsgbot: maxsem@tin Synchronized wmf-config/abusefilter.php: https://gerrit.wikimedia.org/r/#/c/292758/ (duration: 00m 24s)
  • 23:32 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.4/extensions/GeoData/: (no message) (duration: 00m 25s)
  • 23:29 logmsgbot: maxsem@tin Synchronized private/PrivateSettings.php: Updated Zero password (duration: 00m 25s)
  • 23:21 Amir1: deploying ae71d84 into ores in prod
  • 23:17 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/293037/ (duration: 00m 24s)
  • 23:14 logmsgbot: maxsem@tin Synchronized portals: https://gerrit.wikimedia.org/r/#/c/292992/ (duration: 00m 31s)
  • 23:13 logmsgbot: maxsem@tin Synchronized portals/prod/wikipedia.org/assets: https://gerrit.wikimedia.org/r/#/c/292992/ (duration: 00m 30s)
  • 23:04 logmsgbot: maxsem@tin Synchronized docroot/wikipedia.org/.well-known/apple-app-site-association: https://gerrit.wikimedia.org/r/#q,287190,n,z (duration: 00m 25s)
  • 22:05 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.4/includes/api/ApiStashEdit.php: 50ce579046e07 (duration: 00m 23s)
  • 20:25 arlolra: updated Parsoid to version e8d6092e
  • 20:09 arlolra: starting Parsoid deploy
  • 19:15 ottomata: restarting kafka broker on kafka1020 to test python consumption client
  • 19:12 bblack: restarted nginx on rcs1002 (was stuck half-shut-down for reload?), started nginx on rcs1001 (wasn't running at all)
  • 19:08 mutante: ran puppet on carbon because icinga said fail, saw it change STS headers, but no fail
  • 19:06 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.4/includes/page/WikiPage.php: 661c22db3a352 (duration: 00m 30s)
  • 18:08 ori: Running rebuildrecentchanges.php for test2wiki for T133225
  • 17:14 gehel: deploying latest GUI for wikidata query service
  • 16:58 logmsgbot: tgr@tin Synchronized wmf-config/PrivateSettings.php: (no message) (duration: 00m 23s)
  • 16:57 logmsgbot: tgr@tin Synchronized private/PrivateSettings.php: (no message) (duration: 00m 23s)
  • 16:44 logmsgbot: tgr@tin Synchronized wmf-config/PrivateSettings.php: (no message) (duration: 00m 23s)
  • 16:39 tgr: PrivateSettings changes were for T135074
  • 16:39 logmsgbot: tgr@tin Synchronized wmf-config/PrivateSettings.php: (no message) (duration: 00m 27s)
  • 16:37 logmsgbot: tgr@tin Synchronized private/PrivateSettings.php: (no message) (duration: 00m 26s)
  • 16:23 _joe_: rebooting mw1262
  • 16:22 logmsgbot: tgr@tin Synchronized wmf-config/CommonSettings.php: creating zeroscript grant group on zerowiki, gerrit: 292951 (duration: 00m 28s)
  • 16:00 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Math: Set wgMathFullRestbaseURL to point to wikimedia.org in production (duration: 00m 24s)
  • 15:45 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: ULS: Stop using /static/current (duration: 00m 24s)
  • 15:37 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.4/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.MobileArticleTarget.js: SWAT: Fix config of mobile surfaces (duration: 00m 24s)
  • 15:32 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Use wfLoadExtension for LocalisationUpdate (duration: 00m 27s)
  • 15:21 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT Switch Wikivoyages to Single Edit Tab mode for VE Beta Feature (duration: 00m 24s)
  • 15:14 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable VisualEditor by default for logged-in users on four Wikipedias PART II (duration: 00m 30s)
  • 15:14 logmsgbot: thcipriani@tin Synchronized dblists/visualeditor-default.dblist: SWAT: Enable VisualEditor by default for logged-in users on four Wikipedias PART I (duration: 00m 29s)
  • 15:04 jynus: dropping old outreach databases on m1
  • 14:10 jynus: dropping old bugzilla databases from m1
  • 14:00 jynus: dropping database blog from m1
  • 12:34 hashar: restarted Jenkins, deadlock in IRC plugin
  • 10:46 elukey: re-added kafka1001 to eventbus.svc.eqiad.wmflabs without rebooting since some concerns were raised from the Services team. Will have a discussion with them before proceeding.
  • 10:45 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: kafka1001.eqiad.wmnet
  • 10:33 moritzm: installing perl updates (bugfixes and CVE-2015-8853)
  • 10:27 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: kafka1001.eqiad.wmnet
  • 10:25 elukey: rebooting kafka100[12] for kernel upgrades (one at a time with de-pool/re-pool actions)
  • 09:12 moritzm: installing dpkg bugfix updates on jessie systems
  • 08:45 mobrovac: change-prop deployed 9b04e475
  • 08:27 gehel: lowering elasticsearch high watermark on eqiad cluster to rebalance disk space
  • 08:17 _joe_: rebooting mw1262
  • 07:57 jynus: enabling GTID on pending coredb servers on eqiad
  • 06:18 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.4/includes/cache/LinkBatch.php: c2ba764f38e44e7 (duration: 00m 30s)
  • 05:34 robh: db2034 locked up via serial console. details on T137084, rebooting since it's unresponsive to ssh or serial.
  • 02:28 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jun 6 02:28:50 UTC 2016 (duration 5m 56s)
  • 02:22 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 09m 34s)

2016-06-05

  • 14:55 Dereckson: `mwscript initSiteStats.php --wiki csbwiki --update` (T137060)
  • 02:27 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jun 5 02:27:38 UTC 2016 (duration 5m 35s)
  • 02:22 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 08m 55s)

2016-06-04

  • 20:18 apergos: rebooting mw1135, unresponsive to ssh or console login
  • 09:51 elukey: restarted hhvm on mw1144 after the host was hanging (OOM killer restored basic host functionalities but not hhvm)
  • 09:47 elukey: removed temporary Analytics Kafka upload retention override
  • 09:38 elukey: Lowering down temporarily the Analytics kafka upload retention time to 24h to free space (T136690)
  • 02:30 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jun 4 02:30:50 UTC 2016 (duration 5m 39s)
  • 02:25 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 09m 08s)

2016-06-03

  • 22:57 Krinkle: Purged https://en.wikipedia.org/static/images/project-logos/bawiki.png
  • 22:53 logmsgbot: krinkle@tin Synchronized static/images/project-logos/bawiki.png: (no message) (duration: 00m 24s)
  • 21:57 YuviPanda: started copying graphite data from usb back
  • 21:27 awight: update paymentswiki from 28b98ec254b2a15c8df61c568b62f221b328222f to 3df3329f75fdbc679baf37bfd3955880091b3ae1
  • 20:47 ejegg: updated payments-wiki de86eadcd98922ee4207a0c46112585f3ba5c48d to 28b98ec254b2a15c8df61c568b62f221b328222f
  • 20:25 ejegg: updated GatewayReady hook on paymentswiki
  • 19:37 logmsgbot: krinkle@tin Synchronized php-1.28.0-wmf.4/extensions/WikimediaEvents/extension.json: T136920 (duration: 00m 28s)
  • 19:04 mutante: releases apt repo on bromine: export fresh jessie-mediawiki indexes
  • 17:41 mutante: uploaded parsoid 0.5.1 to releases
  • 17:14 robh: bast4001 coming down for second hdd installation. (there are currently no active users on system)
  • 16:58 mutante: magnesium - shutdown -h now, bye
  • 15:30 logmsgbot: tgr@tin Finished scap: revert AbuseFilter + config to pre-extension-registration state T136929 (duration: 06m 13s)
  • 15:24 logmsgbot: tgr@tin Started scap: revert AbuseFilter + config to pre-extension-registration state T136929
  • 14:38 gehel: un-freezing writes from CirrusSearch to eqiad cluster during upgrade (T133126)
  • 13:27 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: kafka2001.codfw.wmnet
  • 13:22 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: kafka2001.codfw.wmnet
  • 13:22 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: kafka2002.codfw.wmnet
  • 13:16 hasharAway: Reenabling puppet on gallium. Forgot to put it back yesterday
  • 13:14 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: kafka2002.codfw.wmnet
  • 13:11 elukey: rebooting kafka200[12] (codfw EventBus) for kernel upgrades
  • 11:18 gehel: freezing writes from CirrusSearch to eqiad cluster during upgrade (T133126)
  • 10:48 gehel: taking elasticsearch eqiad cluster down for upgrade to 2.3 (T133126)
  • 10:39 gehel: Starting upgrade of elasticsearch eqiad cluster to 2.3 (T133126)
  • 10:35 moritzm: restarting apache on bohrium (serving piwik.wikimedia.org) for libxml2 security update
  • 10:23 moritzm: restarting apache on planet1001 (serving planet.wikimedia.org) for libxml2 security update
  • 08:42 moritzm: rolling restart of scb cluster (mathoid, ores-uwsgi) in eqiad to pick up libxml2 security updates
  • 08:38 jynus: archiving again syslog.1 from ms-be2012 on /srv/swift-storage/sdl1/tmp
  • 08:35 jynus: created new LDAP group grafana-admin, gid=1007
  • 08:34 elukey: rebooting kafka1012 for kernel upgrades.
  • 08:08 moritzm: installing libxml2 security updates on jessie systems
  • 07:19 kart_: Update cxserver to 19a71f1
  • 06:29 moritzm: installing nginx security updates on Ubuntu systems (Debian installs updated some days ago)
  • 02:36 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jun 3 02:36:39 UTC 2016 (duration 5m 58s)
  • 02:30 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 08m 35s)
  • 01:09 mutante: bromine - puppet currently stopped; needs some permission fixes for release upload
  • 01:08 mutante: uploaded parsoid 0.5.0 deb to releases.wm.org
  • 00:24 logmsgbot: awight@tin Finished scap: Deploying labtestwiki AuthManager config; Enabling Popups experiment; CentralNotice fixes for T136408, T136387; Special:Notifications fixes (duration: 25m 08s)

2016-06-02

  • 23:59 logmsgbot: awight@tin Started scap: Deploying labtestwiki AuthManager config; Enabling Popups experiment; CentralNotice fixes for T136408, T136387; Special:Notifications fixes
  • 23:32 logmsgbot: awight@tin Synchronized wmf-config/InitialiseSettings.php: Add namespace translation 'Portal' for diq (duration: 00m 24s)
  • 23:28 logmsgbot: awight@tin Synchronized wmf-config/InitialiseSettings.php: Enable AuthManager on beta wikitech (duration: 00m 25s)
  • 23:24 logmsgbot: awight@tin Synchronized wmf-config/InitialiseSettings.php: Enable Hovercards experiment for 1% of users on huwiki (duration: 00m 24s)
  • 23:23 logmsgbot: awight@tin Synchronized php-1.28.0-wmf.4/extensions/Popups: Do not show Hovercards when NavPopups gadget is enabled on huwiki (duration: 00m 24s)
  • 23:21 logmsgbot: awight@tin Synchronized wmf-config/extension-list-labs: Test PageAssessments on Beta Labs (duration: 00m 25s)
  • 23:20 logmsgbot: awight@tin Synchronized wmf-config/InitialiseSettings-labs.php: Test PageAssessments on Beta Labs (duration: 00m 26s)
  • 23:20 logmsgbot: awight@tin Synchronized wmf-config/CommonSettings-labs.php: Test PageAssessments on Beta Labs (duration: 00m 24s)
  • 22:37 logmsgbot: ori@tin Synchronized wmf-config/InitialiseSettings.php: I9dc532b3: Enable "purge" log group (duration: 00m 42s)
  • 22:20 mutante: removed my gerrit admin flag
  • 20:20 mutante: magnesium (formerly RT) removed from puppet and icinga, revoked cert and salt key, just waiting another day or so before shutdown
  • 20:18 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: logging: disable Wikibase\Client\Changes\WikiPageUpdater channel (duration: 00m 26s)
  • 20:12 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.4
  • 19:53 ottomata: stopping kafka broker and restarting kafka1014
  • 19:52 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.4/extensions/CheckUser/specials/SpecialCheckUser.php: Fix Special:Checkuser for log entries when cuc_title = "" (duration: 00m 31s)
  • 19:37 ejegg: re-enabled adyen job runner
  • 19:35 logmsgbot: akosiaris@palladium conftool action : set/pooled=yes; selector: scb2002.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=ores'])
  • 19:35 logmsgbot: akosiaris@palladium conftool action : set/pooled=yes; selector: scb2001.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=ores'])
  • 19:35 logmsgbot: akosiaris@palladium conftool action : set/pooled=yes; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 19:35 logmsgbot: akosiaris@palladium conftool action : set/pooled=yes; selector: scb1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 18:57 ejegg: disabled adyen job runner
  • 18:47 jynus: restarting replication on db1016
  • 18:43 YuviPanda: powercycle labmon1001 again, get into bios
  • 18:29 YuviPanda: going to try to intentionally trip the NFS check on tools-checker. This will not page
  • 18:24 YuviPanda: powercycle labmon1001 again
  • 18:19 mutante: db2007, revoke puppet cert, delete salt key, nuke from stored configs / icinga
  • 18:19 bearND: mobileapps deployed b2fee30
  • 18:18 mutante: db2007 shutdown, schedule eternal downtime
  • 18:04 bearND: starting mobileapps deploy
  • 17:40 subbu: finished deploying parsoid version 7188080b
  • 17:34 subbu: synced new code; restarted parsoid on wtp1001 as a canary
  • 17:29 subbu: starting deploy of new parsoid code
  • 17:21 mutante: ran ALTER TABLE character set utf8 .. (https://phabricator.wikimedia.org/T119112#2311402) on RT db
  • 17:16 mutante: running RT database upgrade from 4.0.4 to 4.2.8
  • 17:13 awight: update paymentswiki from d26426c4225080c95f0bd5a6a31c54e4826287b1 to de86eadcd98922ee4207a0c46112585f3ba5c48d
  • 17:05 mutante: stopped exim on magnesium
  • 17:05 jynus: stopping replication from db1001 to db1016 (passive m1 node) before schema change
  • 16:52 mutante: magnesium (RT), tmp. stopped RT and puppet
  • 16:50 YuviPanda: begin reinstall of labmon1001
  • 15:19 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.4/extensions/Math: SWAT: Use img instead of meta tags for SVGs and Fix iterator in batchGetMathML (duration: 00m 28s)
  • 15:12 logmsgbot: thcipriani@tin Synchronized portals: deploying new localized top-links on wikipedia.org (duration: 00m 31s)
  • 15:11 logmsgbot: thcipriani@tin Synchronized portals/prod/wikipedia.org/assets: deploying new localized top-links on wikipedia.org (duration: 00m 32s)
  • 14:33 jynus: acked ores icinga checks on some scb hosts and pointing to T124201 (it seems the checks arrived before the actual setup)
  • 13:52 moritzm: installing imagemagick security updates on Ubuntu systems (but affected decoders already neutralised by policy changes) (also Debian systems already addressed)
  • 13:34 hashar: Downgrading Zuul back to zuul_2.1.0-95-g66c8e52-wmf1precise1_amd64.deb . Paramiko can't acquire ssh connection with Gerrit for some reason... https://phabricator.wikimedia.org/P3204
  • 12:10 hashar: Upgraded Zuul: upstream code range is 66c8e52..30a433b, package is 2.1.0-151-g30a433b-wmf1precise1
  • 11:39 logmsgbot: jmm@tin Synchronized wmf-config/CommonSettings.php: disable firejail security hardening for image scalers, needs more work for the Score extension (duration: 00m 36s)
  • 10:55 hashar: Restarted Zuul and reenabled puppet on gallium
  • 10:50 hashar: gallium: stopped puppet agent
  • 10:49 hashar: gracefully stopping Zuul, will upgrade / take traces etc over the next half hour or so
  • 10:14 jynus: archiving again syslog.1 from ms-be2012 on /srv/swift-storage/sdl1/tmp
  • 10:08 mobrovac: restbase enabling puppet back in production
  • 08:40 mobrovac: restbase deploy end of 19f25925
  • 08:29 mobrovac: restbase deploy start of 19f25925
  • 08:09 mobrovac: restbase disabling puppet in production for testing https://gerrit.wikimedia.org/r/#/c/292109/ in staging
  • 07:23 moritzm: rebooting etherpad1001 (hosting etherpad.wikimedia.org) for upgrade to Linux 4.4
  • 07:02 jynus: performing schema change for db1057
  • 03:04 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jun 2 03:04:44 UTC 2016 (duration 6m 40s)
  • 02:58 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 15m 37s)
  • 02:24 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.3) (duration: 10m 06s)
  • 01:52 mutante: scb1001/2001 ores - connection refused
  • 01:52 mutante: mw1136 service hhvm restart
  • 01:37 mutante: labsdb1001 /etc/init.d/mysql start
  • 01:32 YuviPanda: service mysql start on labsdb1001
  • 01:25 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Set $wgSpamBlacklistEventLogging to true on testwiki (duration: 00m 22s)
  • 01:25 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Set $wgSpamBlacklistEventLogging to true on testwiki (duration: 00m 23s)
  • 01:23 YuviPanda: reboot labsdb1001
  • 01:21 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.4/extensions/Flow/handlebars/: HACK: Hide reply form for locked topics (T135848) (duration: 00m 24s)
  • 01:19 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.4/extensions/Echo/includes/special/NotificationPager.php: Fix notification pager (T136759) (duration: 00m 25s)
  • 01:18 YuviPanda: restart mysql on labsdb1001
  • 01:00 bearND: mobileapps reverted to 8d6d648c943074b7d3999baf31d60ad99249cd51
  • 00:55 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings-labs.php: Revert "Test PageAssessments extension on Labs" (no-op) (duration: 00m 22s)
  • 00:55 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings-labs.php: Revert "Test PageAssessments extension on Labs" (no-op) (duration: 00m 23s)
  • 00:26 logmsgbot: awight@tin Synchronized php-1.28.0-wmf.4/extensions/CentralNotice: Fix for T136387 (duration: 00m 38s)
  • 00:05 urandom: Deploy of cdff5e3 to RESTBase production complete
  • 00:03 YuviPanda: started nfs-exports on labstore1001

2016-06-01

  • 23:57 urandom: Deploying cdff5e3 to RESTBase production
  • 23:51 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Revert Use extension registration for SpamBlacklist (T119117) (duration: 00m 24s)
  • 23:49 urandom: Deploying cdff5e3 to restbase1008.eqiad.wmnet (canary node)
  • 23:44 urandom: Deploy of RESTBase to staging environment complete
  • 23:40 urandom: Deploying RESTBase to staging environment
  • 23:39 urandom: RESTBase deploy to xenon.eqiad.wmnet (canary node) complete
  • 23:38 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings-labs.php: Test PageAssessments extension on Labs (no-op) (duration: 00m 26s)
  • 23:37 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings-labs.php: Test PageAssessments extension on Labs (no-op) (duration: 00m 30s)
  • 23:36 urandom: Deploying RESTBase to xenon.eqiad.wmnet (canary node)
  • 23:26 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.4/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.DesktopArticleTarget.js: Simplify teardown of toolbar save button (T136421) (duration: 00m 23s)
  • 23:21 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Use full URL in $wgNoticeHideUrls (T130442) (duration: 00m 23s)
  • 23:17 urandom: Deploying d8fa5c0 to RESTBase production
  • 23:10 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Use HTTPS URL to citoid instead of protocol-relative (T136423) (duration: 00m 32s)
  • 23:06 urandom: Update restbase staging to f05b66f
  • 22:36 cwd: updated paymentswiki from 44bd699d6700ac4faf3c2d772ba713b093ae8cb8 to d26426c4225080c95f0bd5a6a31c54e4826287b1
  • 22:30 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.4/extensions/CentralNotice/: deploy https://gerrit.wikimedia.org/r/#/c/292279/ (duration: 00m 26s)
  • 21:38 twentyafterfour: train has left the station
  • 21:37 logmsgbot: twentyafterfour@tin Synchronized wmf-config/InitialiseSettings.php: deploy /wmf-config/InitialiseSettings.php for eranroz ( T132972 ) (duration: 00m 25s)
  • 21:31 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.4/includes/specials/SpecialPrefixindex.php: sync https://gerrit.wikimedia.org/r/#/c/292228/ ( T136738 ) (duration: 00m 26s)
  • 21:26 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.3/includes/specials/SpecialPrefixindex.php: sync https://gerrit.wikimedia.org/r/#/c/292234/ ( T136738 ) (duration: 00m 30s)
  • 20:58 bearND: mobileapps deployed ed0e2e4
  • 20:56 gehel: restarting postgresql on maps2001
  • 20:55 bearND: starting mobileapps deploy
  • 20:49 ejegg: updated paymentswiki from 7d222320b35ad8a44d8c77a4c3019364a49e53f2 to 44bd699d6700ac4faf3c2d772ba713b093ae8cb8
  • 20:44 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.4
  • 20:39 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.4/includes/cache/LinkBatch.php: deploy https://gerrit.wikimedia.org/r/#/c/292217/ (duration: 00m 27s)
  • 20:17 subbu: finished deploying parsoid sha afb0d522
  • 20:17 urandom: Rolling restart of RESTBase (redistribute Cassandra client connections?) : T126629
  • 20:10 subbu: synced new code; restarted parsoid on wtp1001 as a canary
  • 20:07 subbu: starting parsoid deploy
  • 19:43 ema: cp* hosts rebooted (T131928)
  • 19:40 bblack: restarting pybals for healthcheck config changes
  • 18:25 urandom: restarting Cassandra on restbase1007.eqiad.wmnet
  • 18:19 ejegg: updated payments-wiki from 5bb160e9898224e1d7d0a5c57fe408edb998a262 to 7d222320b35ad8a44d8c77a4c3019364a49e53f2
  • 18:16 ottomata: stopping kafka broker on kafka1018 and rebooting node
  • 17:51 urandom: Restarting Cassandra on restbase1007.eqiad.wmnet : T126629
  • 17:48 ema: depooled reboot of cp4* hosts (T131928)
  • 17:47 urandom: Temporarily disabling puppet to test setting on restbase1007.eqiad.wmnet : T126629
  • 17:15 ejegg: rolled back payments-wiki from a335a3a6f8909d1e7e1a79877512a12a0561aa2a to 5bb160e9898224e1d7d0a5c57fe408edb998a262
  • 17:06 ejegg: updated payments-wiki from 5bb160e9898224e1d7d0a5c57fe408edb998a262 to a335a3a6f8909d1e7e1a79877512a12a0561aa2a
  • 17:05 akosiaris: powered on lvs2006. disk change did not happen
  • 17:05 akosiaris: powered off lvs2006 for disk swap
  • 16:54 logmsgbot: tgr@tin Synchronized wmf-config/InitialiseSettings-labs.php: T135504: enable AuthManager in beta (duration: 00m 32s)
  • 16:39 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.4/extensions/NewUserMessage/: backport gerrit:292168 to update NewUserMessage for AuthManager (duration: 00m 29s)
  • 16:22 urandom: Disabling traces on restbase1008-a.eqiad.wmnet : T126629
  • 16:01 logmsgbot: thcipriani@tin Finished scap: SWAT: Update for AuthManager (duration: 26m 05s)
  • 15:58 urandom: Setting trace probability on restbase1008-a.eqiad.wmnet to 5% : T126629
  • 15:58 jynus: updating dns entry for db1080.eqiad.wmnet
  • 15:58 urandom: Disabling trace probability on restbase1007-a.eqiad.wmnet : T126629
  • 15:48 urandom: Setting trace probability to 5% on restbase1007-a.eqiad.wmnet : T126629
  • 15:35 logmsgbot: thcipriani@tin Started scap: SWAT: Update for AuthManager
  • 15:33 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.4/resources/src/moment-locale-overrides.js: SWAT: Avoid passing integers to mw.RegExp.escape (duration: 00m 24s)
  • 15:29 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Remove centralauth-autoaccount right (duration: 00m 25s)
  • 15:26 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable bot passwords on zerowiki (duration: 00m 24s)
  • 15:19 paravoid: Re-enabling OSPF on all cr1-codfw row subnets
  • 15:18 paravoid: Re-enabling cr1-codfw et-0/* interfaces
  • 15:18 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Enable RC patrol on ta.wikiquote" (duration: 00m 25s)
  • 15:15 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove no longer used Echo configuration PART II (duration: 00m 26s)
  • 15:14 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Remove no longer used Echo configuration PART I (duration: 00m 33s)
  • 15:13 paravoid: Rebooting cr1-codfw FPC 0
  • 15:09 paravoid: Upgrading cr1-codfw FPC 0 all PICs firmware
  • 15:08 logmsgbot: thcipriani@tin Synchronized static/images/sul: SWAT: Make SUL icons square and use global defaults (duration: 00m 41s)
  • 15:07 paravoid: Disabling cr1-codfw et-0/* (all row uplinks)
  • 15:03 akosiaris: restarted grrrit-wm after gerrit restart
  • 15:03 paravoid: Disabling OSPF on all cr1-codfw row subnets to drain FPC0
  • 15:02 akosiaris: restarted gerrit to enforce 100m maxObjectSizeLimit
  • 14:59 paravoid: Restoring VRRP priority on cr2-codfw
  • 14:57 bblack: depooled reboot of cp3048 (T131928)
  • 14:57 paravoid: Re-enabling OSPF on all cr2-codfw row subnets
  • 14:54 paravoid: Re-enabling cr2-codfw et-0/* interfaces
  • 14:49 paravoid: Rebooting cr2-codfw FPC 0
  • 14:48 paravoid: Upgrading cr2-codfw FPC 0 all PICs firmware
  • 14:42 paravoid: Disabling cr2-codfw et-0/2/0, et-0/2/1 (row C/D uplinks)
  • 14:34 paravoid: Disabling cr2-codfw et-0/0/0 (row A uplink)
  • 14:29 paravoid: Disabling cr2-codfw et-0/0/1 (row B uplink)
  • 14:15 paravoid: Disabling OSPF on all cr2-codfw row subnets to drain FPC0
  • 14:08 ema: depooled reboot of cp1* hosts (T131928)
  • 12:49 paravoid: draining cr2-codfw for firmware upgrade
  • 12:26 bblack: upgrade nginx to 1.11.1-1+wmf1 on all clusters
  • 11:50 elukey: rebooting kafka1022 for kernel upgrade (4.4)
  • 11:05 ema: rebooting cp3* spares (T131928)
  • 10:47 Dereckson: Script done for uca-it collation on itwiki: 10 599 758 rows processed
  • 10:47 ema: depooled reboot of cp3046 (T131928)
  • 10:47 ema: depooled reboot of cp3003 (T131928)
  • 10:45 ema: depooled reboot of cp3034 (T131928)
  • 10:39 ema: depooled reboot of cp3005 (T131928)
  • 10:38 ema: depooled reboot of cp3044 (T131928)
  • 10:35 ema: depooled reboot of cp3047 (T131928)
  • 10:31 ema: depooled reboot of cp3004 (T131928)
  • 10:28 ema: depooled reboot of cp3009 (T131928)
  • 10:14 ema: depooled reboot of cp3037 (T131928)
  • 10:11 jynus: moved syslog.1 to ms-be2012:/srv/swift-storage/sdl1/tmp to avoid / fillup
  • 10:10 ema: depooled reboot of cp3008 (T131928)
  • 10:09 ema: depooled reboot of cp3035 (T131928)
  • 09:37 moritzm: installing libgd security updates
  • 09:28 ema: depooled reboot of cp3039 (T131928)
  • 09:23 ema: depooled reboot of cp3045 (T131928)
  • 09:21 ema: depooled reboot of cp3010 (T131928)
  • 09:18 ema: depooled reboot of cp3006 (T131928)
  • 09:16 ema: depooled reboot of cp3007 (T131928)
  • 09:10 ema: depooled reboot of cp3036 (T131928)
  • 08:25 mobrovac: mobileapps deploying 8d6d648
  • 08:24 ema: depooled reboot of cp3049 (T131928)
  • 08:22 hashar: Nodepool came back up just fine after labnodepool1001 reboot and is fully operational.
  • 08:15 jynus: deleting mysql logrotate scripts to avoid root spam
  • 08:14 moritzm: reboot labnodepool1001 for update to Linux 4.4
  • 07:56 elukey: event logging restarted on eventlog1001.eqiad.wmnet
  • 07:46 elukey: stopping kafka on kafka1020.eqiad and rebooting the host for Linux 4.4 upgrades
  • 07:43 moritzm: rolling reboot of scb in eqiad for update to Linux 4.4
  • 07:32 moritzm: restarted hhvm on mw1180
  • 07:05 mobrovac: change-prop restarting to apply https://gerrit.wikimedia.org/r/291201
  • 05:41 mobrovac: restbase deploy end of 5c99693
  • 05:26 mobrovac: restbase deploy start of 5c99693
  • 04:31 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.4/includes/: reapplied new version of I03739e94 (duration: 01m 21s)
  • 04:27 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.3/includes/: reapplied new version of I03739e94 (duration: 01m 34s)
  • 03:11 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jun 1 03:11:11 UTC 2016 (duration 6m 39s)
  • 03:04 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 15m 42s)
  • 02:30 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.3) (duration: 09m 30s)
  • 00:04 Dereckson: Started `mwscript updateCollation.php itwiki --previous-collation=uppercase` on Terbium (T136647)


Archives