Server Admin Log

2016-07-30

  • 02:25 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jul 30 02:25:55 UTC 2016 (duration 5m 37s)
  • 02:20 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.12) (duration: 08m 10s)

2016-07-29

  • 23:56 YuviPanda: hardreset xenon
  • 23:44 urandom: Rebooting xenon.eqiad.wmnet
  • 23:13 bblack: installed openssl-1.0.2h-1~wmf2 on pinkunicorn for the weekend (not on carbon yet) - https://gerrit.wikimedia.org/r/301903
  • 21:02 ostriches: gerrit: raised log level on sshd to ERROR from WARN. Irrelevant logspam.
  • 20:56 mutante: restarted grrrit-wm but this time only because it died by itself
  • 20:22 mutante: gerrit restarting to apply config change 301873 - tuning caches
  • 20:11 YuviPanda: restart grrrit-wm bot
  • 19:56 mutante: gerrit restarting to apply config change 300446 - up heap size limit
  • 18:59 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings-labs.php: labs only change to enable mobile language bar (duration: 00m 27s)
  • 18:37 urandom: T134016: Bootstrapping restbase2009-c.codfw.wmnet
  • 16:39 YuviPanda: granted addshore admin on labs grafana
  • 15:17 anomie: starting maintenance script for phab:T140811
  • 13:15 Dereckson: Purged static resources related to mk.wiktionary (T141610)
  • 12:43 elukey: upgrading zuul-merger to zuul_2.1.0-391-gbc58ea3-wmf2jessie1_amd64.deb on scandium
  • 10:23 elukey: restarting cassandra on aqs100[123] to apply the latest config (https://gerrit.wikimedia.org/r/#/c/301780/1 - T140869)
  • 10:14 jynus: applying new grants to all s1 servers
  • 09:38 hashar: Upgrading Zuul to get rid of a forced sleep(300) whenever a patch is merged T93812. zuul_2.1.0-391-gbc58ea3-wmf2precise1
  • 08:51 godog: switch back statsite flush period to 60s T101141
  • 07:30 jynus: schema change continues for s2, s1, s4 and s5 T140108
  • 07:24 jynus: fixing s3 replication lag created by TokuDB insert problem
  • 06:59 jynus: powercycling db2069 T141601
  • 04:04 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.12/extensions/TorBlock/extension.json: Move basic torunblocked line to GrantPermissions, not GroupPermissions, see wikitech-l (duration: 00m 38s)
  • 02:26 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jul 29 02:26:09 UTC 2016 (duration 5m 57s)
  • 02:20 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.12) (duration: 07m 30s)

2016-07-28

  • 23:29 ejegg: updated payments-wiki from 2d9dd79507a42ced0a99bde87b3c45b804610e40 to 3a724bfb1a3e20e17b5886dae0ba7572020abd6b4
  • 23:23 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/301724/1 (duration: 00m 24s)
  • 23:19 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/301644/ (duration: 00m 29s)
  • 22:42 logmsgbot: maxsem@tin Synchronized wmf-config/: Labs only (duration: 00m 45s)
  • 20:25 urandom: T134016: Bootstrapping restbase2006-c.codfw.wmnet
  • 20:02 Pchelolo: deploy restbase cdd164c4e
  • 19:43 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.12/includes/api/ApiQueryUserContributions.php: Fix Undefined variable issue in ApiQueryUserContributions (duration: 00m 32s)
  • 19:42 Pchelolo: deploy restbase cdd164c4e canary on restbase1007
  • 19:22 urandom: T134016: Bootstrapping restbase1014-c.eqiad.wmnet
  • 19:06 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.12
  • 19:01 mutante: restarted grrrit-wm
  • 17:37 ottomata: powercycling analytics1032
  • 16:50 godog: bounce statsite on graphite1001 T101141
  • 16:38 jynus: stopping dbproxy1001 haproxy service
  • 15:41 logmsgbot: aude@tin Synchronized wmf-config/Wikibase.php: Update entityNamespaces setting (duration: 00m 27s)
  • 15:28 logmsgbot: aude@tin Synchronized php-1.28.0-wmf.12/extensions/Wikidata: Fix exception when undeleting items and fix css bug (duration: 01m 52s)
  • 15:20 logmsgbot: aude@tin Synchronized php-1.28.0-wmf.11/extensions/ContentTranslation: Fix DumpCorpora script (duration: 00m 27s)
  • 15:19 logmsgbot: aude@tin Synchronized php-1.28.0-wmf.12/extensions/ContentTranslation: Fix DumpCorpora script (duration: 00m 31s)
  • 15:15 godog: bounce statsite on graphite1001 - T101141
  • 15:10 logmsgbot: aude@tin Synchronized dblists/clldefault.dblist: Enable content translation on more wikis (duration: 00m 23s)
  • 15:09 logmsgbot: aude@tin Synchronized wmf-config/InitialiseSettings.php: Enable content translation on more wikis (duration: 00m 25s)
  • 15:07 jynus: adding new index (schema change) to recentchanges T140108
  • 14:59 godog: bounce carbon-cache on graphite1001 - T101141
  • 14:11 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1055 after maintenance (duration: 00m 35s)
  • 12:51 elukey: upgrading zuul-merger to zuul_2.1.0-391-gbc58ea3-wmf1jessie (T140894)
  • 12:14 akosiaris: installing updates on mendelevium
  • 11:18 jynus: deploying schema change to all ores databases T140803
  • 11:17 godog: replace statsdlb with statsd-proxy on graphite1001
  • 11:09 jynus: testing schema change on db2038
  • 09:25 moritzm: installing PHP security updates
  • 08:49 godog: swift eqiad-prod: ms-be102[3456] weight 3000
  • 07:43 elukey: starting decom process for old api servers - mw11(1[4-9]|20|3[0-9]|4[0-8]).eqiad.wmnet (tracked in https://etherpad.wikimedia.org/p/appservers-decom)
  • 07:16 _joe_: regenerated the ssl key for rhodium, 1024 bits
  • 06:54 moritzm: installing java security updates on restbase staging systems
  • 06:44 _joe_: installed puppet 3.8 from backports on rhodium
  • 06:36 moritzm: installing perl security updates on eqiad and codfw jessie systems
  • 06:36 _joe_: refreshed puppet facts for the compiler
  • 05:56 jynus: shutting down db1055 for upgrade
  • 02:47 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jul 28 02:47:28 UTC 2016 (duration 6m 22s)
  • 02:41 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.12) (duration: 08m 03s)
  • 02:24 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.11) (duration: 08m 49s)
  • 02:13 twentyafterfour: restarted apache2 on iridium to deploy 4305a9bb0300650ea40de433261c7e59cc88e4bc
  • 01:30 twentyafterfour: Deploying #phab-2016.30 (https://phabricator.wikimedia.org/project/profile/2118/) - no downtime is expected.
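
The host range in the 07:43 decom entry above is written as a regular expression. A minimal Python sketch of how that pattern expands into individual host names follows. The function name `expand_decom_hosts` is hypothetical, and escaping the dots in the domain suffix is an assumption made here for strict matching; the pattern itself is copied from the entry.

```python
import re

# Pattern copied from the 07:43 entry; the dots in the suffix are escaped
# here so they only match literal dots (an assumption, not in the log).
DECOM_PATTERN = re.compile(r"mw11(1[4-9]|20|3[0-9]|4[0-8])\.eqiad\.wmnet")


def expand_decom_hosts():
    """Return every mw11NN.eqiad.wmnet name that the pattern matches."""
    candidates = (f"mw11{n:02d}.eqiad.wmnet" for n in range(100))
    return [host for host in candidates if DECOM_PATTERN.fullmatch(host)]


if __name__ == "__main__":
    hosts = expand_decom_hosts()
    # Number of hosts covered, plus the first and last name in the range.
    print(len(hosts), hosts[0], hosts[-1])
```

Under these assumptions the pattern covers mw1114 through mw1120 and mw1130 through mw1148, which is 26 hosts.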

2016-07-27

  • 23:45 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings-labs.php: Test numeric sorting on Beta Cluster (Gerrit:301520, labs only, no-op in prod) (duration: 00m 23s)
  • 23:22 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.12/extensions/ORES/maintenance/PopulateDatabase.php: Add revision_id to log for errors (T141368, 2/2, no-op) (duration: 00m 29s)
  • 23:21 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.12/extensions/ORES/includes/Cache.php: Add revision_id to log for errors (T141368, 1/2) (duration: 00m 31s)
  • 22:50 Pchelolo: restbase deploy cdd164c4e8 to staging
  • 21:34 mutante: planet2001 tmp disable puppet for testing
  • 21:17 ejegg: updated payments from 79cb53998c41f72d0fa49130ed1f66dc112b478c to 2d9dd79507a42ced0a99bde87b3c45b804610e40
  • 21:13 bearND: deployed mobileapps e561edf
  • 21:08 bearND: starting mobileapps deploy
  • 20:47 andrewbogott: restarting rabbitmq-server on labcontrol1001 because sometimes that fixes a thing
  • 19:59 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.12
  • 19:38 Pchelolo: deploy restbase 8f5e2897e to staging
  • 19:16 andrewbogott: rebuilding labnet1001 (it's a spare and shouldn't affect Labs)
  • 19:04 urandom: T134016: Bootstrapping restbase2005-c.codfw.wmnet
  • 18:50 Pchelolo: restart changeprop to apply config changes 300681 and 301305
  • 18:08 Pchelolo: deploy restbase 8efbc9282e to staging
  • 17:58 urandom: T134016: Restarting Cassandra instance to apply disabled streaming socket timeout (restbase2009-b.codfw.wmnet)
  • 17:51 mutante: restarted grrrit-wm
  • 17:45 mutante: gerrit is restarting to deploy config change 301381, a couple seconds downtime
  • 17:23 jynus: running analyze table on enwiki.logging db1055 (depooled)
  • 17:22 urandom: Truncating "local_group_wikipedia_T_parsoid_section_offsets".data, "local_group_wikipedia_T_parsoid_dataW4ULtxs1oMqJ".data, and "local_group_wikipedia_T_parsoid_html".data in RESTBase staging
  • 17:21 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1055 also from the rc/log role (duration: 00m 28s)
  • 16:33 urandom: T134016: Restarting Cassandra instance to apply disabled streaming socket timeout (restbase2009-a.codfw.wmnet)
  • 16:22 logmsgbot: addshore@tin Synchronized php-1.28.0-wmf.12/extensions/MobileFrontend/includes/skins/SkinMinerva.php: 301387 Fix watchstar for logged-out user (duration: 00m 32s)
  • 16:04 logmsgbot: addshore@tin Synchronized php-1.28.0-wmf.11/includes/EditPage.php: 301356 Count edit conflicts for each namespace separately (duration: 00m 32s)
  • 15:54 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: SWAT: IP cap lift for Wikipedia Edit-a-thon on 2016-08-03 (duration: 00m 23s)
  • 15:49 urandom: T134016: Restarting Cassandra instance to apply disabled streaming socket timeout (restbase2006-b.codfw.wmnet)
  • 15:45 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add "upwizcampeditors" to $wgAddGroups, $wgRemoveGroups for commonswiki (duration: 00m 24s)
  • 15:38 logmsgbot: thcipriani@tin Synchronized wmf-config/CirrusSearch-common.php: SWAT: Turn on textcat based language detection for search PART II (duration: 00m 23s)
  • 15:37 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Turn on textcat based language detection for search PART I (duration: 00m 27s)
  • 15:21 urandom: T134016: Restarting Cassandra instance to apply disabled streaming socket timeout (restbase2006-a.codfw.wmnet)
  • 15:06 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: beta wgEchoMentionStatusNotifications default true (duration: 01m 28s)
  • 14:58 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1055 for database maintenance (duration: 00m 29s)
  • 14:50 urandom: T134016: Restarting Cassandra instance to apply disabled streaming socket timeout (restbase2005-b.codfw.wmnet)
  • 14:16 urandom: T134016: Cancelling bootstrap of restbase2005-c.codfw.wmnet
  • 14:12 urandom: T134016: Restarting Cassandra instance to apply disabled streaming socket timeout (restbase2005-a.codfw.wmnet)
  • 13:29 elukey: Restart Cassandra on aqs100[123] to apply the latest configuration (T140869)
  • 12:50 elukey: disabling puppet on restbase*, aqs* and maps* as extra careful step for https://gerrit.wikimedia.org/r/301083 (no-op but better safe than sorry)
  • 12:28 bblack: starting wipe of cache_misc caches
  • 12:23 akosiaris: puppet enabled on netmon1001 (correction of previous log line)
  • 12:22 akosiaris: puppet enabled on net1001
  • 12:05 akosiaris: disable puppet on netmon1001, debugging servermon
  • 11:25 Dereckson: Run initSiteStats to update statistics count on ast.wikipedia (T141432)
  • 10:42 jynus: add extra grants to db1016 and all of m1 for servermon
  • 10:20 moritzm: restarting slapd on serpens
  • 08:43 elukey: Decommissioning mw1018-25 (T139353)
  • 08:30 logmsgbot: hashar@tin Synchronized wmf-config/InitialiseSettings.php: Lower loglevel for resourceloader to info https://gerrit.wikimedia.org/r/#/c/301336/ (duration: 00m 26s)
  • 07:49 jynus: update m1-master to point to dbproxy1006
  • 07:11 moritzm: installing perl security updates in esams and codfw
  • 06:56 jynus: dropping tables from m4 shard T141407
  • 03:06 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jul 27 03:06:43 UTC 2016 (duration 6m 49s)
  • 02:59 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.12) (duration: 15m 29s)
  • 02:27 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.11) (duration: 09m 11s)
  • 02:09 mutante: restarted grrrit-wm after removing bugzilla password from gerrit
  • 01:57 mutante: lead removed reviewer_count job from root's crontab
  • 01:41 mutante: ytterbium - shutdown -h now, over and out
  • 01:05 ejegg: rolled back paymentswiki from 79d2b67067fd7e579372b63e0d619eccfa3b9143 to 79cb53998c41f72d0fa49130ed1f66dc112b478c
  • 00:56 mutante: xenon - rsync cassandra-test data to restbase-test2001 /srv/backups/eqiad/
  • 00:53 mutante: restbase-test2001-2003 - test rsyncing, create temp data dir. mkdir -p $(grep path /etc/rsync.d/frag-parsoid-html | cut -d= -f2)
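
The 00:53 entry above creates a data directory by pulling the `path` setting out of an rsync fragment with `grep` and `cut`. A rough Python equivalent is sketched below. The helper names are hypothetical, and the sketch assumes the fragment holds lines of the form `path = /some/dir`; the real file contents are not shown in the log.

```python
import os


def paths_from_fragment(text):
    """Return the second '='-separated field of each line mentioning 'path'.

    Mirrors `grep path FILE | cut -d= -f2`, with surrounding whitespace
    stripped (the shell's word splitting has a similar effect).
    """
    paths = []
    for line in text.splitlines():
        if "path" in line and "=" in line:
            value = line.split("=")[1].strip()
            if value:
                paths.append(value)
    return paths


def make_data_dirs(fragment_file):
    """Create every directory named by a path line; return the list."""
    with open(fragment_file, encoding="utf-8") as fh:
        paths = paths_from_fragment(fh.read())
    for path in paths:
        # exist_ok=True gives the same "no error if present" effect as mkdir -p.
        os.makedirs(path, exist_ok=True)
    return paths
```

Calling `make_data_dirs("/etc/rsync.d/frag-parsoid-html")` would correspond to the logged command, assuming the fragment has the layout described above.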

2016-07-26

  • 23:59 logmsgbot: reedy@tin Synchronized php-1.28.0-wmf.12/extensions/MobileFrontend/: Deploy revert for group0 for T141386 (duration: 00m 30s)
  • 23:25 logmsgbot: reedy@tin Synchronized dblists/commonsuploads.dblist: Disabling local uploads on ms.wikipedia.org (duration: 00m 23s)
  • 23:18 logmsgbot: reedy@tin Synchronized wmf-config/event-schemas: Bump event-schemas submodule commit to master (duration: 00m 28s)
  • 23:08 ejegg: updated payments from 79cb53998c41f72d0fa49130ed1f66dc112b478c to 79d2b67067fd7e579372b63e0d619eccfa3b9143
  • 23:07 logmsgbot: reedy@tin Synchronized wmf-config/: Remove rest of ImageMetrics config (duration: 00m 33s)
  • 23:04 logmsgbot: reedy@tin Synchronized wmf-config/CommonSettings.php: Undeploy ImageMetrics (duration: 00m 27s)
  • 21:05 logmsgbot: reedy@tin Synchronized wmf-config/extension-list: moar extension.json (duration: 00m 33s)
  • 20:43 mdholloway: mobileapps deployed fd3f33b
  • 20:41 mdholloway: starting mobileapps deployment
  • 20:37 urandom: Bootstrapping restbase2005-c.eqiad.wmnet
  • 20:23 urandom: T134016: Bootstrapping restbase1009-c.eqiad.wmnet
  • 19:58 urandom: T134016, T140825: Restarting Cassandra to disable trickle_fsync and streaming socket timeouts (restbase1015-b.eqiad.wmnet)
  • 19:54 urandom: T134016, T140825: Restarting Cassandra to disable trickle_fsync and streaming socket timeouts (restbase1015-a.eqiad.wmnet)
  • 19:53 urandom: T140825: Setting vm.dirty_background_bytes=24576 (restbase1015.eqiad.wmnet)
  • 19:49 urandom: T134016, T140825: Restarting Cassandra to disable trickle_fsync and streaming socket timeouts (restbase1014-b.eqiad.wmnet)
  • 19:43 urandom: T134016, T140825: Restarting Cassandra to disable trickle_fsync and streaming socket timeouts (restbase1014-a.eqiad.wmnet)
  • 19:42 urandom: T140825: Setting vm.dirty_background_bytes=24576 (restbase1014.eqiad.wmnet)
  • 19:40 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.28.0-wmf.12
  • 19:37 urandom: T134016, T140825: Restarting Cassandra to disable trickle_fsync and streaming socket timeouts (restbase1009-b.eqiad.wmnet)
  • 19:34 logmsgbot: thcipriani@tin Finished scap: testwiki to php-1.28.0-wmf.12 and rebuild l10n cache (duration: 25m 29s)
  • 19:33 urandom: T134016, T140825: Restarting Cassandra to disable trickle_fsync and streaming socket timeouts (restbase1009-a.eqiad.wmnet)
  • 19:33 urandom: T140825: Setting vm.dirty_background_bytes=24576 (restbase1009.eqiad.wmnet)
  • 19:09 logmsgbot: thcipriani@tin Started scap: testwiki to php-1.28.0-wmf.12 and rebuild l10n cache
  • 19:07 logmsgbot: thcipriani@tin Purged l10n cache for 1.28.0-wmf.10
  • 18:36 logmsgbot: addshore@tin Synchronized php-1.28.0-wmf.11/extensions/WikimediaEvents/WikimediaEventsHooks.php: dewiki_diffstats add rev timestamps & feature state 301119 (duration: 00m 28s)
  • 18:28 logmsgbot: addshore@tin Synchronized wmf-config/InitialiseSettings.php: Enable RevisionSlider on mediawikiwiki 301105 (duration: 01m 28s)
  • 18:07 bd808: Restarted elasticsearch on logstash1003, couldn't find master
  • 17:12 subbu: finished deploying parsoid version 285b6983
  • 17:10 thcipriani: starting branch cut for 1.28.0-wmf.12
  • 17:06 subbu: synced new parsoid code; restarted parsoid on wtp1007 as a canary
  • 17:03 subbu: starting parsoid deploy
  • 15:55 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Configuration changes for mk.wiktionary.org PART III (duration: 00m 26s)
  • 15:54 logmsgbot: thcipriani@tin Synchronized static/images/project-logos/mkwiktionary.png: SWAT: Configuration changes for mk.wiktionary.org PART II (duration: 00m 24s)
  • 15:54 logmsgbot: thcipriani@tin Synchronized static/favicon/wiktionary/mk.ico: SWAT: Configuration changes for mk.wiktionary.org PART I (duration: 00m 24s)
  • 15:47 godog: reimage mw1292 as thumbor1002
  • 15:12 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove EchoBundleEmailInterval (T135446) PART II (duration: 00m 26s)
  • 15:12 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Remove EchoBundleEmailInterval (T135446) PART I (duration: 00m 34s)
  • 15:00 paravoid: installed cr2-eqiad FPC 3
  • 14:36 moritzm: uploading openjdk-8 security update (8u102-b14-1~bpo8+1) to carbon
  • 14:32 godog: reimage mw1291 as thumbor1001
  • 13:55 jynus: compressing 300GB table on dbstore2002 (expect warnings, slowdown, lag; but it is a passive analytics slave)
  • 12:42 moritzm: installing perl security updates
  • 11:48 moritzm: installing exim4 updates related to perl security release
  • 11:39 logmsgbot: filippo@palladium conftool action : set/pooled=inactive; selector: name=mw1291.eqiad.wmnet
  • 11:39 logmsgbot: filippo@palladium conftool action : set/pooled=inactive; selector: name=mw1292.eqiad.wmnet
  • 11:29 logmsgbot: filippo@palladium conftool action : set/pooled=no; selector: name=mw1292.*
  • 11:28 logmsgbot: filippo@palladium conftool action : set/pooled=no; selector: name=mw1291.*
  • 10:43 elukey: restarting cassandra on aqs100[456] instances (not serving live traffic)
  • 09:18 moritzm: updating debhelper, cdbs, devscripts, libintl-perl, libmodule-build-perl and libnet-dns-perl on jessie systems for compatibility with perl security update
  • 07:38 kart_: Update cxserver to 447a6c9 - registry: Remove 'en' as target from Apertium MT - disables machine translation to English in ContentTranslation
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jul 26 02:31:43 UTC 2016 (duration 6m 8s)
  • 02:25 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.11) (duration: 09m 00s)
  • 01:51 mutante: cerium testing is over?
  • 01:11 Amir1: deploying from 2d9817b to a291da1 for ores in scb nodes
  • 00:53 mutante: lead - stopped rsyncd
  • 00:49 urandom: T134016: Bootstrapping restbase2008-c.codfw.wmnet
  • 00:06 Pchelolo: restbase deploy ae5fbac to staging

2016-07-25

  • 23:11 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Add dewiki_diffstats to wmgMonologChannels (Gerrit:288158, T134861) (duration: 00m 25s)
  • 22:48 legoktm: restarted zuul due to depends-on lockup
  • 22:46 logmsgbot: reedy@tin Synchronized docroot/noc/conf/: Update dblist symlinks (duration: 00m 37s)
  • 21:52 tgr: deployed security patch for T137551
  • 20:37 Pchelolo: restbase deploy 8efbc92
  • 20:10 Pchelolo: restbase deploy 8efbc92 canary deploy to restbase1007
  • 20:05 Pchelolo: restbase deploy 8efbc92 to staging
  • 20:00 Pchelolo: restbase deploy 8efbc92 to deployment-prep
  • 19:21 urandom: T134016: Bootstrapping restbase1013-c.eqiad.wmnet
  • 18:31 ottomata: upgrading kafka to 0.9 in main-codfw, first kafka2001 then 2002
  • 18:15 mutante: ytterbium - revoke puppet cert, delete salt-key, remove from icinga
  • 16:16 urandom: T134016: Restarting Cassandra to apply stream timeout (restbase1013-b.eqiad.wmnet)
  • 16:10 urandom: T134016: Restarting Cassandra to apply stream timeout (restbase1013-a.eqiad.wmnet)
  • 16:06 urandom: T140825, T134016: Restarting Cassandra to apply stream timeout, and disable trickle_fsync (restbase1012-c.eqiad.wmnet)
  • 16:02 urandom: T140825, T134016: Restarting Cassandra to apply stream timeout, and disable trickle_fsync (restbase1012-b.eqiad.wmnet)
  • 15:54 urandom: T140825, T134016: Restarting Cassandra to apply stream timeout, and disable trickle_fsync (restbase1012-a.eqiad.wmnet)
  • 15:53 urandom: T140825: Setting vm.dirty_background_bytes=24M on restbase1012.eqiad.wmnet
  • 15:43 urandom: T140825, T134016: Restarting Cassandra to apply stream timeout, and 8MB trickle_fsync (restbase1008-c.eqiad.wmnet)
  • 15:39 urandom: T140825, T134016: Restarting Cassandra to apply stream timeout, and 8MB trickle_fsync (restbase1008-b.eqiad.wmnet)
  • 15:34 urandom: T140825, T134016: Restarting Cassandra to apply stream timeout, and 8MB trickle_fsync (restbase1008-a.eqiad.wmnet)
  • 15:28 elukey: Standardized the jmxtrans GC metric names to automatically pick up variations in settings. This introduces metric name changes in Hadoop, Zookeeper, Kafka. (https://gerrit.wikimedia.org/r/#/c/299118/)
  • 12:53 moritzm: installing squid security updates
  • 10:10 _joe_: remove spurious puppet facts
  • 10:04 moritzm: installing Django security updates
  • 09:18 godog: swift eqiad-prod: ms-be102[3456] weight 1500
  • 03:26 hashar: scandium: migrating zuul-merger repos from lead to gerrit.wikimedia.org: find /srv/ssd/zuul/git -path '*/.git/config' -print -execdir sed -i -e 's/lead.wikimedia.org/gerrit.wikimedia.org/' config \;
  • 02:28 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jul 25 02:28:21 UTC 2016 (duration 5m 52s)
  • 02:22 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.11) (duration: 09m 09s)
  • 02:03 ostriches: gerrit: reindexing lucene now that we have new data. searches/dashboards may look a tad weird for a bit
  • 01:53 hashar: starting Zuul
  • 01:51 mutante: restarted grrrit-wm
  • 01:39 ostriches: lead: turning puppet back on, here we go
  • 01:38 jynus: m2 replication on db2011 stopped, master binlog pos: db1020-bin.000968:1013334195
  • 01:37 hashar: scandium: restarted zuul-merger
  • 01:36 ostriches: ytterbium: Stopped puppet, stopped gerrit process.
  • 01:34 mutante: switched gerrit-new to gerrit in DNS
  • 01:30 ostriches: lead: stopped puppet for a few minutes
  • 01:17 hashar: scandium: migrating zuul-merger repos to lead find /srv/ssd/zuul/git -path '*/.git/config' -print -execdir sed -i -e 's/ytterbium.wikimedia.org/lead.wikimedia.org/' config \;
  • 01:10 hashar: stopping CI
  • 01:09 jynus: reviewdb backup finished, available on db1020:/srv/tmp/2016-07-25_00-54-31/
  • 01:02 ostriches: rsyncing latest git data from ytterbium to lead
  • 00:57 mutante: manually deleted reviewer-counts cron from gerrit2 user, runs as root and puppet does not remove crons unless ensure=>absent
  • 00:55 jynus: starting hot backup of db1020's reviewdb
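
The 01:17 and 03:26 zuul-merger entries above rewrite the Gerrit host name in each repository's `.git/config` with `find` and `sed`. The same idea can be sketched in Python as follows. The function name `rewrite_git_remotes` is hypothetical, and unlike the logged `sed` expression (which treats `.` as a wildcard and replaces only the first match per line) this sketch replaces every literal occurrence of the host name.

```python
import os


def rewrite_git_remotes(root, old_host, new_host):
    """Replace old_host with new_host in every <repo>/.git/config under root.

    Returns the sorted list of config files that were actually changed.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Only touch files named "config" that sit directly inside a .git dir.
        if os.path.basename(dirpath) != ".git" or "config" not in filenames:
            continue
        config_path = os.path.join(dirpath, "config")
        with open(config_path, encoding="utf-8") as fh:
            original = fh.read()
        updated = original.replace(old_host, new_host)
        if updated != original:
            with open(config_path, "w", encoding="utf-8") as fh:
                fh.write(updated)
            changed.append(config_path)
    return sorted(changed)
```

For example, `rewrite_git_remotes("/srv/ssd/zuul/git", "lead.wikimedia.org", "gerrit.wikimedia.org")` would correspond to the 03:26 entry. Returning the list of changed files plays the role of the `-print` flag in the original command, although `find` prints every matching file and this sketch reports only the modified ones.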

2016-07-24

  • 02:25 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jul 24 02:25:08 UTC 2016 (duration 4m 34s)
  • 02:20 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.11) (duration: 08m 59s)

2016-07-23

  • 15:38 godog: stop swift in esams test cluster, lots of logging from there
  • 15:37 godog: lithium sudo lvextend --size +10G -r /dev/mapper/lithium--vg-syslog
  • 04:58 ori: Gerrit is back up after service restart; was unavailable between ~ 04:29 - 04:57 UTC
  • 04:56 ori: Restarting Gerrit on ytterbium
  • 04:48 ori: Users report Gerrit is down; on ytterbium java is occupying two cores at 100%
  • 02:26 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jul 23 02:26:49 UTC 2016 (duration 5m 41s)
  • 02:21 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.11) (duration: 08m 24s)
  • 01:02 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthPlugin.php: T141160 (duration: 00m 29s)
  • 01:01 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthHooks.php: T141160 (duration: 00m 27s)
  • 01:00 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthPrimaryAuthenticationProvider.php: T141160 (duration: 00m 28s)
  • 00:37 tgr: doing an emergency deploy of https://gerrit.wikimedia.org/r/#/c/300679 for T141160; the bug causes dozens of new users per hour to be created unattached on loginwiki, which probably has weird consequences

2016-07-22

  • 22:19 logmsgbot: aaron@tin Synchronized wmf-config/InitialiseSettings.php: Enable debug logging for DBTransaction (duration: 00m 38s)
  • 21:10 ejegg: updated civicrm from 2f4805fa2d2a7c57881408be2b3a017d26d8f43e to d657255e1edebeccfc0a03bea70b78eb11375cf8
  • 20:58 ejegg: disabled Worldpay audit parser job
  • 18:59 ejegg: rolled back payments from 79d2b67067fd7e579372b63e0d619eccfa3b9143 to 79cb53998c41f72d0fa49130ed1f66dc112b478c
  • 18:54 mutante: restart grrrit-wm
  • 16:05 Jeff_Green: running authdns-update to correct a DKIM public key on wikipedia.org
  • 15:24 anomie: Starting script to populate empty gu_auth_token phab:T140478
  • 15:16 urandom: T140825: Restarting Cassandra to apply 8MB trickle_fsync (restbase1015-a.eqiad.wmnet)
  • 14:21 gehel: rolling restart of logstash100[1-3] - T141063
  • 14:19 urandom: T134016: Bootstrapping restbase2004-c.codfw.wmnet
  • 12:42 jynus: applying new m5 db grants
  • 11:12 jynus: reimage dbproxy1009 T140983
  • 11:04 jynus: applying new m2 db grants
  • 10:47 jynus: reimage dbproxy1007 T140983
  • 10:36 jynus: applying new m1 db grants
  • 10:27 hashar: Restarting Jenkins entirely (deadlocked)
  • 10:23 hashar: Jenkins has some random deadlock. Will probably reboot it
  • 09:45 jynus: reimage dbproxy1006
  • 09:36 jynus: applying new m3 db grants
  • 08:19 jynus: reimage dbproxy1008
  • 06:43 jynus: updating dns records: m3-slave to db1043; m2-master to dbproxy1002
  • 04:08 jynus: backing up, shutting down and reimage db1043
  • 03:14 jynus: stopping db1043 db
  • 03:06 twentyafterfour: restarted apache2 and phd on iridium
  • 03:04 jynus: reverting m3-master dns back to the proxy
  • 02:59 jynus: restarted phd on iridium
  • 02:35 jynus: SET GLOBAL read_only=0; on db1048
  • 02:34 jynus: updating m3-master dns
  • 02:33 jynus: setting db1043 as read-only (phabricator/m3)
  • 02:31 jynus: making dbstore1002.eqiad.wmnet:3306 a child of db1048.eqiad.wmnet:3306
  • 02:27 jynus: making db2012.codfw.wmnet:3306 a child of db1048.eqiad.wmnet
  • 02:25 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jul 22 02:25:53 UTC 2016 (duration 5m 47s)
  • 02:20 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.11) (duration: 08m 23s)
  • 00:53 bd808: Restarted elasticsearch on logstash1003; couldn't find master (even though the master thought 1003 was fine)
  • 00:43 mutante: restarted grrrit-wm
  • 00:01 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings-labs.php: Labs-only cleanups (duration: 00m 25s)

2016-07-21

  • 23:53 Amir1: deploying 2d9817b to ores in scb nodes
  • 23:49 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 24s)
  • 23:46 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#q,298344,n,z (duration: 00m 24s)
  • 23:41 MaxSem: on tin: ran mwscript extensions/ShortUrl/populateShortUrlTable.php --wiki=urwiki
  • 23:39 MaxSem: created ShortUrl tables on urwiki
  • 23:37 ori: Restarted statsv on hafnium (cc Krinkle). 'gaierror: [Errno -3] Temporary failure in name resolution'
  • 23:34 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.11/extensions/CirrusSearch/: https://gerrit.wikimedia.org/r/#q,300430,n,z https://gerrit.wikimedia.org/r/#q,300436,n,z (duration: 00m 32s)
  • 23:32 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.11/extensions/EventBus/: https://gerrit.wikimedia.org/r/#q,300332,n,z (duration: 00m 26s)
  • 23:27 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/299619/ (duration: 00m 24s)
  • 23:22 logmsgbot: maxsem@tin Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/299615/ (duration: 00m 29s)
  • 23:21 Amir1: restarting uwsgi and celery for ores in scb1002
  • 23:20 logmsgbot: maxsem@tin Synchronized dblists/wikidatadescriptions.dblist: https://gerrit.wikimedia.org/r/#/c/299615/ (duration: 00m 24s)
  • 23:19 Amir1: restarting uwsgi and celery for ores in scb 1001
  • 23:09 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/298933/ (duration: 00m 29s)
  • 22:46 ebernhardson: restart elasticsearch on logstash1002
  • 22:22 bd808: Restarted kibana4 on logstash1001 for "node[18588]: segfault at 2fcb25f00009 ip 0000000000ad9846 sp 00007ffe526bbb40 error 4 in node[400000+1383000]"
  • 22:01 mutante: stat1002 - puppetized git pull from "refinery_source" fails
  • 21:11 logmsgbot: reedy@tin Synchronized wmf-config/CommonSettings.php: Moved WMF specific SiteMatrix data to CommonSettings (duration: 00m 26s)
  • 20:28 ejegg: re-enabled fundraising campaigns after schema update
  • 19:27 logmsgbot: demon@tin Synchronized wikiversions.json: because sync-wikiversions doesn't care about co-masters ugh (duration: 00m 29s)
  • 19:09 logmsgbot: demon@tin rebuilt wikiversions.php and synchronized wikiversions files: last wikis to wmf.11
  • 19:07 ejegg: disabled fundraising CentralNotice campaigns for paymentswiki schema update
  • 18:34 ejegg: updated payments-wiki from f23f15656eb488f5008b45b940077abbaa779004 to 79d2b67067fd7e579372b63e0d619eccfa3b9143
  • 17:33 mutante: restarted grrrit-wm
  • 17:26 logmsgbot: krinkle@tin Synchronized w/static.php: allow short-lived caching of 400/500 errors (duration: 00m 24s)
  • 17:15 ostriches: gerrit: restarting
  • 17:13 ostriches: gerrit: killed a couple of long-running git-upload-pack's for mediawiki/core
  • 17:07 gehel: cleaning leftover crons on logstash* servers - T140973
  • 16:51 urandom: T134016: Starting bootstrap of restbase2003-c.codfw.wmnet
  • 16:50 urandom: T134016: Restart of codfw rack 'c' instances to apply stream socket timeout complete
  • 16:47 urandom: T134016: Restarting Cassandra to apply new stream timeout (restbase2008-b.codfw.wmnet)
  • 16:46 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.11/extensions/CirrusSearch/includes/Searcher.php: T140950: Deploy UBN fix to CirrusSearch (duration: 00m 31s)
  • 16:46 urandom: T134016: Restarting Cassandra to apply new stream timeout (restbase2008-a.codfw.wmnet)
  • 16:43 urandom: T134016: Restarting Cassandra to apply new stream timeout (restbase2004-b.codfw.wmnet)
  • 16:41 urandom: T134016: Restarting Cassandra to apply new stream timeout (restbase2004-a.codfw.wmnet)
  • 16:38 urandom: T134016: Restarting Cassandra to apply new stream timeout (restbase2003-b.codfw.wmnet)
  • 16:36 urandom: T134016: Restarting Cassandra to apply new stream timeout (restbase2003-a.codfw.wmnet)
  • 16:30 subbu: finished (test) deploy of parsoid sha ed2f8228
  • 16:28 urandom: Cancelling 2003-c bootstrap, and disabling Puppet on restbase2003.codfw.wmnet to keep instance down : T134016
  • 16:27 subbu: synced parsoid code; restarting parsoid on wtp1001 as a canary
  • 16:24 subbu: starting (test) parsoid deployment
  • 16:17 subbu: aborted (test) parsoid deployment
  • 16:13 subbu: starting parsoid deployment
  • 14:58 jynus: stopping dbstore1002 for scheduled maintenance T119488
  • 14:44 paravoid: cr2-eqiad is now upgraded, passing transit and cross-DC traffic and is the VRRP master in eqiad
  • 14:43 paravoid: cr2-eqiad: restoring VRRP priorities
  • 14:40 paravoid: cr2-eqiad: restoring PyBal BGP sessions
  • 14:39 paravoid: cr2-eqiad: reenabling IX interface & BGP
  • 14:37 paravoid: cr2-eqiad: reenabling Transit interfaces & BGP
  • 14:35 paravoid: cr2-eqiad: enabling Fundraising interface & BGP
  • 14:30 paravoid: cr2-eqiad: reenabling xe-4/2/0 (link to cr1-eqord) and xe-5/2/3 (link to cr2-codfw)
  • 14:26 paravoid: cr2-eqiad: reenabling all asw-*-eqiad interfaces
  • 14:14 logmsgbot: demon@tin Synchronized wmf-config/: extension list cleanups (duration: 00m 34s)
  • 14:07 paravoid: cr2-eqiad: halting both routing engines(!)
  • 14:04 paravoid: cr2-eqiad: disabling xe-4/2/0 (link to cr1-eqord)
  • 14:04 paravoid: cr2-eqiad: disabling xe-5/2/3 (link to cr2-codfw)
  • 14:02 paravoid: cr2-eqiad: disabling all asw-*-eqiad interfaces
  • 13:41 paravoid: cr2-eqiad: fabric upgrade bandwidth for FPC 4/5
  • 13:38 paravoid: cr2-eqiad: toggling mastership between routing-engines (re1->re0)
  • 13:31 paravoid: cr2-eqiad: setting scb 0 to offline and replacing it
  • 13:31 paravoid: cr2-eqiad: setting fabric plane 0/1/2/3 to offline
  • 13:30 paravoid: cr2-eqiad: powering off re0 (backup)
  • 13:28 paravoid: cr2-eqiad: toggling mastership between routing-engines (re0->re1)
  • 13:18 mobrovac: mathoid deploying 36be4ea
  • 13:12 paravoid: cr2-eqiad: setting scb 1 to offline and replacing it
  • 13:10 paravoid: cr2-eqiad: setting fabric plane 5/6/7 to offline
  • 13:10 paravoid: cr2-eqiad: setting fabric plane 4 to offline
  • 13:10 paravoid: cr2-eqiad: setting "chassis state cb-upgrade on" and powering off re1 (backup)
  • 13:01 godog: bounce gerrit on ytterbium
  • 12:58 godog: manually flipping m2-master to db1020
  • 12:49 paravoid: cr2-eqiad: re-enabling GRES and toggling mastership between routing-engines (re1->re0)
  • 12:48 paravoid: cr2-eqiad: fixing IPv6 VRRP interoperability between the cr1/cr2 ( http://www.juniper.net/documentation/en_US/junos14.2/topics/concept/vrrpv3-junos-support.html )
  • 12:41 mobrovac: citoid deployed 5134e49e
  • 12:36 mobrovac: change-prop deploying b7079fd9c
  • 12:27 paravoid: cr2-eqiad: rebooting backup RE (re0)
  • 12:19 paravoid: cr2-eqiad: toggling mastership between routing-engines (re0->re1)
  • 12:15 paravoid: cr2-eqiad: setting "chassis network-services enhanced-ip" and rebooting re1 (then re0 will follow)
  • 11:49 paravoid: upgrading cr2-eqiad:re1 and rebooting
  • 11:38 paravoid: cr2-eqiad: toggling mastership between routing-engines (re1->re0)
  • 11:24 paravoid: upgrading cr2-eqiad:re0 and rebooting
  • 11:15 paravoid: cr2-eqiad: deactivate chassis redundancy graceful-switchover
  • 10:59 paravoid: cr2-eqiad: disabling IX/Transit/Fundraising interfaces
  • 10:55 paravoid: cr2-eqiad: deactivating Fundraising BGP session
  • 10:52 paravoid: cr2-eqiad: deactivating Transit BGP sessions
  • 10:50 paravoid: cr2-eqiad: deactivating IX BGP sessions
  • 10:35 paravoid: cr2-eqiad: increase cross-datacenter link OSPF metrics
  • 09:47 gehel: reinstalling and configuring relforge1001/1002 - T137256
  • 08:10 _joe_: restarting apache on palladium
  • 07:47 apergos: restarted gerrit on ytterbium, it was refusing to complete git fetches for large repos (mw core, puppet...)
  • 03:03 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jul 21 03:03:21 UTC 2016 (duration 7m 2s)
  • 02:56 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.11) (duration: 08m 57s)
  • 02:31 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.10) (duration: 09m 33s)

2016-07-20

  • 23:43 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings-labs.php: SWAT, no-op (duration: 00m 24s)
  • 23:42 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.10/extensions/EventBus/EventBus.hooks.php: Add rev_by_bot flag to revision_create event (2/2) (duration: 00m 23s)
  • 23:41 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.10/extensions/EventBus/extension.json: Add rev_by_bot flag to revision_create event (1/2) (duration: 00m 26s)
  • 23:40 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.11/extensions/ORES/: Let ORES extension score for some namespaces instead of all (Gerrit:300083) (duration: 00m 30s)
  • 23:38 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: ORES score edits in main and Property namespaces in wikidatawiki (Gerrit:300086) (duration: 00m 33s)
  • 21:59 mutante: new language "tcy" (Tulu) has been approved and added today - https://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikipedia_Tulu
  • 21:49 mutante: DNS authdns-gen-zones on all servers to add new language tcy (bug T97051)
  • 20:28 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.11/extensions/PagedTiffHandler/: (no message) (duration: 00m 25s)
  • 20:25 ejegg: updated civicrm from a386eb5a76ec97b3b01c46a49309dfa39bbc58b0 to 2f4805fa2d2a7c57881408be2b3a017d26d8f43e
  • 20:20 awight: update paymentswiki from 7c6fb5a3b90fffdf2229cc903fb546e0e1e47998 to f23f15656eb488f5008b45b940077abbaa779004
  • 20:06 logmsgbot: demon@tin Finished scap: group1 to wmf.11 (3rd bestest try) (duration: 46m 03s)
  • 19:38 urandom: Starting Cassandra on restbase1011-b.eqiad.wmnet
  • 19:20 logmsgbot: demon@tin Started scap: group1 to wmf.11 (3rd bestest try)
  • 19:18 logmsgbot: demon@tin scap aborted: group1 to wmf.11 (2nd best try) (duration: 01m 30s)
  • 19:17 logmsgbot: demon@tin Started scap: group1 to wmf.11 (2nd best try)
  • 19:12 logmsgbot: demon@tin scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="fawiki" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.KjzXNQRvAU" ' returned non-zero exit status 1 (duration: 00m 28s)
  • 19:12 logmsgbot: demon@tin Started scap: group1 to wmf.11
  • 17:23 urandom: Stopping restbase1013-c bootstrap (pending better timeouts) : T134016
  • 16:09 logmsgbot: reedy@tin Synchronized wmf-config/extension-list: More to extension registration for l10n (duration: 00m 27s)
  • 16:09 logmsgbot: thcipriani@tin Finished scap: SWAT: Fix spelling of RevisionSlider (T140875) (duration: 23m 18s)
  • 15:52 urandom: Resuming failed bootstrap on restbase2003.codfw.wmnet : T134016
  • 15:48 _joe_: removed ruthenium from the list of trebuchet minions
  • 15:46 logmsgbot: thcipriani@tin Started scap: SWAT: Fix spelling of RevisionSlider (T140875)
  • 15:41 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.11/extensions/RevisionSlider/modules/ext.RevisionSlider.HelpDialog.js: SWAT: Open links in the "tutorial" in the new window (T140875) (duration: 00m 27s)
  • 15:32 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: Beta: Test OresEnabledNamespaces on enwiki (duration: 00m 25s)
  • 15:28 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Let bureaucrats in fawiki remove sysop user group (T140810) (duration: 00m 25s)
  • 15:26 mafk: SWAT for 299446 done, which fixes T140544
  • 15:20 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Configuration changes for he.wikinews.org (duration: 00m 28s)
  • 15:08 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Compact Language Links: To beta in ruwikivoyage (duration: 00m 33s)
  • 15:04 urandom: Amending my last to include 'Rack b'
  • 15:03 urandom: RESTBase Cassandra: raising stream throughput to 25Mbit/s; lowering compaction throughput to 10MB/s : T134016
  • 14:40 Jeff_Green: running authdns-update to deploy SPF/DKIM records for wikipedia.org
  • 14:20 urandom: Restarting bootstrap on restbase1013.eqiad.wmnet (duplicate resume?) : T134016
  • 14:05 urandom: Resuming failed bootstrap on restbase1013-c.eqiad.wmnet : T134016
  • 13:30 urandom: Performing rolling RESTBase restart to work around Cassandra instance restart fallout : T138314 and T138314
  • 13:28 urandom: Restarting restbase1008-a.eqiad.wmnet to apply an (ephemeral) 7200000ms streaming timeout : T138314
  • 13:15 paravoid: cr2-eqiad: setting VRRP priority to 50 for all subnets, effectively switching the VRRP master to cr1-eqiad
  • 13:12 _joe_: transitioning wtp2011-2020
  • 12:35 _joe_: transitioning wtp2002-2010
  • 12:35 logmsgbot: oblivian@palladium conftool action : set/pooled=yes; selector: name=wtp2001.codfw.wmnet
  • 12:26 _joe_: transition ongoing on wtp2001
  • 12:23 logmsgbot: oblivian@palladium conftool action : set/pooled=no; selector: name=wtp2001.codfw.wmnet
  • 12:22 logmsgbot: oblivian@palladium conftool action : set/pooled=no; selector: name=
  • 12:17 _joe_: disabling puppet on all parsoid hosts for the transition to service-runner T90668
  • 07:28 elukey: restarting eventbus on kafka100[12] (T140848)
  • 06:14 _joe_: updating parsoid on wtp100[12]
  • 04:55 mutante: osmium package chromium-browser is missing after upgrade, referred to by jsbench
  • 04:45 mutante: osmium - rsyncing /home , /srv (except /srv/mediawiki created by puppet) back from temp backup on hafnium
  • 04:38 mutante: osmium result: boots into 4.4 kernel which would not work before.. lol
  • 04:35 mutante: osmium edit grub config to boot second entry (3.16), update-grub, reboot
  • 03:00 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jul 20 03:00:43 UTC 2016 (duration 6m 50s)
  • 02:53 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.11) (duration: 07m 55s)
  • 02:29 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.10) (duration: 08m 41s)
  • 02:28 logmsgbot: demon@tin Synchronized wmf-config/CommonSettings.php: delist codereview (duration: 00m 27s)
  • 01:39 logmsgbot: demon@tin Synchronized tests/SiteConfiguration.php: for completeness (duration: 00m 24s)
  • 01:35 mutante: hafnium stopping rsyncd, deleting configs
  • 01:33 mutante: labstore1005, accepting salt key (reinstall 2016-06-25)
  • 01:29 mutante: rhodium, new puppetmaster, add to salt
  • 01:19 mutante: osmium - revoke old puppet cert, salt-key .. sign new ones
  • 01:19 mutante: osmium - after reinstall with jessie, did not boot with 4.4 kernel, _does_ boot with 3.16.04.. still jessie just booted manually into the older kernel in grub
  • 01:13 logmsgbot: demon@tin Synchronized wmf-config/abusefilter.php: Disable abusefilter profiling on commonswiki (duration: 00m 26s)
  • 00:44 logmsgbot: dereckson@tin Finished scap: wmf-config/ upgrade: Gerrit changes 296770, 296767, 296929, 296930, 292623, 292624 (duration: 45m 42s)
  • 00:42 urandom: Temporarily reducing compaction throughput to 10MB/s on restbase1013-c.eqiad.wmnet : T134016
  • 00:32 mutante: osmium - reboot into PXE, reinstall

2016-07-19

  • 23:58 logmsgbot: dereckson@tin Started scap: wmf-config/ upgrade: Gerrit changes 296770, 296767, 296929, 296930, 292623, 292624
  • 22:17 awight: update paymentswiki to fundraising/REL1_27 from e8b600c518b28e3f350ced85d7d1006a76b86596 to 7c6fb5a3b90fffdf2229cc903fb546e0e1e47998
  • 22:10 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/299870/ (duration: 00m 29s)
  • 21:54 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.10/extensions/AbuseFilter: Backported fix for logspam (duration: 00m 38s)
  • 21:43 gwicke: temporarily lowering compaction throughput on all eqiad restbase cassandra instances from 60mb/s to 20mb/s via `nodetool setcompactionthroughput 20` (T140825)
  • 21:23 gwicke: temporarily lowered compaction throughput on all 1012 instances from 60mb/s to 20mb/s via `nodetool setcompactionthroughput 20` (T140825)
  • 21:08 urandom: Bootstrapping restbase2003-c.codfw.wmnet : T134016
  • 20:45 urandom: Lowering compaction throughput to 20MB/s on restbase1013-{a,b}.eqiad.wmnet : T134016
  • 20:44 urandom: Lowering compaction throughput from 35MB/s to 20MB/s on restbase1013-c.eqiad.wmnet : T134016
  • 20:28 urandom: Throttling stream throughput to 20MB/s on all rack 'b' instances : T134016
  • 20:21 urandom: Lowering compaction throughput from 45MB/s to 35MB/s on restbase1013-c.eqiad.wmnet : T134016
  • 20:06 urandom: Disabling Puppet on restbase2003.codfw.wmnet : T134016
  • 20:03 logmsgbot: demon@tin Finished scap: group0 to wmf.11 (duration: 24m 52s)
  • 19:57 urandom: Reducing stream throughput on restbase1013-{a,b} to 20MB/s : T134016
  • 19:47 urandom: Lowering compaction throughput from 60MB/s to 45MB/s on restbase1013-c.eqiad.wmnet : T134016
  • 19:38 logmsgbot: demon@tin Started scap: group0 to wmf.11
  • 19:31 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.9
  • 19:31 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.8
  • 19:31 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.7
  • 19:31 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.7
  • 19:31 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.6
  • 19:12 urandom: Starting bootstrap of restbase1013-c.eqiad.wmnet : T134016
  • 18:59 urandom: Disabling puppet on restbase1013.eqiad.wmnet : T134016
  • 17:56 paravoid: cr1-eqiad: restart chassis-control immediately (should not be traffic affecting)
  • 17:49 jynus: applying new grants to m3 dbs in preparation for db1043 failover/proxy implementation
  • 17:43 mdholloway: mobileapps deployed aa9115a
  • 17:41 mdholloway: starting mobileapps deployment
  • 17:15 logmsgbot: demon@tin Synchronized wmf-config/InitialiseSettings.php: Don't set everywhere, breaks internal to us but external to MW requests (eg gerrit, ocg, etc) (duration: 00m 25s)
  • 17:14 cmjohnson1: dbstore1002 swapping disk at slot 6
  • 16:18 logmsgbot: oblivian@palladium conftool action : set/pooled=yes; selector: name=wtp1002.*,cluster=parsoid,dc=eqiad
  • 16:01 cmjohnson1: swapping pem0-3 on cr1-eqiad
  • 15:49 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.10/extensions/ContentTranslation/includes/AbuseFilterCheck.php: SWAT: Avoid accessing private $filters field (T139657) (duration: 00m 26s)
  • 15:41 logmsgbot: oblivian@palladium conftool action : set/pooled=yes; selector: name=wtp1001.*,cluster=parsoid,dc=eqiad
  • 15:19 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove Echo transition flags PART II (duration: 00m 30s)
  • 15:19 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Remove Echo transition flags PART I (duration: 00m 26s)
  • 15:18 cmjohnson1: replacing PEM0-3 cr2-eqiad
  • 14:23 logmsgbot: oblivian@palladium conftool action : set/pooled=no:weight=15; selector: cluster=parsoid,name=wtp100[12].*
  • 14:22 logmsgbot: oblivian@palladium conftool action : set/weight=0; selector: cluster=parsoid,name=wtp100[12].*
  • 14:10 jynus: reboot and reimage dbproxy1003 to jessie T125027 T138460
  • 13:48 jynus: restarting dbproxy1005 for kernel upgrade
  • 13:44 jynus: reloading dbproxy1001 to repool db1001 as passive backend
  • 10:37 elukey: sent SIGHUP to eventbus on kafka100[12] to reload schemas
  • 10:01 jynus: testing haproxy start sequence on dbproxy1005 (unused proxy)
  • 10:01 mobrovac: scb disabling puppet
  • 09:10 godog: upgrade slapd to 2.4.41+dfsg-1+wmf1 on serpens - T130593
  • 02:30 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jul 19 02:30:08 UTC 2016 (duration 6m 2s)
  • 02:24 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.10) (duration: 08m 37s)

2016-07-18

  • 23:55 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/292622/ (duration: 00m 25s)
  • 23:54 logmsgbot: maxsem@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/292622/ (duration: 00m 24s)
  • 23:54 logmsgbot: maxsem@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/292622/ (duration: 00m 25s)
  • 23:47 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/292621/ (duration: 00m 29s)
  • 23:46 logmsgbot: maxsem@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/292621/ (duration: 00m 31s)
  • 23:39 logmsgbot: maxsem@tin Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/292620/ part 2 (duration: 00m 27s)
  • 23:38 logmsgbot: maxsem@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/292620/ part 1 (duration: 00m 26s)
  • 23:31 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/292619/ part 2 (duration: 00m 24s)
  • 23:30 logmsgbot: maxsem@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/292619/ part 1 (duration: 00m 26s)
  • 23:19 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.10/extensions/MultimediaViewer/: https://gerrit.wikimedia.org/r/#/c/299560/ (duration: 00m 26s)
  • 23:17 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.10/extensions/Kartographer/: https://gerrit.wikimedia.org/r/#/c/299560/ (duration: 00m 27s)
  • 23:15 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/296673/ (duration: 00m 30s)
  • 22:58 bawolff: deploy fix for T129738 to php-1.28.0-wmf.10
  • 22:57 logmsgbot: demon@tin Synchronized wmf-config/CommonSettings.php: rm deprecated fundraising config (duration: 00m 26s)
  • 22:51 mutante: restarted grrrit-wm
  • 22:48 ostriches: lead: puppet turned back on
  • 22:44 dapatrick: Deployed patches for T133147 to wmf.10
  • 22:40 ostriches: lead: disabled puppet for a bit to test some CSS tweaks live.
  • 22:35 mutante: gerrit-new restarting for config change 298710
  • 22:34 dapatrick: Deployed patch for T132926 to wmf.10
  • 22:32 mutante: gerrit restarting for config change 298710
  • 22:18 bawolff: Deployed patch for T136402 on php-1.28.0-wmf.10
  • 22:07 gehel: elasticsearch / kibana upgrade done
  • 22:01 dapatrick: Deployed patch for T115333 to wmf.10
  • 21:52 ebernhardson: brought up kibana4 on logstash.wikimedia.org
  • 21:46 logmsgbot: demon@tin Finished scap: security, se-curity (duration: 08m 12s)
  • 21:38 logmsgbot: demon@tin Started scap: security, se-curity
  • 21:34 ebernhardson: changed replica count for logstash-2016.06-(01|02|03|15|16|17) indices back to 2
  • 21:09 ebernhardson: changed replica count for logstash-2016.06-(01|02|03|15|16|17) indices to 0 to make room for recovering todays index
  • 21:07 logmsgbot: demon@tin Synchronized wmf-config/InitialiseSettings.php: whitelist some rss feeds for mw.org (duration: 00m 43s)
  • 21:01 ejegg: updated payments from 8d3873f8d6b0600331775e9ccfc0cf4c6ed1e181 to e8b600c518b28e3f350ced85d7d1006a76b86596
  • 20:18 ebernhardson: installed elasticsearch-2.3.3 to logstash1001-6
  • 20:02 ebernhardson: re-shutdown elasticsearch on logstash1001-6
  • 19:55 ejegg: updated payments from 0c14940f4930e94a9287acae978cc6e661e54ee1 to 8d3873f8d6b0600331775e9ccfc0cf4c6ed1e181
  • 19:54 ebernhardson: re-stopping logstash on logstash1001-3
  • 19:52 gehel: disabling puppet on logstash.* nodes for elasticsearch upgrade
  • 19:49 mutante: ytterbium - fixing Apache config, graceful
  • 19:47 ebernhardson: shutdown elasticsearch on logstash1004-6
  • 19:38 bd808: Dropped logstash-2016.07.04 through logstash-2016.07.14 indices for backing Elasticsearch upgrade
  • 19:36 ebernhardson: shutting down logstash and elasticsearch on logstash1001-03
  • 19:05 logmsgbot: demon@tin Synchronized private/: remove obsolete wikitech config file (duration: 00m 32s)
  • 19:04 gehel: starting elasticsearch upgrade for logstash (T136001)
  • 18:57 logmsgbot: demon@tin Synchronized wmf-config/: Pruning 1.27.0-wmf.N ExtensionMessages files (duration: 00m 34s)
  • 18:08 ejegg: rolled back payments to 0c14940f4930e94a9287acae978cc6e661e54ee1
  • 18:05 ejegg: updated payments from 0c14940f4930e94a9287acae978cc6e661e54ee1 to 8d3873f8d6b0600331775e9ccfc0cf4c6ed1e181
  • 17:28 logmsgbot: demon@tin Synchronized wmf-config/InitialiseSettings.php: globally set $wgHTTPProxy (duration: 00m 26s)
  • 17:18 mobrovac: mobileapps deploying debb3f6
  • 17:08 gehel: updated wdqs to latest version, new blazegraph version, restart of wdqs-updater
  • 15:23 logmsgbot: oblivian@palladium conftool action : set/pooled=yes; selector: name=mw1261.*
  • 15:21 logmsgbot: oblivian@palladium conftool action : set/pooled=no; selector: name=mw1261.*
  • 15:10 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable global abuse filters on ptwiki (T140395) (duration: 00m 38s)
  • 15:09 hashar: gallium upgrading Zuul: zuul_2.1.0-151-g30a433b-wmf3precise1 zuul_2.1.0-151-g30a433b-wmf4precise1_amd64.deb . To support layout validation when multiple connections are used
  • 14:49 mobrovac: mobileapps deploying dfe5f11f5
  • 14:11 mobrovac: mobileapps deploying fb65cea
  • 12:03 hashar: Gerrit was slow processing requests such as git pull since 11:17 UTC . Fixed by killing all idling/waiting tasks T140604
  • 11:08 godog: swift codfw-prod: ms-be202[567] weight 3000 - T136630
  • 10:10 jynus: hard reset for db2056 T140598
  • 08:54 godog: swift eqiad-prod: ms-be102[3-6] to weight 500 - T136631
  • 08:31 hashar: gallium: upgrading Zuul 2.1.0-95-g66c8e52-wmf1precise1 .. zuul_2.1.0-151-g30a433b-wmf3precise1 T137525
  • 02:26 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jul 18 02:26:24 UTC 2016 (duration 5m 43s)
  • 02:20 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.10) (duration: 08m 23s)

2016-07-17

  • 10:31 godog: restart slapd on serpens - T130593
  • 02:26 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jul 17 02:26:21 UTC 2016 (duration 5m 42s)
  • 02:20 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.10) (duration: 08m 24s)

2016-07-16

  • 23:16 Krenair: testing that the SAL bot is still working
  • 20:42 logmsgbot: twentyafterfour@tin Synchronized wmf-config/interwiki.php: (no message) (duration: 00m 26s)
  • 20:41 twentyafterfour: deploying interwiki https config change https://gerrit.wikimedia.org/r/#/c/299299 refs T140206
  • 20:32 awight: update paymentswiki from 25c97ba0f27b61859f90fd205c53d587c2838fec to 0c14940f4930e94a9287acae978cc6e661e54ee1
  • 18:31 awight: enable LogCompleted for Ingenico
  • 18:27 awight: update paymentswiki from 8bf6e911eb43a2d369bf656f07d1b51be0a54f6c to 25c97ba0f27b61859f90fd205c53d587c2838fec
  • 02:27 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jul 16 02:27:00 UTC 2016 (duration 5m 44s)
  • 02:21 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.10) (duration: 08m 22s)

2016-07-15

  • 21:12 awight: reenable donation queue
  • 21:11 awight: update civicrm from cea316cc57c511c645a92a003028c95e19cac877 to a386eb5a76ec97b3b01c46a49309dfa39bbc58b0
  • 20:32 awight: disable donations queue consumer
  • 18:29 logmsgbot: ori@tin Synchronized wmf-config/InitialiseSettings.php: I8bf7c8dd: Lower default $wgSquidMaxage from 31 days to 14 days (duration: 00m 39s)
  • 15:49 mobrovac: restbase deploy end of 731284b
  • 15:37 mobrovac: restbase deploy start of 731284b
  • 15:29 ottomata: restarting hadoop-mapreduce-historyserver to apply yarn log aggregation retention settings
  • 13:40 godog: stress-test spinning disks on ms-be102[3-6]
  • 12:08 bblack: varnish: rolling frontend restarts for text+upload done
  • 11:56 bblack: varnish: starting rolling, depooled restart of text and upload frontend caches
  • 11:03 godog: swift codfw-prod: ms-be202[567] weight 2500
  • 10:30 twentyafterfour: deployed rPHABacb736547c6595fe09e05bafd7a3b563d3cf67c8 and rPHABcf12fdf248df82dc414d96bddd147c058bc3d636 to address maniphest task dependency graphs. Now related tasks will be shown as a plain list when there are too many tasks to graph.
  • 10:15 mobrovac: restbase deploy end of 018864b
  • 10:04 mobrovac: restbase deploy start of 018864b
  • 09:33 jynus: reenabling semisync replication throughout s4
  • 09:31 jynus: restarted circular replication from db2019 -> db1040
  • 09:22 jynus: updating dns record for s4-master.eqiad.wmnet
  • 08:13 jynus: reimporting x1 partial db copy on dbstore1002 from x1-master
  • 07:30 moritzm: installing PHP security updates on jessie systems
  • 07:27 _joe_: powercycling mw1280
  • 06:47 moritzm: installing nspr security updates
  • 06:08 moritzm: installing libarchive security updates
  • 02:35 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jul 15 02:35:16 UTC 2016 (duration 6m 14s)
  • 02:29 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.10) (duration: 07m 54s)

2016-07-14

  • 23:11 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.10/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: T137169: Turn off TextCat A/B test (duration: 00m 34s)
  • 21:08 matt_flaschen: Started backfillReadBundles.php on all group 2 wikis
  • 21:07 matt_flaschen: Started backfillUnreadWikis.php --rebuild on all group 2 wikis
  • 20:40 yurik: restarted graphoid with the new settings, enabling geoshape protocol
  • 20:02 ottomata: restarting hadoop-yarn-resourcemanager on analytics1002 and then analytics1001 to apply yarn log aggregation change
  • 19:56 logmsgbot: aude@tin Synchronized php-1.28.0-wmf.10/extensions/RevisionSlider: touching js and resource files (duration: 00m 28s)
  • 19:22 logmsgbot: demon@tin Synchronized wmf-config/CommonSettings-labs.php: maps geoshapes stuff for yurik (labs file for completeness) (duration: 00m 27s)
  • 19:21 logmsgbot: demon@tin Synchronized wmf-config/CommonSettings.php: maps geoshapes stuff for yurik (duration: 00m 31s)
  • 19:19 logmsgbot: demon@tin rebuilt wikiversions.php and synchronized wikiversions files: Move remaining wikis to wmf.10
  • 19:00 urandom: Dropping legacy Cassandra system_auth tables in RESTBase production to complete RBAC conversion : T139639
  • 15:52 logmsgbot: aude@tin Synchronized wmf-config/CommonSettings-labs.php: (no message) (duration: 00m 32s)
  • 15:50 logmsgbot: aude@tin Synchronized dblists/clldefault.dblist: Enable compact language lists on more wikis (duration: 00m 51s)
  • 15:40 jynus: shutting down es2018, pc2004, es2005 for hardware maintenance T139714
  • 15:35 logmsgbot: aude@tin Finished scap: Update i18n for RevisionSlider (duration: 46m 58s)
  • 15:08 jynus: shutting down es2014, es2015, es2016 for hardware maintenance T139714
  • 14:56 bblack: cache_misc: manually raised default_ttl to 3600 (to match https://gerrit.wikimedia.org/r/#/c/298970/ without restarts)
  • 14:48 logmsgbot: aude@tin Started scap: Update i18n for RevisionSlider
  • 14:48 jynus: shutting down es2011, es2012, es2013 for hardware maintenance T139714
  • 14:47 logmsgbot: aude@tin scap aborted: (no message) (duration: 00m 02s)
  • 14:47 logmsgbot: aude@tin Started scap: (no message)
  • 14:45 logmsgbot: aude@tin Synchronized wmf-config/CommonSettings.php: Enable RevisionSlider on test wikis (duration: 00m 28s)
  • 14:44 logmsgbot: aude@tin Synchronized wmf-config/InitialiseSettings.php: Enable RevisionSlider on test wikis (duration: 00m 27s)
  • 14:38 logmsgbot: aude@tin Synchronized wmf-config/extension-list: Add RevisionSlider to extension-list (duration: 00m 42s)
  • 09:58 godog: powercycle ms-be1012, adding back replaced disk
  • 08:40 Amir1: for ores
  • 08:40 Amir1: deploying 0e9555f to scb nodes
  • 08:39 elukey: restarted hhvm on mw1289 mw1280 mw1288 mw1284 mw1287
  • 08:18 jynus: running "megacli -PDOffline -PhysDrv '[32:6]' -aALL" on dbstore1002 to debug issue T140337
  • 08:06 elukey: upgrading cache misc to varnishkafka 1.0.11-1
  • 08:03 _joe_: removing appservers mw1018-25 from service via conftool for decommissioning (T139353)
  • 08:01 elukey: removing api servers mw111[4-9] from service via conftool as first decom step (T139353)
  • 07:55 elukey: removing api servers mw112[0-9] from service via conftool as first decom step (T139353)
  • 06:45 moritzm: restarted hhvm on mw1170
  • 06:07 moritzm: upgrading hhvm in codfw
  • 04:22 twentyafterfour: Phabricator hotfix: applied patch to disable task graph on tasks with > 100 related tasks.
  • 03:15 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jul 14 03:14:57 UTC 2016 (duration 7m 16s)
  • 03:07 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.10) (duration: 15m 50s)
  • 02:37 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.8) (duration: 15m 17s)
  • 00:36 matt_flaschen: Ran backfillReadBundles on labtestwiki
  • 00:35 matt_flaschen: Started backfillReadBundles on labswiki
  • 00:24 matt_flaschen: Started backfillUnreadWikis --rebuild and backfillReadBundles for all group 0 and group 1 wikis earlier
  • 00:06 twentyafterfour: Phabricator maintenance completed. Service restored
  • 00:01 twentyafterfour: preparing to take Phabricator offline momentarily for scheduled maintenance / upgrade. Service should be restored within a couple of minutes.

2016-07-13

  • 23:31 eileen: update CiviCRM from 0898bb9360fe4a5ddea1a41d4e3f3e9823afee27 to cea316cc57c511c645a92a003028c95e19cac877
  • 23:27 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/298822 (duration: 00m 26s)
  • 23:25 logmsgbot: krenair@tin Synchronized static/images/project-logos: update for https://gerrit.wikimedia.org/r/298819 and https://gerrit.wikimedia.org/r/298822 (duration: 00m 24s)
  • 23:18 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/225509 + https://gerrit.wikimedia.org/r/298899 - create https://meta.wikimedia.org/wiki/Special:Contact/stewards (duration: 00m 26s)
  • 23:17 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.10/includes/resourceloader/ResourceLoaderStartUpModule.php: I882bf7075: ResourceLoader: Update expected length of module version hash (duration: 00m 25s)
  • 21:55 mutante: ytterbium - puppet enabled again, fix deployed
  • 21:48 mutante: ytterbium, disabled puppet, started apache, needs fix
  • 20:40 logmsgbot: demon@tin Synchronized README: no-op to bring co-masters in sync (duration: 00m 28s)
  • 20:28 bearND: deployed mobileapps d1eb1da
  • 20:20 logmsgbot: demon@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.10
  • 20:19 bearND: starting mobileapps deploy
  • 20:03 ebernhard|lunch: Update codfw elasticsearch cluster settings with cluster.routing.allocation.disk.watermark.low: 70% to match eqiad and reduce free space icinga warnings
  • 19:28 ostriches: gerrit: restarting, puppet back on, issue fixed.
  • 19:23 logmsgbot: anomie@tin Synchronized php-1.28.0-wmf.10/includes/auth/AuthManager.php: Add timing data logging for T119736 (duration: 00m 28s)
  • 19:23 logmsgbot: anomie@tin Synchronized php-1.28.0-wmf.8/includes/auth/AuthManager.php: Add timing data logging for T119736 (duration: 00m 27s)
  • 19:14 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.8/includes/api/ApiQueryRecentChanges.php: API: Remove index forcing in ApiQueryRecentChanges - T140108 (duration: 00m 26s)
  • 19:04 logmsgbot: demon@tin rebuilt wikiversions.php and synchronized wikiversions files: all group0 to wmf.10
  • 18:41 ostriches: gerrit/ytterbium: flapped for a minute because of incompat 2.12/2.8 config. Working, puppet disabled pending real fix.
  • 18:40 mutante: gerrit has a temp problem. maintenance going on
  • 18:34 logmsgbot: krenair@tin Synchronized wmf-config: labs-only change, should be a noop here: https://gerrit.wikimedia.org/r/298812 (duration: 00m 27s)
  • 18:32 mutante: gerrit will restart shortly for a config change. expect a very short downtime
  • 18:22 legoktm: checkLocalUser.php finished, starting run #2 now
  • 17:58 logmsgbot: legoktm@tin Synchronized wmf-config/: PoolCounterClient.php -> extension.json (duration: 00m 32s)
  • 17:44 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.10/extensions/WikimediaEvents/WikimediaEventsHooks.php: Include the namespace for all pages & Include the resolved special page name for special pages - T138500 (duration: 00m 36s)
  • 17:41 logmsgbot: demon@tin Finished scap: wmf.10 code sync + testwiki to wmf.10 for l10n cache gen (once more with feeling) (duration: 47m 06s)
  • 17:35 ejegg: turned on cURL verbose logging for AstroPay requests on payments
  • 17:27 jynus: drop databases fab_migration, percona and test from m3 T138460
  • 17:25 _joe_: restarting hhvm on mw1229 (stuck in HPHP::Treadmill::getAgeOldestRequest)
  • 17:22 urandom: Starting restbase on restbase1013.eqiad.wmnet
  • 16:54 logmsgbot: demon@tin Started scap: wmf.10 code sync + testwiki to wmf.10 for l10n cache gen (once more with feeling)
  • 16:50 ejegg: updated CentralNotice for cookie cleanup
  • 16:48 logmsgbot: ejegg@tin Synchronized php-1.28.0-wmf.8/extensions/CentralNotice/: (no message) (duration: 01m 52s)
  • 16:47 logmsgbot: demon@tin scap failed: OSError [Errno 1] Operation not permitted: '/var/lock/scap' (duration: 00m 00s)
  • 16:41 logmsgbot: demon@tin scap aborted: wmf.10 code sync + testwiki to wmf.10 for l10n cache gen (duration: 00m 37s)
  • 16:40 logmsgbot: demon@tin Started scap: wmf.10 code sync + testwiki to wmf.10 for l10n cache gen
  • 16:39 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.9
  • 16:38 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.7
  • 16:38 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.6
  • 16:37 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.5
  • 16:19 hashar: CI slightly overloaded / backlogged due to a long tail of Wikibase changes sent in Gerrit.
  • 16:03 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: [Beta] Change ORES thresholds in beta (duration: 00m 29s)
  • 16:01 logmsgbot: thcipriani@tin Synchronized wmf-config/LabsServices.php: SWAT: [Beta] Parsoid: direct traffic to deployment-parsoid07 (duration: 00m 26s)
  • 15:30 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable ORES review tool for Turkish Wikipedia (T139992) (duration: 00m 28s)
  • 15:11 urandom: Stopping Staging dumps : T139639
  • 14:44 urandom: Starting offset dump runs from {xenon,cerium,praseodymium}.eqiad.wmnet : T139639
  • 14:38 urandom: Restarting Cassandra on xenon.eqiad.wmnet : T139639
  • 14:32 urandom: Restarting RESTBase on xenon.eqiad.wmnet : T139639
  • 14:29 yurik: deployed kartotherian https://gerrit.wikimedia.org/r/#/c/298731/ & tilerator https://gerrit.wikimedia.org/r/#/c/298732/
  • 14:25 moritzm: depooling mw1298 (image scaler) for some tests
  • 14:21 godog: reboot ms-be1012, many mkfs.xfs stuck on broken sdh
  • 14:17 logmsgbot: oblivian@palladium conftool action : delete; selector: cluster=rcstream
  • 14:16 gehel: shutting down elastic1001-1016 (T139758)
  • 14:16 logmsgbot: oblivian@palladium conftool action : delete; selector: cluster=rcstream
  • 14:13 yurik: about to deploy updated kartotherian & tilerator for node 4.4.6
  • 14:09 urandom: Dropping legacy system_auth tables in staging to complete RBAC conversion : T139639
  • 14:05 bblack: rcstream cleanup done, puppet re-enabled on relevant lvs and rcs100x
  • 13:58 gehel: cleanup puppet / salt from old elasticsearch servers elastic1001-1016 (T139758)
  • 13:58 hashar: T137525 reverted Zuul back to zuul_2.1.0-95-g66c8e52-wmf1precise1_amd64.deb . It could not connect to Gerrit reliably
  • 13:45 ottomata: restarting hadoop nodemanagers to apply log aggregation retention check interval change
  • 13:43 bblack: restarting pybal on primary eqiad high-traffic2 (lvs1002)
  • 13:41 moritzm: upgrading hhvm on remaining appservers in eqiad and codfw
  • 13:34 gehel: disabling puppet and stopping elasticsearch on elastic1001-1016 (T139758)
  • 13:30 bblack: disabling puppet on rcs100[12] for rcstream cleanup
  • 13:29 bblack: disabling puppet on eqiad high-traffic2 lvs for rcstream cleanup
  • 13:20 gehel: scheduling icinga downtime on elastic1001-1016 prior to decommissioning (T139758)
  • 13:01 elukey: upgrading cache maps to varnishkafka 1.0.11-1
  • 13:00 elukey: uploaded varnishkafka 1.0.11-1 to jessie-wikimedia experimental
  • 12:53 hashar: CI is processing with Zuul 2.1.0-151-g30a433b. It might stop processing events at any time though due to T137525
  • 12:36 hashar: T137525 Upgrading Zuul 2.1.0-95-g66c8e52-wmf1precise1 ... zuul_2.1.0-151-g30a433b-wmf1precise1_amd64.deb
  • 11:52 elukey: installing varnishkafka_1.0.11-1 on cp3008.esams to test it before the complete rollout
  • 11:38 paravoid: cr1-ulsfo: "restart snmp" to fix SNMP hiccup after reboot
  • 11:38 paravoid: cr1/2-ulsfo: disabling flow monitoring
  • 04:22 logmsgbot: legoktm@tin Synchronized wmf-config/InitialiseSettings.php: revert http logging change (duration: 00m 31s)
  • 04:20 logmsgbot: legoktm@tin Synchronized wmf-config/interwiki.php: Update interwiki map, make them HTTPS (duration: 00m 39s)
  • 03:44 logmsgbot: legoktm@tin Synchronized wmf-config/InitialiseSettings.php: Log 'http' at warning level to debug transwiki import errors (duration: 00m 29s)
  • 02:34 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jul 13 02:34:52 UTC 2016 (duration 6m 8s)
  • 02:28 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.8) (duration: 10m 11s)
  • 00:56 ori: Restarted grrrit-wm
  • 00:41 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.8/extensions/VisualEditor/: https://gerrit.wikimedia.org/r/#/c/298677/ (duration: 01m 01s)
  • 00:07 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.8/extensions/Echo/includes/ForeignWikiRequest.php: getCentralAuthToken visibility back to protected (Gerrit:298661) (duration: 00m 27s)

2016-07-12

  • 23:43 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Add flow-create-board for gomwiki sysop (T139226) (duration: 00m 27s)
  • 23:41 mutante: lithium deleted some logs older than 60 days to make space
  • 23:33 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Enable lazy loaded references and images on Thai wikipedia (T136731) (duration: 00m 38s)
  • 23:29 mutante: stat1003: a mongodb still gets started on every puppet run, over and over again
  • 23:12 ejegg: updated payments from d9f7027340e5311f38c4224c2fddde087467df87 to 8bf6e911eb43a2d369bf656f07d1b51be0a54f6c
  • 23:12 ostriches: lead: puppet disabled for a bit while index building is in progress.
  • 23:12 ostriches: ytterbium: puppet enabled again, all happy again
  • 22:56 ostriches: ytterbium: disabled puppet for a moment so we can do a config change w/o gerrit restarting itself
  • 22:06 eileen: upgrade CiviCRM from f7434730ebd87f6d542c34c080c61eb3f21ccc6b to 0898bb9360fe4a5ddea1a41d4e3f3e9823afee27
  • 21:52 eileen: Updating CiviCRM from 415d7e62bc3bcbd7c5e3682da64ee4847ad63f5b to f7434730ebd87f6d542c34c080c61eb3f21ccc6b
  • 21:24 logmsgbot: mattflaschen@tin Synchronized php-1.28.0-wmf.8/extensions/Echo/includes/ForeignWikiRequest.php: T140144: Echo/CentralAuth: Bail if not fully initialized (duration: 00m 49s)
  • 20:18 urandom: Start revision culling script for local_group_wikipedia_T_parsoid_html, from restbase1009.eqiad.wmnet : T140008
  • 19:30 logmsgbot: mattflaschen@tin Synchronized php-1.28.0-wmf.8/extensions/Echo/includes/ForeignWikiRequest.php: T119736: T140144: Troubleshoot why Echo is still triggering CA failures (duration: 00m 39s)
  • 19:02 mutante: git pulled on strontium to sync with palladium
  • 18:48 matt_flaschen: Started checkLocalUser.php at ~2016-07-12 17:45 UTC, killed ~18:06 since Echo apparently is not fully fixed after all.
  • 18:38 legoktm: foreachwiki ../../../../home/legoktm/checkLocalUser.php --delete=1 --verbose=1 on terbium
  • 18:24 logmsgbot: anomie@tin Synchronized php-1.28.0-wmf.9/includes/auth/AuthManager.php: Commit transaction after auto-creating a user gerrit:298541 (duration: 00m 29s)
  • 18:22 logmsgbot: anomie@tin Synchronized php-1.28.0-wmf.8/includes/auth/AuthManager.php: Commit transaction after auto-creating a user gerrit:298540 (duration: 00m 30s)
  • 18:08 logmsgbot: demon@tin Synchronized wmf-config/InitialiseSettings.php: turn cx back on (duration: 00m 29s)
  • 17:44 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.8/extensions/ContentTranslation/: ping limiter fixes (duration: 00m 29s)
  • 17:37 jynus: out of band ALTER TABLE recentchanges ADD KEY `name_type_patrolled_timestamp` on db1054 T140108
  • 16:58 logmsgbot: demon@tin Synchronized wmf-config/InitialiseSettings.php: prep pinglimiter config for content translation (duration: 00m 33s)
  • 16:56 Amir1: deploying ores f472f65 to scb
  • 16:51 Amir1: deploying ores f472f65 to scb2001
  • 16:42 godog: disable puppet on ms-fe* and re-enable gradually to apply https://gerrit.wikimedia.org/r/#/c/298297/
  • 16:25 logmsgbot: mattflaschen@tin Synchronized php-1.28.0-wmf.8/extensions/Echo/includes/ForeignWikiRequest.php: T119736: ForeignWikiRequest: Bail early for non-global users (duration: 00m 32s)
  • 16:12 logmsgbot: demon@tin Synchronized wmf-config/InitialiseSettings.php: Disable content translation, outage right now (duration: 00m 29s)
  • 15:44 bblack: cache nodes: salt manual removal of vm compaction cron via sed ( https://gerrit.wikimedia.org/r/298499 )
  • 15:33 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.9/extensions/Echo/includes/ForeignWikiRequest.php: SWAT: ForeignWikiRequest: Bail early for non-global users (T119736) (duration: 00m 31s)
  • 15:29 logmsgbot: thcipriani@tin Synchronized portals: SWAT: Bumping portals to master (T128546) (duration: 00m 29s)
  • 15:28 logmsgbot: thcipriani@tin Synchronized portals/prod/wikipedia.org/assets: SWAT: Bumping portals to master (T128546) (duration: 00m 29s)
  • 15:20 bblack: upgrading nginx to 1.11.2-1+wmf1 on all caches
  • 15:18 logmsgbot: thcipriani@tin Synchronized static/images/project-logos/trwikimedia.png: SWAT: Revert Logo update for trwikimedia (T140015) (duration: 00m 29s)
  • 15:09 logmsgbot: thcipriani@tin Synchronized static/images/project-logos/trwikimedia.png: SWAT: Logo update for trwikimedia (T140015) (duration: 00m 33s)
  • 14:58 bblack: upgrading nginx to 1.11.2-1+wmf1 on cache_maps
  • 14:33 elukey: Rebuild new AQS Cassandra cluster (aqs100[456]) to remove previous testing settings (no prod traffic is served)
  • 14:23 logmsgbot: reedy@tin Synchronized wmf-config/extension-list: even more extension.json (duration: 00m 26s)
  • 14:17 bblack: nginx 1.11.2-1+wmf1 uploaded to carbon
  • 14:17 logmsgbot: reedy@tin Synchronized wmf-config/extension-list: moar extension.json (duration: 00m 26s)
  • 14:04 bblack: lvs nodes: apt-get install linux-meta
  • 13:52 bblack: lvs nodes: apt-get upgrade to latest (various base system packages)
  • 13:49 bblack: cache nodes: apt-get upgrade to latest (just 3.16 kernel)
  • 13:05 ottomata: restarting nodemanagers on analytics 1039 1046 and 1054
  • 10:45 godog: terbium:~# lvextend --size +70G -r /dev/mapper/terbium--vg-root T139786
  • 09:35 gehel: lowering elasticsearch codfw high watermark to rebalance cluster
  • 09:32 godog: reboot ms-be3004 / high load average and xfs unhappy
  • 09:22 godog: progressively delete esams swift containers, unused and not in production
  • 07:49 elukey: removing api servers mw113[0-9] from service via conftool as first decom step (T139353)
  • 06:31 logmsgbot: legoktm@tin Synchronized wmf-config/CommonSettings.php: Don't block logins if CentralAuthUser::queryAttached() fails - T119736 (duration: 00m 27s)
  • 06:00 legoktm: running checkLocalUser.php on terbium
  • 04:37 ori: Reverted all wikis to wmf8 due to tenfold increase in T119736
  • 04:35 logmsgbot: ori@tin rebuilt wikiversions.php and synchronized wikiversions files: (no message)
  • 02:26 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jul 12 02:26:41 UTC 2016 (duration 5m 31s)
  • 02:21 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.9) (duration: 08m 44s)

2016-07-11

  • 23:49 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.9/resources/: https://gerrit.wikimedia.org/r/#/c/298402/ (duration: 00m 28s)
  • 23:36 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.9/extensions/Echo/: https://gerrit.wikimedia.org/r/#/c/298400/ (duration: 00m 33s)
  • 23:32 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.9/extensions/Citoid/: https://gerrit.wikimedia.org/r/#/c/298327/ (duration: 00m 27s)
  • 23:30 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.9/extensions/Wikidata: https://gerrit.wikimedia.org/r/#/c/298386/ (duration: 02m 00s)
  • 23:15 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/297964/ (duration: 00m 32s)
  • 23:11 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/297556/ (duration: 00m 28s)
  • 23:07 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.9/extensions/Kartographer/: https://gerrit.wikimedia.org/r/#/c/297556/ (duration: 00m 29s)
  • 22:46 logmsgbot: krenair@tin Synchronized wmf-config: more labs-only changes: https://gerrit.wikimedia.org/r/298393 (duration: 00m 36s)
  • 21:50 logmsgbot: krenair@tin Synchronized wmf-config: sync labs-only change, should be a noop here: https://gerrit.wikimedia.org/r/#/c/298299/ (duration: 00m 39s)
  • 21:40 awight: resuming fundraising donations queue consumer
  • 21:37 mutante: ytterbium - graceful'ed Apache, warning about duplicate NameVirtualHost is gone
  • 21:03 bd808: Updated default mapping for logstash-* index creation using json generated by https://gerrit.wikimedia.org/r/#/c/298295/. Should take effect starting with the logstash-2016.07.12 index.
  • 20:44 logmsgbot: reedy@tin Synchronized wmf-config/extension-list-labs: noop for prod (duration: 00m 32s)
  • 20:31 ottomata: rolling restart of hadoop-yarn-nodemanager to apply log aggregation retention seconds
  • 20:29 mdholloway: mobileapps deployed df16702
  • 20:25 mdholloway: starting mobileapps deployment
  • 20:20 awight: disabled fundraising donation queue consumer...
  • 20:09 subbu: finished deploying parsoid sha e738c415
  • 20:05 subbu: synced new parsoid code; restarted parsoid on wtp1001 as a canary
  • 20:03 subbu: starting parsoid deploy
  • 18:37 chasemp: new hd for failed array in labstore2001
  • 18:23 ejegg: updated payments from 2fc573cbb94e833c4144aa9dad79de8ec374bb09 to d9f7027340e5311f38c4224c2fddde087467df87
  • 18:08 mutante: welcome new mediawiki deployer Brian Wolff (T138635)
  • 17:44 ejegg: updated CiviCRM from f477a42014dd1e6759849b347d5f73d710954d0b to bf029eecb9bfb49d267e60d76344b0170bfa0a83
  • 17:09 awight: reenable fundraising campaigns
  • 17:07 gehel: starting deployment of latest WDQS (second time deploying with scap3)
  • 16:52 twentyafterfour: unclog the phabricator task queue (phd) by cherry-picking upstream fix 12c6f87ca to wmf/stable (+restarted phd on iridium)
  • 15:47 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Update logo settings for the Nepali Wikipedia (T139240) PART II (duration: 00m 26s)
  • 15:46 logmsgbot: thcipriani@tin Synchronized static/images/project-logos: SWAT: Update logo settings for the Nepali Wikipedia (T139240) PART I (duration: 00m 27s)
  • 15:43 elukey: restarted hhvm on mw1170 (Apache errors while reading FCGI headers, HHVM dump debug in /tmp/hhvm.14968.bt.)
  • 15:40 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow to import from zh.wikipedia to beta.wikiversity (T139922) (duration: 00m 26s)
  • 15:37 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: State Compact Language Links is not beta anymore (T136677) (duration: 00m 26s)
  • 15:33 logmsgbot: thcipriani@tin Synchronized wmf-config/interwiki.php: SWAT: Update interwiki map (duration: 00m 28s)
  • 15:25 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: HD version for sqwikiquote logos (T139229) PART II (duration: 00m 27s)
  • 15:25 logmsgbot: thcipriani@tin Synchronized static/images/project-logos: SWAT: HD version for sqwikiquote logos (T139229) PART I (duration: 00m 25s)
  • 15:19 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add contentdm.lib.byu.edu to wgCopyUploadsDomains (T139095) (duration: 00m 26s)
  • 15:13 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: SWAT: Remove old throttle rules (duration: 00m 30s)
  • 15:06 elukey: removing mw114[0-8] from service via conftool as first decom step (T139353)
  • 15:06 logmsgbot: thcipriani@tin Synchronized static/images/project-logos/sqwikiquote.png: SWAT: Change Albanian Wikiquote logo (T139229) (duration: 00m 34s)
  • 14:46 ema: upgrading cache_misc to varnish 4.1.3-1wm1
  • 14:39 _joe_: shutting down mw1090-1113,mw1149-51 for decommissioning
  • 14:05 moritzm: rebooting achernar for kernel update
  • 13:31 moritzm: rebooting acamar for kernel update
  • 13:07 ema: upgrading esams cache_maps to varnish 4.1.3-1wm1
  • 12:51 gehel: upgrading nodejs to 4.4.6 on maps2.* servers
  • 12:24 ema: upgrading ulsfo cache_maps to varnish 4.1.3-1wm1
  • 12:14 elukey: restarted hhvm on mw1261
  • 12:03 ema: upgrading codfw cache_maps to varnish 4.1.3-1wm1
  • 11:30 ema: upgrading eqiad cache_maps to varnish 4.1.3-1wm1
  • 11:19 moritzm: installing hhvm updates on canary app servers
  • 10:25 hashar: CI: upgraded Chromium from v49 to v51 (v50 caused qunit jobs to fail / timeout randomly) T136188
  • 10:00 godog: swift codfw-prod: ms-be202[567] weight 2000
  • 09:47 moritzm: installing GCC stable updates on trusty systems (also provides some runtime libs in addition to GCC itself)
  • 09:40 ema: upgrading cp1046 to varnish 4.1.3-1wm1
  • 07:29 mobrovac: change-prop deploying 2b699a6
  • 07:26 mobrovac: graphoid deploying 375d31fd
  • 07:24 mobrovac: mathoid deploying 669cfc0
  • 07:19 mobrovac: cxserver deploying fd8eca47e
  • 07:15 mobrovac: citoid deploying 274c0231d
  • 07:13 mobrovac: mobileapps deploying 6e409f46
  • 06:34 _joe_: restarted hhvm on mw1168
  • 06:13 moritzm: restarted saltmaster on neodymium
  • 02:26 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jul 11 02:26:55 UTC 2016 (duration 5m 41s)
  • 02:21 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.9) (duration: 08m 41s)

2016-07-10

  • 02:25 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jul 10 02:25:55 UTC 2016 (duration 5m 43s)
  • 02:20 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.9) (duration: 08m 34s)

2016-07-09

  • 19:54 bd808: restarted logstash on logstash1003 for de-dot plugin update (T136001)
  • 19:52 bd808: restarted logstash on logstash1002 for de-dot plugin update (T136001)
  • 19:50 bd808: restarted logstash on logstash1001 for de-dot plugin update (T136001)
  • 19:46 bd808: Updated logstash/plugins to 18b3f1f (Fix de_dot to process keys with falsey values)
  • 02:29 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jul 9 02:29:53 UTC 2016 (duration 6m 7s)
  • 02:23 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.9) (duration: 08m 21s)
  • 00:28 mutante: analytics1027 - icinga reported a puppet failure; ran puppet manually and it recovered. Same on neon; something to do with the kafka graphite checks
  • 00:06 mutante: gerrit is back
  • 00:05 ostriches: gerrit: disabled puppet for a minute so I can unbreak gerrit so I can fix gerrit in puppet.

2016-07-08

  • 23:57 mutante: behold, gerrit might restart now for config change
  • 23:52 Amir1: manually restarting uwsgi-ores in scb1001
  • 23:48 logmsgbot: krinkle@tin Synchronized docroot/foundation/: Remove unused docroot/foundation/index.html (duration: 00m 30s)
  • 23:32 Amir1: manually restarting uwsgi-ores on scb1002
  • 22:59 urandom: Restarting restbase1015-b.eqiad.wmnet to cancel running streams : T139362
  • 22:26 eileen: updated CiviCRM from bdf2afd417b70332c9542fd3ee4f14cb4e6f93cc to f477a42014dd1e6759849b347d5f73d710954d0b
  • 22:13 urandom: Restarting restbase1014-b.eqiad.wmnet to cancel running streams : T139362
  • 22:02 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.9/extensions/CentralAuth/: Fix job serializing (and status display on Special:GlobalRenameProgress) - T137973 (duration: 00m 32s)
  • 21:59 logmsgbot: reedy@tin Synchronized wmf-config/extension-list: Fix RestBaseUpdateJobs in extension-list (duration: 00m 36s)
  • 21:51 logmsgbot: reedy@tin Synchronized wmf-config/CommonSettings.php: Use Canonical entry point for RestBaseUpdateJobs (duration: 00m 33s)
  • 21:50 logmsgbot: reedy@tin Synchronized multiversion/MWWikiversions.php: Remove some else if spaces (duration: 00m 46s)
  • 21:48 urandom: Restarting restbase1009-b.eqiad.wmnet to cancel running streams : T139362
  • 21:31 elukey: mw1146 powercycled (memory pressure, no ssh/root login)
  • 21:17 urandom: Restarting restbase1015-a.eqiad.wmnet to cancel running streams : T139362
  • 21:11 urandom: Restarting restbase1014-a.eqiad.wmnet to cancel running streams : T139362
  • 21:04 urandom: Restarting restbase1009-a.eqiad.wmnet to cancel running streams : T139362
  • 20:59 urandom: Forcing node removal (restbase1014-c.eqiad.wmnet) : T139362
  • 20:51 urandom: Throttle RESTBase Cassandra outgoing streams to 1mbit cluster-wide : T139362 (actually happened at 21:26)
  • 20:17 bd808: Deleted old l10nupdate caches manually on tin (T130317)
  • 19:56 urandom: Throttle RESTBase Cassandra outgoing streams to 3mbit cluster-wide : T139362
  • 18:52 anomie: Attempting to resubmit LocalRenameUserJobs for T137973
  • 18:48 urandom: "This is going to hurt me more than it does you."; `nodetool removenode' of restbase1014-c.eqiad.wmnet : T139362
  • 18:39 urandom: Stopping restbase1014-c.eqiad.wmnet : T139362
  • 18:29 mutante: db1042 - temp stop puppet, edit ferm rules to allow testing from lead
  • 14:43 jynus: rechecking data consistency after m3 table fixes (could cause lag)
  • 12:21 moritzm: installing glib updates from jessie point release
  • 12:03 jynus: stopping replication on db1043 (m3-slave) for maintenance
  • 09:33 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: aqs1002.eqiad.wmnet
  • 09:30 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: aqs1002.eqiad.wmnet
  • 09:30 elukey: upgrading nodejs packages on aqs100[23]
  • 09:10 logmsgbot: elukey@palladium conftool action : set/pooled=yes:weight=5; selector: mw1261.eqiad.wmnet
  • 09:06 logmsgbot: elukey@palladium conftool action : set/pooled=no:weight=5; selector: mw1261.eqiad.wmnet
  • 09:03 logmsgbot: elukey@palladium conftool action : set/pooled=yes:weight=5; selector: mw1261.eqiad.wmnet
  • 09:01 moritzm: rebooting ruthenium for update to Linux 4.4
  • 08:40 hashar: gallium: deleting old log files /var/log/zuul/gearman-server-debug.log*
  • 08:35 hashar: gallium: restarting Zuul to apply logging configuration change https://gerrit.wikimedia.org/r/#/c/291913/
  • 08:18 moritzm: rebooting terbium for kernel security update
  • 08:12 moritzm: rearming keyholder on tin
  • 08:08 moritzm: rebooting tin for kernel security update
  • 08:08 logmsgbot: oblivian@palladium conftool action : set/pooled=no; selector: name=mw1261.eqiad.wmnet
  • 07:15 _joe_: removing 20 gb logfile from terbium, only useless debug info
  • 02:37 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jul 8 02:37:29 UTC 2016 (duration 6m 22s)
  • 02:31 logmsgbot: dereckson@tin scap sync-l10n completed (1.28.0-wmf.9) (duration: 07m 47s)
  • 02:12 Dereckson: scap pull on terbium (was out of disk space during previous full scap)
  • 02:00 logmsgbot: l10nupdate@tin LocalisationUpdate failed: git pull of extensions failed
  • 01:54 mutante: terbium ran out of disk, deleted rotated nutcracker log
  • 01:48 logmsgbot: dereckson@tin Finished scap: Flow 297914, Echo 297919 297934, ORES 297916 (duration: 27m 48s)
  • 01:26 ostriches: gerrit: readded robots.txt to ytterbium for now
  • 01:20 logmsgbot: dereckson@tin Started scap: Flow 297914, Echo 297919 297934, ORES 297916
  • 00:42 awight: roll back paymentswiki further, to 2fc573cbb94e833c4144aa9dad79de8ec374bb09
  • 00:39 awight: roll back paymentswiki from f54ffb4fad0dc18079a813fbe25813dba36c64aa to c33ddfccf945bd075f0abff9e9de8c09f0174f89
  • 00:27 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.9/includes/api/ApiParse.php: API: Generate head items in the context of the given title (T139565) (duration: 00m 30s)
  • 00:01 awight: CentralNotice campaigns reenabled
  • 00:00 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Enable ORES review tool as a beta feature in ptwiki (T139692) (duration: 00m 28s)

2016-07-07

  • 23:57 Dereckson: Ran extensions/ORES/maintenance/CheckModelVersions.php and extensions/ORES/maintenance/PopulateDatabase.php on ptwiki (T139692)
  • 23:55 awight: Update paymentswiki from c33ddfccf945bd075f0abff9e9de8c09f0174f89 to f54ffb4fad0dc18079a813fbe25813dba36c64aa
  • 23:50 Dereckson: Created table ores_classification on ptwiki from php-1.28.0-wmf.9/extensions/ORES/sql/ores_classification.sql
  • 23:48 Dereckson: Created table ores_model on ptwiki from php-1.28.0-wmf.9/extensions/ORES/sql/ores_model.sql
  • 23:36 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Remove unused deprecated $wgStyleSheetPath (Gerrit:297511) (duration: 00m 27s)
  • 23:23 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: VisualEditor: Move cite out of primary toolbar except on WP/WB/WV (Gerrit:296573) (duration: 00m 30s)
  • 22:33 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.9/extensions/VisualEditor/: https://gerrit.wikimedia.org/r/#/c/297795/ and https://gerrit.wikimedia.org/r/#/c/297908/ (duration: 00m 45s)
  • 21:32 urandom: Bootstrapping restbase1009-c : T139362
  • 20:59 urandom: Fin : T126629
  • 20:52 urandom: Restarting Cassandra instance restbase2009-b.codfw.wmnet : T126629
  • 20:50 urandom: Restarting Cassandra instance restbase2009-a.codfw.wmnet : T126629
  • 20:49 urandom: Upgrading Cassandra to 2.2.6-wmf1 on restbase2009.codfw.wmnet : T126629
  • 20:48 urandom: Cassandra upgrade of restbase2006.codfw.wmnet instances complete : T126629
  • 20:45 urandom: Restarting RESTBase on restbase2001.codfw.wmnet
  • 20:41 urandom: Restarting Cassandra instance restbase2006-b.codfw.wmnet : T126629
  • 20:38 urandom: Restarting Cassandra instance restbase2006-a.codfw.wmnet : T126629
  • 20:36 urandom: Upgrading Cassandra to 2.2.6-wmf1 on restbase2006.codfw.wmnet : T126629
  • 20:33 urandom: Cassandra upgrade of restbase2005.codfw.wmnet instances complete : T126629
  • 20:30 urandom: Restarting Cassandra instance restbase2005-b.codfw.wmnet : T126629
  • 20:27 urandom: Restarting Cassandra instance restbase2005-a.codfw.wmnet : T126629
  • 20:27 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.9
  • 20:26 twentyafterfour: deploying wmf.9 to all wikis refs T138555
  • 20:26 urandom: Upgrading Cassandra to 2.2.6-wmf1 on restbase2005.codfw.wmnet : T126629
  • 20:15 gehel: logstash upgrade aborted, rescheduled to Monday July 11th
  • 20:10 bd808: Restarted logstash on logstash1001; hoping to clear up missing de-dot errors
  • 20:07 bd808: Restarted logstash on logstash1002; hoping to clear up missing de-dot errors
  • 20:07 urandom: Cassandra upgrade of restbase2008.codfw.wmnet instances complete : T126629
  • 20:04 urandom: Restarting Cassandra instance restbase2008-b.codfw.wmnet : T126629
  • 20:02 urandom: Restarting Cassandra instance restbase2008-a.codfw.wmnet : T126629
  • 20:02 logmsgbot: mattflaschen@tin Synchronized php-1.28.0-wmf.9/extensions/Echo: Fixes for notification sorting and message parsing (duration: 00m 38s)
  • 20:01 urandom: Upgrading Cassandra to 2.2.6-wmf1 on restbase2008.codfw.wmnet : T126629
  • 20:01 urandom: Cassandra upgrade of restbase2004.codfw.wmnet instances complete : T126629
  • 19:59 logmsgbot: mattflaschen@tin Synchronized php-1.28.0-wmf.9/extensions/Flow/includes/Notifications/PostReplyPresentationModel.php: flow-post-reply: show compact header on one line (duration: 00m 32s)
  • 19:59 urandom: Restarting Cassandra instance restbase2004-b.codfw.wmnet : T126629
  • 19:57 bd808: Restarted logstash on logstash1003; hoping to clear up missing de-dot errors
  • 19:55 urandom: Restarting Cassandra instance restbase2004-a.codfw.wmnet : T126629
  • 19:55 urandom: Upgrading Cassandra to 2.2.6-wmf1 on restbase2004.codfw.wmnet : T126629
  • 19:53 urandom: Cassandra upgrade of restbase2003.codfw.wmnet instances complete : T126629
  • 19:50 urandom: Restarting Cassandra instance restbase2003-b.codfw.wmnet : T126629
  • 19:46 urandom: Restarting Cassandra instance restbase2003-a.codfw.wmnet : T126629
  • 19:45 urandom: Upgrading Cassandra to 2.2.6-wmf1 on restbase2003.codfw.wmnet : T126629
  • 19:32 urandom: Upgrade of restbase2007.codfw.wmnet instances complete : T126629
  • 19:29 urandom: Restarting Cassandra instance restbase2007-c.codfw.wmnet : T126629
  • 19:27 urandom: Restarting Cassandra instance restbase2007-b.codfw.wmnet : T126629
  • 19:24 urandom: Restarting Cassandra instance restbase2007-a.codfw.wmnet : T126629
  • 19:22 urandom: Upgrading Cassandra to 2.2.6-wmf1 on restbase2007.codfw.wmnet : T126629
  • 19:19 urandom: Upgrade of restbase2002.codfw.wmnet instances complete : T126629
  • 19:16 urandom: Restarting Cassandra instance restbase2002-c.codfw.wmnet : T126629
  • 19:13 urandom: Restarting Cassandra instance restbase2002-b.codfw.wmnet : T126629
  • 19:12 ostriches: gerrit: force all users to log out. sorry ❤️
  • 19:11 urandom: Restarting Cassandra instance restbase2002-a.codfw.wmnet : T126629
  • 19:09 urandom: Upgrading Cassandra to 2.2.6-wmf1 on restbase2002.codfw.wmnet : T126629
  • 19:08 urandom: Cassandra upgrade of restbase2001.codfw.wmnet instances complete : T126629
  • 19:05 urandom: Restarting Cassandra instance restbase2001-c.codfw.wmnet : T126629
  • 19:02 urandom: Restarting Cassandra instance restbase2001-b.codfw.wmnet : T126629
  • 18:56 urandom: Restarting Cassandra instance restbase2001-a.codfw.wmnet : T126629
  • 18:53 urandom: Upgrading Cassandra to 2.2.6-wmf1 on restbase2001.codfw.wmnet : T126629
  • 18:27 urandom: Disabling Puppet on RESTBase codfw nodes : T126629
  • 18:23 awight: Update SmashPig from 917138e159f0341e3dfbb35818c3ce479927875b to e6aa6fe6fdcaab8e961a8b0668cc742d4c443c46
  • 18:15 urandom: Cassandra 2.2.6 upgrade of restbase1015.eqiad.wmnet instances complete : T126629
  • 18:11 urandom: Restarting Cassandra instance restbase1015-b.eqiad.wmnet : T126629
  • 18:08 urandom: Restarting Cassandra instance restbase1015-a.eqiad.wmnet : T126629
  • 18:07 urandom: Upgrading Cassandra to 2.2.6-wmf1 on restbase1015.eqiad.wmnet : T126629
  • 17:12 ostriches: gerrit: flush all caches to pick up account disable & rename
  • 16:55 urandom: Restarting Cassandra instance restbase1014-c.eqiad.wmnet : T126629
  • 16:54 awight: update paymentswiki from 2fc573cbb94e833c4144aa9dad79de8ec374bb09 to c33ddfccf945bd075f0abff9e9de8c09f0174f89
  • 16:48 urandom: Restarting Cassandra instance restbase1014-b.eqiad.wmnet : T126629
  • 16:47 jynus: stopping pc2006 for hardware maintenance T139283
  • 16:39 urandom: Restarting Cassandra instance restbase1014-a.eqiad.wmnet : T126629
  • 16:38 urandom: Upgrading Cassandra to 2.2.6-wmf1 on restbase1014.eqiad.wmnet : T126629
  • 16:29 urandom: Disabling Puppet on restbase101[4-5].eqiad.wmnet : T126629
  • 16:24 gehel: starting elasticsearch and kibana upgrade on logstash cluster (T136001)
  • 16:21 awight: Taking Fundraising campaigns down for maintenance
  • 15:51 elukey: add mw1261 back into service
  • 15:44 bd808: Dropped logstash indices older than logstash-2016.07.01 in preparation for elasticsearch upgrade
  • 15:25 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.9/extensions/Echo/modules/controller/mw.echo.Controller.js: SWAT: Correct section (alert/message/all) (duration: 00m 25s)
  • 15:14 _joe_: uploaded new HHVM package for jessie
  • 15:11 logmsgbot: thcipriani@tin Synchronized dblists/clldefault.dblist: SWAT: Deploy Compact Language Links as default (Stage 4) (T136677) PART II (duration: 00m 34s)
  • 15:11 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy Compact Language Links as default (Stage 4) (T136677) PART I (duration: 00m 55s)
  • 14:43 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: aqs1001.eqiad.wmnet
  • 14:38 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: aqs1001.eqiad.wmnet
  • 14:37 elukey: depool aqs1001 for nodejs upgrade
  • 12:05 elukey: depooling mw1261 from service (T73487)
  • 11:34 jynus: breaking m3 replication on db1048 (depooled) to check icinga alert changes
  • 10:44 jynus: disabling all mysql lag alerts cross-fleet T122457
  • 10:29 akosiaris: reboot fermium.wikimedia.org hassium.eqiad.wmnet install1001.wikimedia.org krypton.eqiad.wmnet meitnerium.wikimedia.org mendelevium.eqiad.wmnet T134242
  • 10:22 akosiaris: reboot dubnium T134242
  • 10:22 akosiaris: reboot bromine T134242
  • 10:13 akosiaris: reboot bohrium T134242
  • 10:11 elukey: pooling mw1261 back to service with Apache mod-proxy-fcgi set to trace8 (T73487)
  • 10:07 akosiaris: reboot etherpad1001.eqiad.wmnet, kernel upgrade and qemu upgrade, T134242
  • 10:04 gehel: rolling restart of elasticsearch cluster eqiad completed (T138811)
  • 08:51 _joe_: removing all old servers from the appservers pool but the canaries (T139353)
  • 08:47 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Failover s4 recentchanges to db1056 (duration: 00m 38s)
  • 08:33 hoo: Updated Wikidata's property suggester with data from Monday's json dump and removed the external identifiers as a workaround for T132839
  • 07:56 gehel: rolling restart of elasticsearch cluster codfw completed (T138811)
  • 07:16 legoktm: mysql:wikiadmin@db1041 [centralauth]> delete from localuser where lu_name ="Philippe" and lu_wiki ="scnwiki";
  • 04:37 eileen: Update CiviCRM from dd24368a897fd78752178ee253e7a890dd57db41 to bdf2afd417b70332c9542fd3ee4f14cb4e6f93cc
  • 03:12 eileen: CiviCRM upgrade from 5f8f7c3236e6bf12c52deea07093fbca165ef4a7 to dd24368a897fd78752178ee253e7a890dd57db41
  • 03:08 chasemp: silence labvirt1011 flapping for 24h, we have a task, and we are attempting to move vms we can
  • 02:48 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jul 7 02:48:55 UTC 2016 (duration 6m 44s)
  • 02:42 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.9) (duration: 07m 07s)
  • 02:26 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.8) (duration: 09m 58s)
  • 02:14 chasemp: reboot labvirt1011
  • 01:53 chasemp: start nova-compute again as it seems to have been killed on labvirt1011 which is acting weird
  • 01:24 andrewbogott: rebooting labvirt1011 (because it is acting crazy)
  • 00:50 eileen: updating CiviCRM from 54e168db2fddc6a9a07036323e01a27dd64333cf to 5f8f7c3236e6bf12c52deea07093fbca165ef4a7
  • 00:23 eileen: Updated CiviCRM from bb9bf136dc0fa82d5d07ebeb33d696e54672b2d6 to 54e168db2fddc6a9a07036323e01a27dd64333cf
  • 00:06 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.8/extensions/CentralAuth/: Make LocalRename jobs run sequentially - T137973 (for real this time) (duration: 00m 30s)
  • 00:05 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.9/extensions/CentralAuth/: Make LocalRename jobs run sequentially - T137973 (duration: 00m 30s)
  • 00:03 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.8/extensions/CentralAuth/: Make LocalRename jobs run sequentially - T137973 (duration: 00m 34s)

2016-07-06

  • 23:56 legoktm: created pageassessments tables on testwiki
  • 23:52 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.9/includes/specials/SpecialContributions.php: Add mediawiki.special.changeslist to SpecialContributions - T139522 (duration: 00m 25s)
  • 23:50 legoktm: running extensions/ORES/maintenance/CheckModelVersions.php and extensions/ORES/maintenance/PopulateDatabase.php on ruwiki
  • 23:49 logmsgbot: legoktm@tin Synchronized wmf-config/InitialiseSettings.php: Enable ORES review tool as a beta feature in ruwiki - T139541 (duration: 00m 27s)
  • 23:47 legoktm: created ores_* tables on ruwiki
  • 23:41 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.9/extensions/Echo/: T139321, T139323 (duration: 00m 32s)
  • 23:33 logmsgbot: legoktm@tin Synchronized wmf-config: VisualEditor: Move the citation button out of the primary toolbar on Wikivoyages - T133725 (2/2) (duration: 00m 30s)
  • 23:32 logmsgbot: legoktm@tin Synchronized wmf-config/InitialiseSettings.php: VisualEditor: Move the citation button out of the primary toolbar on Wikivoyages - T133725 (1/2) (duration: 00m 26s)
  • 23:30 logmsgbot: legoktm@tin Synchronized wmf-config: touch (duration: 00m 32s)
  • 23:25 logmsgbot: legoktm@tin Synchronized wmf-config: Test PageAssessments extension on test.wikipedia.org - T137918 (duration: 00m 36s)
  • 22:18 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.9/includes/diff: T139526 (duration: 00m 38s)
  • 20:44 urandom: Restarting Cassandra instance restbase1008-b.eqiad.wmnet : T126629
  • 20:41 bearND: deployed mobileapps 7a73789
  • 20:38 bearND: starting mobileapps deploy
  • 20:38 urandom: Restarting Cassandra instance restbase1008-a.eqiad.wmnet : T126629
  • 20:23 urandom: Disable Puppet on restbase1009.eqiad.wmnet : T126629
  • 20:13 urandom: Upgrade of restbase1013.eqiad.wmnet instances complete : T126629
  • 20:06 urandom: Restarting Cassandra instance restbase1013-b.eqiad.wmnet : T126629
  • 20:00 urandom: Restarting Cassandra instance restbase1013-a.eqiad.wmnet : T126629
  • 19:58 twentyafterfour: deployed 1.28.0-wmf.9 to group1 wikis: T138555
  • 19:57 urandom: Upgrading Cassandra package to 2.2.6-wmf1 on restbase1013.eqiad.wmnet : T126629
  • 19:57 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.9
  • 19:30 urandom: Upgrade of restbase1012.eqiad.wmnet instances complete : T126629
  • 19:24 urandom: Restarting Cassandra instance restbase1012-c.eqiad.wmnet : T126629
  • 19:18 urandom: Restarting Cassandra instance restbase1012-b.eqiad.wmnet : T126629
  • 19:09 urandom: Restarting Cassandra instance restbase1012-a.eqiad.wmnet : T126629
  • 19:07 urandom: Upgrading Cassandra package to 2.2.6-wmf1 on restbase1012.eqiad.wmnet : T126629
  • 18:57 mutante: mw1133 puppet fail because out of memory, stop hhvm, run puppet
  • 18:55 urandom: Cassandra upgrade of restbase1008.eqiad.wmnet instances complete : T126629
  • 18:54 mutante: labstore1001 - failed backup
  • 18:54 mutante: mw1261 syntax error in Apache config
  • 18:52 urandom: Restarting Cassandra instance restbase1008-c : T126629
  • 18:51 mutante: labstore2001 - RAID failure in Icinga (is it T102626 ?)
  • 18:47 urandom: Restarting Cassandra instance restbase1008-b : T126629
  • 18:46 mutante: mw1261 restart hhvm service
  • 18:42 urandom: Restarting Cassandra instance restbase1008-a : T126629
  • 18:40 urandom: Upgrading Cassandra package to 2.2.6-wmf1 on restbase1008 : T126629
  • 17:14 urandom: Disabling Puppet on restbase{1008,1012,1013}.eqiad.wmnet in preparation for rack 'b' Cassandra upgrade : T126629
  • 16:54 urandom: Upgrade of restbase1011.eqiad.wmnet instances to Cassandra 2.2.6 complete : T126629
  • 16:50 urandom: Restarting Cassandra for restbase1011-c.eqiad.wmnet : T126629
  • 16:45 urandom: Restarting Cassandra for restbase1011-b.eqiad.wmnet : T126629
  • 16:41 urandom: Restarting Cassandra for restbase1011-a.eqiad.wmnet : T126629
  • 16:39 urandom: Upgrading Cassandra package to 2.2.6-wmf1 on restbase1011.eqiad.wmnet : T126629
  • 16:26 awight: update orphan rectifier from 2fc573cbb94e833c4144aa9dad79de8ec374bb09 to 70a7baa9f77c2510739bab0ff9d1b51578a59a6e
  • 16:25 urandom: Upgrade of restbase1010.eqiad.wmnet instances complete : T126629
  • 16:25 awight: update orphan rectifier config to add payments 4 to the Redis pool
  • 16:22 urandom: Restarting Cassandra for restbase1010-c.eqiad.wmnet : T126629
  • 15:59 urandom: Restarting Cassandra for restbase1010-b.eqiad.wmnet : T126629
  • 15:47 urandom: Restarting Cassandra for restbase1010-a.eqiad.wmnet : T126629
  • 15:45 urandom: Re-enabling Puppet on restbase1010 : T126629
  • 15:44 urandom: Upgrading Cassandra to 2.2.6-wmf1 on restbase1010 : T126629
  • 15:36 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.9/extensions/Echo/modules/styles: SWAT: Set width to Special:Notifications (T138433) (duration: 00m 30s)
  • 15:29 elukey: restarting the hdfs datanode on each analytics* Hadoop server to force the new -Xmx2048 heap setting to be picked up
  • 15:27 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Echo transition flags everywhere (duration: 00m 26s)
  • 15:22 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable autopatrolled user group at urwiki (T139302) (duration: 00m 30s)
  • 15:18 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.9/includes/diff/DifferenceEngine.php: Show parser output for diffs unless extension aborts (T139433) (duration: 00m 30s)
  • 15:08 urandom: Disabling puppet on restbase101[0-1].eqiad.wmnet in preparation for 2.2.6 upgrade : T126629
  • 15:02 akosiaris: poweroff prometheus2002 for drbd -> plain conversion
  • 15:00 _joe_: depooling mw1261, installing an apache package with additional fixes (T73487)
  • 14:19 moritzm: rebooting californium for kernel update (hosting horizon.wikimedia.org)
  • 14:02 moritzm: rebooting silver (hosting wikitech)
  • 13:48 gehel: re-enabling puppet on ^(aqs|restbase).* after confirming that Cassandra puppet module change is a noop
  • 13:38 gehel: disabling puppet on ^(aqs|restbase).* before merging changes to Cassandra puppet module
  • 13:35 elukey: depooling mw1261.eqiad to restore previous fcgi logging settings (T73487)
  • 13:17 dcausse: restarting elastic master node (elastic1040)
  • 13:15 dcausse: truncating elastic main logs on elastic1040 and elastic1034
  • 12:19 mobrovac: restbase deploy end of fa4699a
  • 12:09 mobrovac: restbase deploy start of fa4699a
  • 11:54 elukey: depooling mw1261.eqiad.wmnet to raise Apache's mod-fcgi to trace8 for 503 investigation - T73487 (this will probably slow down the host a bit)
  • 11:15 jynus: shutting down db1048 in preparation for upgrade
  • 10:14 moritzm: upgrading restbase cluster in codfw for nodejs 4.4.6
  • 09:39 _joe_: restarting hhvm on mw1236,mw1215 to test for possible TC cache corruption
  • 09:31 moritzm: installing tomcat security updates on Ubuntu systems (jessie already fixed)
  • 09:13 logmsgbot: elukey@palladium conftool action : set/pooled=yes:weight=25; selector: mw1285.eqiad.wmnet
  • 09:10 logmsgbot: elukey@palladium conftool action : set/pooled=yes:weight=25; selector: mw1284.eqiad.wmnet
  • 09:09 elukey: pooling into service the last batch of new API appservers - mw1284->mw1290
  • 08:46 godog: lithium:~$ sudo lvextend --size +50G -t -r /dev/mapper/lithium--vg-syslog
  • 08:32 godog: powercycle ms-be2021, unreachable and nothing on console
  • 08:11 jynus: stopping replication on db1056 and performing alter table
  • 07:58 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Fail commons db servers back to their original configuration (duration: 00m 45s)
  • 07:57 moritzm: rolling reboot of sca clusters for kernel update
  • 07:51 moritzm: restarted hhvm on mw1148
  • 03:09 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jul 6 03:09:17 UTC 2016 (duration 6m 53s)
  • 03:02 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.9) (duration: 17m 48s)
  • 02:25 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.8) (duration: 09m 16s)
  • 00:23 logmsgbot: maxsem@tin Synchronized wmf-config: No-op (duration: 00m 37s)

2016-07-05

  • 23:57 MaxSem: ran ORES's CheckModelVersions.php and PopulateDatabase.php on nlwiki
  • 23:48 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/297462/ (duration: 00m 29s)
  • 23:42 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.8/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/297542/ (duration: 00m 33s)
  • 23:35 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/297542/ (duration: 00m 30s)
  • 23:33 MaxSem: created ORES tables on nlwiki
  • 23:26 chasemp: reboot of labvirt1011, which is being flaky; I cannot keep a connection to it
  • 23:22 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/296852/ (duration: 00m 53s)
  • 22:07 hasharAway: CI had issues booting instances since 19:50 UTC; it is operational again as of 21:30 UTC and is slowly processing the backlog.
  • 21:34 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: 1.28.0-wmf.9 to group0
  • 20:56 logmsgbot: twentyafterfour@tin Finished scap: Rebuild l10n cache and deploy 1.28.0-wmf.9 to testwiki (duration: 32m 16s)
  • 20:24 logmsgbot: twentyafterfour@tin Started scap: Rebuild l10n cache and deploy 1.28.0-wmf.9 to testwiki
  • 20:01 akosiaris: T107306 uploaded to apt.wikimedia.org jessie-wikimedia: lttoolbox_3.3.3~r68466-2+wmf1
  • 20:01 akosiaris: T107306 uploaded to apt.wikimedia.org jessie-wikimedia: hfst_3.10.0~r2798-1+wmf1
  • 20:01 akosiaris: T107306 uploaded to apt.wikimedia.org jessie-wikimedia: hfst-ospell_0.4.0~r4643-5+wmf1
  • 20:01 akosiaris: T107306 uploaded to apt.wikimedia.org jessie-wikimedia: foma_0.9.18+r243-1+wmf1
  • 20:01 akosiaris: T107306 uploaded to apt.wikimedia.org jessie-wikimedia: cg3_0.9.9~r11624-1+wmf1
  • 20:01 akosiaris: T107306 uploaded to apt.wikimedia.org jessie-wikimedia: apertium_3.4.2~r68466-2+wmf1
  • 20:01 akosiaris: T107306 uploaded to apt.wikimedia.org jessie-wikimedia: apertium-lex-tools_0.1.1~r66150-1+wmf1
  • 20:01 akosiaris: T107306 uploaded to apt.wikimedia.org jessie-wikimedia: apertium-apy_0.9.1~r343-1
  • 18:54 chasemp: add dpatrick to wmf-nda
  • 18:32 Danny_B: killed wb2-phab @ tools.wikibugs to perform some batch edits
  • 18:30 mutante: added siddharth11 to LDAP group "wmf" per T138369#2425011
  • 16:44 jynus: rebooting pc2006 T139283
  • 16:20 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: mw1024.eqiad.wmnet
  • 16:13 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: mw1024.eqiad.wmnet
  • 16:12 elukey: depooling mw1024 to restore regular fcgi logging settings
  • 16:09 urandom: Bootstrapping restbase1009-c.eqiad.wmnet : T139362
  • 15:49 akosiaris: reenable alerts from smokeping on codfw
  • 15:46 andrewbogott: rebooting labservices1002 to see if it survives the reboot better this time
  • 15:32 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: touch initialisesettings to clear logspam, hopefully (duration: 00m 27s)
  • 15:20 elukey: powercycled mw1140, memory saturated and not reachable via ssh/mgmt-console
  • 15:20 paravoid: mr1-codfw: "request system snapshot media internal slice alternate" + "request system reboot"
  • 15:08 yurik: deployed & restarted tilerator https://gerrit.wikimedia.org/r/#/c/297410/
  • 14:42 logmsgbot: elukey@palladium conftool action : set/pooled=yes:weight=25; selector: mw1283.eqiad.wmnet
  • 14:41 logmsgbot: elukey@palladium conftool action : set/pooled=yes:weight=25; selector: mw1282.eqiad.wmnet
  • 14:37 logmsgbot: elukey@palladium conftool action : set/pooled=yes:weight=25; selector: mw1281.eqiad.wmnet
  • 14:26 urandom: Issuing `nodetool cleanup' for restbase1014-b.eqiad.wmnet
  • 14:12 logmsgbot: elukey@palladium conftool action : set/pooled=yes:weight=25; selector: mw1280.eqiad.wmnet
  • 14:09 logmsgbot: elukey@palladium conftool action : set/pooled=yes:weight=25; selector: mw1279.eqiad.wmnet
  • 14:07 logmsgbot: oblivian@palladium conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=appserver,service=apache2,name=mw1217.*
  • 14:07 elukey: Pooling first batch of new eqiad api-servers - mw1279->mw1283
  • 14:03 _joe_: depooling permanently mw1091-13 from the appservers pool in eqiad
  • 13:58 logmsgbot: elukey@palladium conftool action : set/weight=20; selector: mw2245.codfw.wmnet
  • 13:58 logmsgbot: elukey@palladium conftool action : set/weight=20; selector: mw2244.codfw.wmnet
  • 13:58 logmsgbot: elukey@palladium conftool action : set/weight=20; selector: mw2243.codfw.wmnet
  • 13:58 logmsgbot: elukey@palladium conftool action : set/weight=20; selector: mw2242.codfw.wmnet
  • 13:57 logmsgbot: elukey@palladium conftool action : set/weight=20; selector: mw2241.codfw.wmnet
  • 13:57 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: mw2245.codfw.wmnet
  • 13:57 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: mw2244.codfw.wmnet
  • 13:57 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: mw2243.codfw.wmnet
  • 13:56 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: mw2242.codfw.wmnet
  • 13:56 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: mw2241.codfw.wmnet
  • 13:56 elukey: pooling new codfw appservers - mw224[12345]
  • 12:32 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: mw1024.eqiad.wmnet
  • 12:12 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: mw1024.eqiad.wmnet
  • 12:11 elukey: depooling/re-pooling mw1024.eqiad.wmnet to temporarily set up trace8 logging (503 investigation - T73487)
  • 12:08 jynus: running schema change on db1019 T73563
  • 11:15 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Failover all commons special roles to db1081 (duration: 00m 24s)
  • 11:00 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Failover commons recentchanges (duration: 00m 36s)
  • 10:45 jynus: SET GLOBAL read_only=0; on db1040, our new s4-master
  • 10:38 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Failover commons master to db1040 (duration: 00m 32s)
  • 10:23 jynus: archiving m3-master phlegal* databases before dropping them
  • 10:21 mobrovac: restbase staging started a no-op dump on cerium to test restbase on node 4.4.6
  • 10:05 logmsgbot: elukey@palladium conftool action : set/weight=30; selector: mw1275.eqiad.wmnet
  • 10:05 logmsgbot: elukey@palladium conftool action : set/weight=30; selector: mw1274.eqiad.wmnet
  • 10:05 logmsgbot: elukey@palladium conftool action : set/weight=30; selector: mw1273.eqiad.wmnet
  • 09:59 logmsgbot: elukey@palladium conftool action : set/weight=30; selector: mw1272.eqiad.wmnet
  • 09:31 _joe_: shutting down mw1009-16 for decommissioning
  • 09:06 _joe_: decommissioning mw1009-16
  • 08:38 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: mw1275.eqiad.wmnet
  • 08:36 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: mw1274.eqiad.wmnet
  • 08:32 gehel: deleting enwikisource_titlesuggest on elasticsearch codfw (index creation issue during cluster restart)
  • 08:31 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: mw1273.eqiad.wmnet
  • 08:24 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: mw1272.eqiad.wmnet
  • 08:21 elukey: adding and pooling new appservers - mw127[2345].eqiad
  • 08:07 godog: swift codfw-prod: ms-be202[567] weight 1500
  • 07:55 jynus: dropping etherpad_restore2 database from m1 T138516
  • 07:40 akosiaris: T138516 forcing a puppet run on cache::misc hosts after merging https://gerrit.wikimedia.org/r/297352
  • 07:29 akosiaris: T138516 stop the secondary etherpad instance on etherpad1001. etherpad-restore.wikimedia.org has served its purpose, killing it
  • 02:44 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jul 5 02:44:09 UTC 2016 (duration 6m 12s)
  • 02:38 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.8) (duration: 17m 13s)

2016-07-04

  • 20:28 jynus: removing /tmp/joal/sstables on all analytics10* hosts
  • 20:22 jynus: deleted 21GB worth of temporary files from analytics1050
  • 19:58 logmsgbot: aaron@tin Synchronized wmf-config/filebackend-production.php: Increase redis lockmanager timeout to 2 (duration: 00m 31s)
  • 19:57 logmsgbot: legoktm@tin Synchronized php-1.28.0-wmf.8/extensions/MassMessage/: MassMessage is no longer accepting lists in the MassMessageList content model - T139303 (duration: 00m 39s)
  • 17:37 jynus: testing slave_parallel_threads=5 on db1073
  • 14:27 moritzm: rebooting lithium for kernel update
  • 14:22 moritzm: installing tomcat7/ libservlet3.0-java security update on the kafka brokers
  • 14:06 _joe_: shutting down mw1001-1008 for decommissioning
  • 14:03 gehel: rolling restart of elasticsearch codfw/eqiad for kernel upgrade (T138811)
  • 13:47 _joe_: stopping jobrunner on mw1011-16 as well, before decommissioning
  • 13:46 moritzm: depooling mw1153-mw1160 (trusty image scalers), replaced by mw1291-mw1298 (jessie image scalers)
  • 13:44 godog: ack all mr1-codfw related alerts in librenms
  • 13:43 akosiaris: restart smokeping on netmon1001, temporarily disabled msw1-codfw
  • 13:38 gehel: resuming writes on Cirrus / elasticsearch, this did not speedup cluster recovery
  • 13:18 godog: bounce redis on rcs1001
  • 13:16 gehel: restarting elastic1021 for kernel upgrade (T138811)
  • 13:07 elukey: Bootstrapping Cassandra again on aqs100[456] (rack awareness + 2.2.6 - testing environment)
  • 13:02 gehel: pausing writes on Cirrus / elasticsearch for faster cluster restart
  • 12:43 hashar: Nodepool back up with 10 instances (instead of 20) to accommodate labs capacity T139285
  • 12:39 godog: nodetool-b stop -- COMPACTION on restbase1014
  • 12:29 moritzm: rolling reboot of rcs* cluster for kernel security update
  • 12:10 moritzm: rolling reboot of ocg* cluster for kernel security update
  • 11:40 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Failover db1053 to db1072 (duration: 00m 40s)
  • 10:56 moritzm: rolling reboot of swift frontends in eqiad for kernel security update
  • 10:30 yuvipanda: stop nodepool on labnodepool1001 and disable puppet to keep it down, to allow stabilizing labs first
  • 10:28 yuvipanda: restart rabbitmq-server on labcontrol1001
  • 10:14 moritzm: installing chromium security update on osmium
  • 10:07 moritzm: installing xerces-c security updates on Ubuntu systems (jessie already fixed)
  • 10:01 _joe_: stopping jobchron and jobrunner on mw1001-10 before decommission
  • 09:50 godog: reimage ms-be300[234] with jessie
  • 09:44 hashar: Labs infra can't delete instances anymore (impacts CI as well) T139285
  • 09:41 moritzm: installing p7zip security updates
  • 09:38 hashar: CI is out of Nodepool instances; the pool has drained because instances can no longer be deleted over the OpenStack API
  • 09:25 elukey: Added new jobrunners in service - mw130[256].eqiad.wmnet (https://etherpad.wikimedia.org/p/jessie-install)
  • 08:16 moritzm: rolling reboot of swift backends in eqiad for kernel security update
  • 07:49 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Failover db1034 to db1062 (duration: 00m 30s)
  • 02:26 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jul 4 02:26:54 UTC 2016 (duration 5m 42s)
  • 02:21 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.8) (duration: 09m 14s)

2016-07-03

  • 19:27 Reedy: Ran namespaceDupes --fix on gomwiki
  • 14:59 yuvipanda: restart nova-compute process on labvirt1010
  • 14:59 yuvipanda: restart nova-compute process on labvirt10101
  • 09:06 jynus: removing old logs from pc2004
  • 07:42 logmsgbot: legoktm@tin Synchronized static/images/project-logos/: Put high-res enwiktionary logos in the right place - T139255 (duration: 00m 38s)
  • 02:27 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jul 3 02:27:13 UTC 2016 (duration 5m 38s)
  • 02:21 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.8) (duration: 09m 13s)

2016-07-02

  • 19:15 twentyafterfour: Deployed hotfix to phabricator. Restarted apache2 on iridium
  • 02:29 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jul 2 02:29:17 UTC 2016 (duration 5m 40s)
  • 02:23 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.8) (duration: 08m 52s)

2016-07-01

  • 22:23 logmsgbot: krinkle@tin Synchronized php-1.28.0-wmf.8/extensions/WikimediaEvents/extension.json: T128115 (duration: 00m 37s)
  • 22:22 logmsgbot: krinkle@tin Synchronized php-1.28.0-wmf.8/extensions/WikimediaEvents/modules/: T128115 (duration: 00m 30s)
  • 21:04 logmsgbot: ori@tin Synchronized wmf-config/CommonSettings.php: I7a95c0f4: Bump $wgResourceLoaderMaxQueryLength to 5,000 (duration: 00m 32s)
  • 20:08 logmsgbot: ori@tin Synchronized wmf-config/CommonSettings.php: I6eb0ae67: Bump $wgResourceLoaderMaxQueryLength to 4,000 (duration: 00m 26s)
  • 19:17 ori: restarted coal on graphite1001; stopped receiving messages from EL 0mq publisher
  • 19:16 ori: restarted navtiming on hafnium; stopped receiving messages from EL 0mq publisher
  • 18:34 mutante: mw1259 - powercycling
  • 18:32 logmsgbot: krinkle@tin Synchronized docroot/default/: (no message) (duration: 00m 31s)
  • 18:31 logmsgbot: krinkle@tin Synchronized errorpages/: (no message) (duration: 01m 06s)
  • 17:47 ebernhardson: restart elasticsearch on elastic1017 to attempt to clear up a continuous backlog of relocating shards
  • 15:53 godog: temporarily run 3x statsdlb instances on graphite1001 to minimise drops - T101141
  • 14:57 dcausse: upgraded and restarted elastic on nobelium@eqiad
  • 14:23 godog: enable another statsdlb instance temporarily on graphite1001 to investigate drops
  • 14:15 moritzm: rearmed keyholder on mira after reboot
  • 13:56 moritzm: rebooting codfw poolcounters for kernel update
  • 13:47 moritzm: rebooting osmium for kernel update
  • 13:28 cmjohnson1: mw1145 swapped eth0 cable
  • 13:04 moritzm: rebooting mira for kernel update
  • 12:59 moritzm: rebooting francium for kernel update
  • 11:23 godog: bounce statsdlb on graphite1001, drops are back after yesterday's reboot T101141
  • 11:15 moritzm: removed two obsolete, older kernel packages from wtp1002 (had flagged an icinga warning on diskspace on /boot)
  • 09:38 elukey: rebooted eventlog2001.codfw.wmnet for kernel upgrades
  • 09:35 moritzm: rolling reboot of swift backends in codfw
  • 09:15 moritzm: powercycling ms-fe2003, stuck after reboot
  • 09:03 moritzm: powercycling ms-fe2002, stuck after reboot
  • 08:41 moritzm: powercycling ms-fe2001, stuck after reboot
  • 08:32 moritzm: rolling reboot of swift frontends in codfw
  • 06:28 moritzm: resuming rolling reboots of elastic* clusters in eqiad and codfw
  • 06:18 moritzm: rolling reboot of wtp1* for kernel security update
  • 02:44 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jul 1 02:44:01 UTC 2016 (duration 5m 7s)
  • 02:38 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.8) (duration: 17m 02s)
  • 01:59 logmsgbot: ori@tin Synchronized wmf-config/CommonSettings.php: I3a8057a8: Bump $wgResourceLoaderMaxQueryLength to 3,000 (duration: 00m 28s)
  • 01:46 logmsgbot: aaron@tin Synchronized wmf-config/InitialiseSettings.php: Enable LocalFile log (duration: 00m 32s)
  • 01:08 logmsgbot: ori@tin Synchronized wmf-config/InitialiseSettings.php: Ie8a71af5: HD logo for en.wiktionary (2/2) (duration: 00m 27s)
  • 01:07 logmsgbot: ori@tin Synchronized static/images/project-logos: Ie8a71af5: HD logo for en.wiktionary (1/2) (duration: 00m 28s)
  • 01:04 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.8/extensions/WikimediaEvents/modules/ext.wikimediaEvents.deprecate.js: Ie28d823c: Log ResourceLoader URL-splitting (duration: 00m 32s)
  • 00:20 logmsgbot: maxsem@tin Finished scap: https://gerrit.wikimedia.org/r/#/c/296819/ - noop in prod (duration: 27m 27s)

2016-06-30

  • 23:53 logmsgbot: maxsem@tin Started scap: https://gerrit.wikimedia.org/r/#/c/296819/ - noop in prod
  • 23:47 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/296257/ (duration: 00m 26s)
  • 23:43 logmsgbot: maxsem@tin Synchronized portals: (no message) (duration: 00m 27s)
  • 23:43 logmsgbot: maxsem@tin Synchronized portals/prod/wikipedia.org/assets: (no message) (duration: 00m 28s)
  • 23:41 ori: banned /static/images/project-logos/enwiktionary.png and /static/images/project-logos/adywiki.png
  • 23:37 logmsgbot: maxsem@tin Synchronized docroot/mediawiki/xml/sitelist-1.0/index.html: https://gerrit.wikimedia.org/r/#/c/296788/ (duration: 00m 24s)
  • 23:27 logmsgbot: maxsem@tin Synchronized static/images/project-logos/: (no message) (duration: 00m 27s)
  • 23:12 logmsgbot: maxsem@tin Synchronized static/images/project-logos/enwiktionary.png: https://gerrit.wikimedia.org/r/#/c/296757/ (duration: 00m 30s)
  • 22:04 logmsgbot: krinkle@tin Synchronized wmf-config/InitialiseSettings.php: test2wiki wgSquidMaxage (duration: 00m 28s)
  • 21:00 ebernhardson: change cluster.routing.allocation.disk.watermark.high on eqiad elasticsearch cluster to 80%
  • 20:52 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.8/includes/filerepo/file/LocalFile.php: 51d7fb48f2af31d69db115a8b3ed790cdaaf0d2e (duration: 00m 35s)
  • 19:47 logmsgbot: aaron@tin Synchronized wmf-config/InitialiseSettings.php: Set the SaveParse log (duration: 00m 26s)
  • 19:43 twentyafterfour: ran scap pull on mw2123
  • 19:42 twentyafterfour: ran scap pull on mw2098
  • 19:42 gehel: activating statement timeout limitations for kartotherian on maps cluster codfw (T138422)
  • 19:38 ostriches: mw2134: running sync-common, seems out of...sync :)
  • 19:20 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: (no message)
  • 19:19 twentyafterfour: Deploying 1.28.0-wmf.8 to all wikimedia wikis.
  • 19:18 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.8/includes: adc4c90202d6c44aa58756e3c6bc35918afc5f75 (duration: 01m 19s)
  • 18:24 ori: restarted coal on graphite1001 and navtiming on hafnium due to inexplicably stopped metrics; nothing useful in logs.
  • 17:48 yurik: deployed Kartotherian https://gerrit.wikimedia.org/r/#/c/296787/
  • 17:10 yurik: deployed Graphoid https://gerrit.wikimedia.org/r/#/c/296780/
  • 17:08 jynus: stopping slave on db1073 to test InnoDB compression T139055
  • 16:44 dcausse: restarting elastic1036 (master in eqiad)
  • 16:23 logmsgbot: thcipriani@tin Synchronized wmf-config: SWAT:Revert "Use extension registration for TitleBlacklist" (duration: 00m 32s)
  • 16:22 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT:Revert "Use extension registration for TitleBlacklist" (duration: 00m 27s)
  • 16:17 dcausse: truncating elastic logs on elastic1036 and elastic1021
  • 16:16 logmsgbot: thcipriani@tin Synchronized wmf-config: SWAT: Use extension registration for TitleBlacklist (T119117) PART II (duration: 00m 36s)
  • 16:15 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Use extension registration for TitleBlacklist (T119117) PART I (duration: 00m 39s)
  • 15:53 logmsgbot: thcipriani@tin Synchronized wmf-config: SWAT: Use extension registration for LabeledSectionTransclusion (T119117) (duration: 00m 27s)
  • 15:48 logmsgbot: thcipriani@tin Synchronized wmf-config: SWAT: Short array syntax (duration: 00m 30s)
  • 15:39 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: SWAT: Put wikidatawiki back on 1.28.0-wmf.8
  • 15:36 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.7/extensions/Wikidata/extensions/Wikibase/repo/includes/WikibaseRepo.php: SWAT: Update Wikidata - Fix broken editing of statements (T138974) (duration: 00m 25s)
  • 15:35 akosiaris: restarted (actually puppet did) gerrit after merging 4 related changes
  • 15:34 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.8/extensions/Wikidata/extensions/Wikibase/repo/includes/WikibaseRepo.php: SWAT: Update Wikidata - Fix broken editing of statements (T138974) (duration: 00m 31s)
  • 15:27 logmsgbot: thcipriani@tin Synchronized dblists/clldefault.dblist: SWAT: Deploy Compact Language Links as default (Stage 3.5) (T136677) (duration: 00m 26s)
  • 15:18 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: SWAT: Delete old throttle rules (duration: 00m 25s)
  • 15:12 logmsgbot: thcipriani@tin Synchronized static/images/project-logos/enwiktionary.png: SWAT: Revert Change project logo for enwikt (T138801) (duration: 00m 25s)
  • 15:07 logmsgbot: thcipriani@tin Synchronized static/images/project-logos/enwiktionary.png: SWAT: Change project logo for enwikt (T138801) (duration: 00m 25s)
  • 15:04 moritzm: rolling reboot of wtp2 for kernel security update
  • 14:49 mdholloway: mobileapps finished deploying 43538aa
  • 14:46 moritzm: pooling four additional jessie-based image scalers (mw1295-mw1298)
  • 14:45 mdholloway: starting mobileapps deployment
  • 13:42 godog: swift codfw-prod: ms-be202[234] weight 3000
  • 13:10 jynus: upgrading and restarting analytics1003 mysql tables
  • 11:59 moritzm: pooling three additional jessie-based image scalers
  • 10:26 elukey: rebooting stat100[234] and analytics1003 for kernel upgrades
  • 10:20 moritzm: powercycling mw1016, stuck after reboot
  • 09:44 dcausse: truncating current logs again on elastic1045 and elastic1036
  • 09:37 moritzm: powercycling mw1011, stuck after reboot
  • 09:21 dcausse: truncating current logs on elastic1045 and elastic1036
  • 09:19 mobrovac: zotero deployed translators cde2f7531a4
  • 09:13 godog: reboot graphite2001 / graphite1001 to apply trusty kernel update
  • 09:13 dcausse: deleting old logs on elastic1045 and elastic1036
  • 06:57 moritzm: powercycling elastic1015, stuck after reboot
  • 06:37 moritzm: powercycling elastic1014, stuck after reboot
  • 06:26 moritzm: resuming rolling restarts of elasticsearch cluster in eqiad and codfw
  • 06:18 moritzm: rolling restart of mw1001-mw1016 for kernel security update
  • 03:00 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jun 30 03:00:28 UTC 2016 (duration 7m 10s)
  • 02:53 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.8) (duration: 07m 08s)
  • 02:39 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 17m 23s)
  • 01:33 twentyafterfour: starting phd with only 4 taskmasters to help lighten the load
  • 01:30 twentyafterfour: stopped phd on iridium to investigate large spike in sql insert volume
  • 01:18 mutante: iridium back up, on 3.13.0-91
  • 01:15 mutante: rebooting iridium
  • 00:39 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Don't use always true wmgUseXFFBlocks anymore (2/2) (duration: 00m 25s)
  • 00:38 twentyafterfour: Phabricator upgrade complete, service appears to be stable.
  • 00:32 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Don't use always true wmgUseXFFBlocks anymore (1/2) (duration: 00m 25s)
  • 00:30 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Don't use always true wmgUseXFFBlocks anymore (1/2) (duration: 00m 27s)
  • 00:27 twentyafterfour: Taking phabricator offline momentarily for scheduled update. Expect less than 5 minutes of downtime.
  • 00:25 logmsgbot: maxsem@tin Synchronized wmf-config/: Try again? (duration: 00m 29s)
  • 00:17 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Revert "Cleanup: Move never-altered GlobalBlockingBlockXFF into CommonSettings" (no-op) (duration: 00m 26s)
  • 00:15 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Revert "Cleanup: Move never-altered GlobalBlockingBlockXFF into CommonSettings" (duration: 00m 25s)
  • 00:10 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Clean-up for IS/CS (Gerrit:292615 to Gerrit:292618, no op, 2/2) (duration: 00m 29s)
  • 00:09 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Clean-up for IS/CS (Gerrit:292615 to Gerrit:292618, no op, 1/2) (duration: 00m 28s)
  • 00:08 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.8/extensions/TemplateSandbox: https://gerrit.wikimedia.org/r/#/c/296675/ (duration: 00m 30s)
  • 00:08 logmsgbot: dereckson@tin scap aborted: wmf-config/CommonSettings.php Clean-up for IS/CS (Gerrit:292615 to Gerrit:292618, no op, 1/2) (duration: 00m 20s)
  • 00:07 logmsgbot: dereckson@tin Started scap: wmf-config/CommonSettings.php Clean-up for IS/CS (Gerrit:292615 to Gerrit:292618, no op, 1/2)

2016-06-29

  • 23:21 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Revert "Enable Echo transition flags in production for testing" (duration: 00m 25s)
  • 22:50 logmsgbot: demon@tin Synchronized wmf-config/CommonSettings.php: extdist config for 1.27/1.25 (duration: 00m 31s)
  • 21:43 logmsgbot: krenair@tin Synchronized php-1.28.0-wmf.8/extensions/VisualEditor/ApiVisualEditor.php: https://gerrit.wikimedia.org/r/296661 - VE namespaces issue (duration: 00m 26s)
  • 21:41 chasemp: cleared phab 2fa for ebernhardson for lost phone
  • 21:34 jynus: removing /srv/backups/m2-otrs-* (transferred to es2001) to make space
  • 21:02 yurik: deployed Graphoid https://gerrit.wikimedia.org/r/#/c/296498/
  • 20:55 yurik: deployed Tilerator https://gerrit.wikimedia.org/r/#/c/296647/
  • 20:50 yurik: deployed Kartotherian https://gerrit.wikimedia.org/r/#/c/296646/
  • 20:21 bearND: mobileapps deployed 1da6bf0
  • 20:15 bearND: starting mobileapps deploy
  • 19:54 logmsgbot: maxsem@tin Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/296627/ (duration: 00m 31s)
  • 19:39 mutante: antimony - shutdown -h now (since it's gone from Icinga now)
  • 19:38 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: testwikidata back to wmf.8
  • 19:36 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: Roll back wikidata and testwikidata to 1.28.0-wmf.7 per request by @aude
  • 19:22 mutante: antimony puppetstoredconfigclean.rb to remove icinga monitor remnants
  • 19:14 ostriches: ytterbium: running puppet and reloading replication plugin
  • 19:13 mutante: antimony - stopping gitblit service
  • 19:07 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.8 T138555
  • 18:39 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.7/maintenance/dumpBackup.php: Deploy I94ca4a06 (duration: 00m 25s)
  • 18:39 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.7/maintenance/backup.inc: Deploy I94ca4a06 (duration: 00m 24s)
  • 18:37 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.7/includes/export/WikiExporter.php: Deploy I94ca4a06 (duration: 00m 25s)
  • 18:30 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.8/maintenance/dumpBackup.php: Deploy I94ca4a06 (duration: 00m 27s)
  • 18:29 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.8/maintenance/backup.inc: Deploy I94ca4a06 (duration: 00m 33s)
  • 18:28 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.8/includes/export/WikiExporter.php: Deploy I94ca4a06 (duration: 00m 34s)
  • 18:06 mutante: we stopped using gitblit. git.wikimedia.org URLs P3318 T137224
  • 18:05 mutante: git.wm.org URLs switched from gitblit to phab redirects
  • 17:48 ostriches: gerrit: flushed all caches to pick up rename, things may be slow for the next 15m or so
  • 15:49 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: SWAT: Throttling exemption for enwiki (T138167) (duration: 00m 25s)
  • 15:38 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.7/resources/Resources.php: SWAT: mediawiki.action.edit.stash: Restore dependency to "jquery.getAttrs" (T138931) (duration: 00m 26s)
  • 15:34 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.7/extensions/Flow/maintenance/FlowRemoveOldTopics.php: SWAT: Also delete topics that have more recent updates by (only) talk page manager (T119509) (duration: 00m 25s)
  • 15:29 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.7/extensions/Flow: SWAT: Do not reimport existing header (T119509) (duration: 00m 46s)
  • 15:22 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.7/extensions/Flow/maintenance/FlowRestoreLQT.php: SWAT: Script to restore LQT topics to their pre-import state (T119509) (duration: 00m 26s)
  • 15:11 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Echo transition flags in production for testing (duration: 00m 27s)
  • 15:09 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Add $wmgEchoTransition setting for Echo transition flags PART II (duration: 00m 26s)
  • 15:08 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add $wmgEchoTransition setting for Echo transition flags PART I (duration: 00m 50s)
  • 14:46 moritzm: powercycling elastic1012, stuck on reboot
  • 13:22 elukey: rebooting analytics1027 for kernel upgrades
  • 12:58 moritzm: rebooting dataset1001 for kernel update
  • 12:41 moritzm: continuing rolling restarts of elastic* in eqiad and codfw for kernel security update
  • 12:32 moritzm: powercycling elastic1010, stuck on reboot
  • 12:11 moritzm: powercycling mw1260, stuck on reboot
  • 11:38 jynus: halfway moving otrs backups from dbstore1001 to es2001
  • 11:27 gehel: powercycling elastic1009 - stuck in reboot
  • 11:11 moritzm: powercycling mw1223, stuck on reboot
  • 11:09 gehel: deleting broken dewiki_titlesuggest index from codfw (T138811)
  • 10:31 elukey: rebooting analytics100[12] (Hadoop Yarn/HDFS master and standby) - one at a time, forcing failover manually with daemon restarts
  • 09:55 moritzm: powercycling mw1163, stuck on reboot
  • 09:23 gehel: banning elastic1001 to 1016 from cluster to prepare their decommissioning (T138329)
  • 09:20 ema: upgrading diamond to 3.5-6 (T138758)
  • 09:01 elukey: rebooting analytics1028->1057 for kernel upgrades (Hadoop worker nodes)
  • 08:55 moritzm: powercycling mw1111, stuck on reboot
  • 08:44 elukey: puppet stopped on analytics1027 to prevent the Camus job from running (prep step for Hadoop kernel upgrades)
  • 08:40 moritzm: powercycling mw1108, stuck on reboot
  • 08:12 moritzm: powercycling mw1099, stuck on reboot
  • 08:12 moritzm: powercycling mw1097, stuck on reboot
  • 08:05 moritzm: powercycling mw1092, stuck on reboot
  • 07:47 moritzm: rolling reboot of appservers in eqiad for kernel security update
  • 07:16 moritzm: powercycling snapshot1002, reboot stuck
  • 07:11 moritzm: powercycling snapshot1001, reboot stuck
  • 06:58 moritzm: rebooting most snapshot hosts for kernel security update
  • 03:28 logmsgbot: krinkle@tin Synchronized wmf-config/InitialiseSettings.php: test2wiki (duration: 00m 33s)
  • 02:56 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jun 29 02:56:32 UTC 2016 (duration 6m 30s)
  • 02:50 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.8) (duration: 04m 48s)
  • 02:30 chasemp: labstore1004 is replicating NFS/DRBD shares to labstore1005 and they are large and it's taking a long time
  • 02:29 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 09m 21s)
  • 02:18 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: sync wikiversions.json - group0 to 1.28.0-wmf.8 refs T137492
  • 02:16 twentyafterfour: promoting group0 to 1.28.0-wmf.8
  • 00:02 logmsgbot: twentyafterfour@tin Finished scap: sync new branch, testwiki to php-1.28.0-wmf.8 refs T137492 (duration: 51m 59s)

2016-06-28

  • 23:10 logmsgbot: twentyafterfour@tin Started scap: sync new branch, testwiki to php-1.28.0-wmf.8 refs T137492
  • 23:10 Krenair: wikitech-static working now, poke me on IRC or file a #wikitech.wikimedia.org ticket if you find any issues
  • 23:10 twentyafterfour: syncing new branch 1.28.0-wmf.8 refs T137492
  • 23:04 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.7/extensions/EventBus/EventBus.php: SWAT: EventBus: Match the expected format of response log key (duration: 00m 31s)
  • 23:01 Krenair: Updating MW version on wikitech-static to 1.27 (LTS) - https://lists.wikimedia.org/pipermail/mediawiki-announce/2016-June/000191.html
  • 21:59 halfak: deploying ores beec291
  • 21:33 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.7
  • 21:31 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.7/extensions/AbuseFilter/: deploy https://gerrit.wikimedia.org/r/#/c/296464/ refs T138550 T136973 (duration: 00m 36s)
  • 21:24 twentyafterfour: deploying wmf.7 yet again, once CI finishes testing https://gerrit.wikimedia.org/r/#/c/296464/ refs T138550 T136973
  • 20:24 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: once again rolling back to wmf.6 refs T136973 T138550
  • 20:11 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.7
  • 20:09 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.7/extensions/AbuseFilter/: deploying https://gerrit.wikimedia.org/r/#/c/296440/ refs T138550, T136973 (duration: 02m 06s)
  • 20:09 twentyafterfour: deploying https://gerrit.wikimedia.org/r/#/c/296440/ to hopefully unblock wmf.7 deployments. refs T138550, T136973
  • 20:08 gehel: disabling puppet on wdqs100[12] to cleanup after failed scap3 deployment
  • 19:33 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: Rolling back to wmf.6: save time regression is still present in wmf.7
  • 19:32 twentyafterfour: Rolling back to wmf.6: T138550 is still a problem
  • 19:24 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.7
  • 19:23 twentyafterfour: Deploying 1.28.0-wmf.7 to all wikis
  • 18:23 mutante: zosma - fresh install, sign puppet certs, initial puppet run
  • 16:16 gehel: starting rolling restart of elasticsearch codfw cluster (T138811)
  • 15:25 logmsgbot: thcipriani@tin Synchronized portals: SWAT: Bumping portals to master (T136874) (duration: 00m 29s)
  • 15:24 logmsgbot: thcipriani@tin Synchronized portals/prod/wikipedia.org/assets: SWAT: Bumping portals to master (T136874) (duration: 00m 24s)
  • 15:16 logmsgbot: thcipriani@tin Synchronized dblists/visualeditor-default.dblist: SWAT: Enable VisualEditor by default for all users of the French (T136993), English (T136992), and German (T136991) Wikivoyage (duration: 00m 24s)
  • 15:09 logmsgbot: thcipriani@tin Synchronized dblists/visualeditor-default.dblist: SWAT: Enable VisualEditor by default for all users of the Italian Wikivoyage (T136994) (duration: 00m 25s)
  • 14:52 gehel: powercycling elastic1004 (server not coming up during restart - T138811)
  • 13:47 godog: bounce carbon on graphite machines after applying https://gerrit.wikimedia.org/r/266567
  • 13:40 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: aqs1001.eqiad.wmnet
  • 12:50 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1023, 24, 33, 35, 39, 44, 52, 61, 64, 68, 63, 67, 72, 73 (duration: 02m 39s)
  • 12:00 gehel: powercycling elastic1002 (server not coming up during restart - T138811)
  • 11:43 gehel: powercycling elastic1001 (server not coming up during restart - T138811)
  • 11:21 gehel: rolling restart of elasticsearch eqiad
  • 10:44 moritzm: rolling reboot of mediawiki in codfw for kernel security update
  • 09:39 moritzm: powercycling mw1021, didn't come up after reboot
  • 09:32 elukey: restarted hhvm on mw1238, memory pressure ok but hhvm stuck (hhvm-dump-debug in /tmp/hhvm.14788.bt.)
  • 09:28 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: aqs1003.eqiad.wmnet
  • 09:25 moritzm: powercycling mw1019, didn't come up after reboot
  • 09:25 logmsgbot: reedy@tin Synchronized wmf-config/interwiki.php: Updated IW map (duration: 00m 49s)
  • 09:13 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: aqs1003.eqiad.wmnet
  • 08:57 moritzm: powercycling mw1018, didn't come up after reboot
  • 08:47 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Prepare old servers for decom by sending all queries to new servers (duration: 01m 39s)
  • 08:32 moritzm: rolling reboot of mediawiki canaries for kernel security update
  • 08:30 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: aqs1002.eqiad.wmnet
  • 08:17 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: aqs1002.eqiad.wmnet
  • 08:15 elukey: rebooting aqs100[23].eqiad for kernel upgrades
  • 02:54 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jun 28 02:54:56 UTC 2016 (duration 7m 16s)
  • 02:47 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 08m 59s)
  • 02:27 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 51s)
  • 00:26 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.7/includes/api/ApiMain.php: UsageException to try to catch T138585 issue (duration: 00m 27s)
  • 00:21 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Enable Wikibase descriptions on Catalan and Polish wikis (T135429) (duration: 00m 26s)
  • 00:09 logmsgbot: dereckson@tin Synchronized wmf-config/mobile.php: Introduce config variable to control tagline (T138738, 2/2) (duration: 00m 27s)
  • 00:08 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Introduce config variable to control tagline (T138738, 1/2) (duration: 00m 32s)
  • 00:07 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings-labs.php: Introduce config variable to control tagline (no-op) (duration: 00m 27s)
  • 00:05 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.6/extensions/MobileFrontend/: Introduce config variable to control tagline (T138738) (duration: 00m 29s)
  • 00:02 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.7/extensions/MobileFrontend/: Introduce config variable to control tagline (T138738) (duration: 00m 39s)

2016-06-27

  • 20:13 mdholloway: mobileapps deployed 30cc12e
  • 20:08 subbu: finished deploying parsoid sha dd8e644d
  • 20:04 subbu: synced new parsoid code; restarted parsoid on wtp1001 as a canary
  • 20:01 subbu: starting parsoid deploy
  • 17:23 gehel: deploying new logstash config for transition to elasticsearch 2.x (T138335)
  • 15:21 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Increase move rate limit for extendedmovers in enwiki to 16/60 (duration: 00m 28s)
  • 15:19 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: SWAT: Delete old throttle rules (duration: 00m 26s)
  • 15:16 gehel: banning elastic1001 to prepare its decommissioning (T138329)
  • 15:13 logmsgbot: thcipriani@tin Synchronized dblists/clldefault.dblist: SWAT: Deploy Compact Language Links as default (Stage 3) PART II (duration: 00m 23s)
  • 15:07 logmsgbot: thcipriani@tin Synchronized wmf-config: SWAT: Deploy Compact Language Links as default (Stage 3) (duration: 00m 40s)
  • 15:00 elukey: mw1136 powercycled - not responsive to ssh and root login
  • 14:49 logmsgbot: gehel@palladium conftool action : set/pooled=no; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch,name=elastic101[0-6].eqiad.wmnet
  • 14:39 logmsgbot: gehel@palladium conftool action : set/pooled=no; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch,name=elastic100[0-9].eqiad.wmnet
  • 14:37 logmsgbot: gehel@palladium conftool action : get/pooled; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch,name=elastic100[0-9]..eqiad.wmnet
  • 14:34 gehel: removing old elasticsearch servers in eqiad from LVS (elastic1001-1016 - T138329)
  • 10:10 moritzm: pooled mw1291 (jessie imagescaler)
  • 09:48 jynus: stopping and reimporting db2010 (m1)
  • 09:47 gehel: removing maps-test*.codfw.wmnet servers from LVS (T138092)
  • 09:19 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch-ssl,name=elastic104..eqiad.wmnet
  • 09:19 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch,name=elastic104..eqiad.wmnet
  • 09:18 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch-ssl,name=elastic103..eqiad.wmnet
  • 09:18 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: dc=eqiad,cluster=elasticsearch,service=elasticsearch,name=elastic103..eqiad.wmnet
  • 09:10 logmsgbot: gehel@palladium conftool action : get/pooled; selector: elastic10??\.eqiad\.wmnet (tags: ['dc=eqiad', 'cluster=elasticsearch', 'service=elasticsearch'])
  • 09:07 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: elastic1032.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=elasticsearch', 'service=elasticsearch-ssl'])
  • 09:06 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: elastic1032.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=elasticsearch', 'service=elasticsearch'])
  • 09:00 gehel: adding new elasticsearch servers in eqiad to LVS
  • 08:54 godog: swift codfw-prod ms-be202[234] weight 2000
  • 07:15 elukey: puppet stopped on analytics1049 to remove it completely from the Hadoop cluster - broken disk
  • 02:51 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jun 27 02:51:41 UTC 2016 (duration 7m 5s)
  • 02:44 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 08m 09s)
  • 02:27 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 54s)

2016-06-26

  • 02:52 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jun 26 02:52:48 UTC 2016 (duration 6m 19s)
  • 02:46 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 08m 15s)
  • 02:28 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 48s)

2016-06-25

  • 09:37 mutante: install2001 killing ganglia aggregator processes, running puppet, for debugging
  • 02:51 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jun 25 02:51:43 UTC 2016 (duration 6m 26s)
  • 02:45 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 07m 58s)
  • 02:28 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 53s)
  • 01:07 chasemp: sign labstore1005 puppet certs and bootstrap the server
  • 00:53 chasemp: hand hack apache on labmon to make it work temporarily

2016-06-24

  • 18:41 logmsgbot: krenair@tin Synchronized dblists/mobilemainpagelegacy.dblist: https://gerrit.wikimedia.org/r/#/c/295958/4 - fix mobile main page rendering on a bunch of wikis, effectively putting them back to how they were a few days ago (duration: 00m 37s)
  • 17:19 mobrovac: change-prop deploying df88a75b
  • 17:05 _joe_: re-started changeprop after disabling the dependency module
  • 14:18 paravoid: shutting down ms-fe3002 due to on-site work
  • 14:05 logmsgbot: krinkle@tin Synchronized php-1.28.0-wmf.7/includes/OutputPage.php: T138586 hotfix (duration: 00m 47s)
  • 14:02 mobrovac: scb100x disabled puppet to clear changeprop queues
  • 13:22 gehel: re-enabling puppet on maps1002 (still in pre-configuration state, only default role)
  • 12:34 hashar: Random resource loader entries are apparently faulty causing issues with css and/or javascript T138586
  • 12:04 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: aqs1001.eqiad.wmnet
  • 12:03 elukey: rebooting aqs1001.eqiad.wmnet for kernel upgrades
  • 10:55 jynus: updated m1-slave dns to be db1001
  • 10:20 hashar: gallium: restarted apache2, potentially stuck proxy
  • 10:18 moritzm: upgrade nodejs on scb systems in codfw and restart node-based services
  • 09:59 ema: nginx rolling restart to enable TFO on all tlsproxies (T108827)
  • 09:52 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1059 with low weight, increase weight of db1061, db1062 (duration: 00m 33s)
  • 09:48 moritzm: upgrade nodejs on restbase test systems (xenon/praseodymium/cerium/restbase-test) and restart restbase on those
  • 09:09 mobrovac: scb100x stopping puppet to stop change-prop and clear the queue
  • 08:29 moritzm: uploaded nodejs 4.4.6 for jessie-wikimedia to carbon
  • 07:10 elukey: memcached on mc1007 restarted with growth factor 1.05 (T129963)
  • 03:54 robh: data copy for labmon1001 verified complete with proper permissions, re-enabling and running puppet to start back up services
  • 03:19 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jun 24 03:19:55 UTC 2016 (duration 7m 4s)
  • 03:12 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 17m 24s)
  • 02:38 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 17m 08s)
  • 01:22 bblack: stream.wikimedia.org (RCStream) DNS moved to cache_misc termination. If anyone reports bugs with rcstream services, revert https://gerrit.wikimedia.org/r/295385

2016-06-23

  • 23:17 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/295600/ (duration: 00m 29s)
  • 23:15 logmsgbot: maxsem@tin Synchronized dblists/mobilemainpagelegacy.dblist: https://gerrit.wikimedia.org/r/#/c/295600/ (duration: 00m 28s)
  • 22:33 chasemp: reimage labstore1005 post io testing
  • 22:12 chasemp: powercycle labstore1005
  • 21:24 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group2 wikis to wmf.6
  • 21:11 chasemp: silence alerts for labstore1004 for setup
  • 20:31 ebernhardson: synced out latest logstash-plugins via trebuchet
  • 20:17 Dereckson: Run initSiteStats.php on cebwiki (T138533)
  • 20:04 logmsgbot: jzerebecki@tin Synchronized wmf-config/CommonSettings.php: Log PHP/HHVM errors in CLI mode to stderr, not stdout T138291 (duration: 00m 28s)
  • 20:03 robh: labmon1001 data restore at 100gb 50minutes in, 298gb total for restoration
  • 19:29 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.7
  • 19:24 greg-g: 19:21 < RoanKatto>  !log Synced patches for T137288 and T137593
  • 18:31 elukey: mw130[0134] - new jobrunners installed and pooled (happened automatically after the first puppet run)
  • 18:09 robh: labmon1001 powering down for reimage
  • 17:45 subbu: finished deploying parsoid sha 18022c96
  • 17:40 subbu: synced new code; restarted parsoid on wtp1001 as a canary
  • 17:37 subbu: starting parsoid deploy
  • 17:29 robh: labmon1001 copy changed back to local usb, errors on network transfer for ownership. resumed rsync with append flag to local usb disk.
  • 17:03 bblack: cache perf tuning marker: start rollout of tcp_no_metrics_save:0
  • 16:27 chasemp: remove old log files on ytterbium for T114395
  • 16:18 godog: swift: add ms-be202[234] weight 1000 - T136630
  • 15:31 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings-labs.php: SWAT: LABS: Enable geoshapes graph protocol (duration: 00m 29s)
  • 15:26 akosiaris: stop etherpad-lite, etherpad is down
  • 15:16 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy Compact Language Links as default (Stage 2) PART III (duration: 00m 24s)
  • 15:16 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Deploy Compact Language Links as default (Stage 2) PART II (duration: 00m 28s)
  • 15:15 logmsgbot: thcipriani@tin Synchronized dblists/clldefault.dblist: SWAT: Deploy Compact Language Links as default (Stage 2) PART I (duration: 00m 41s)
  • 15:11 robh: puppet disabled on labmon1001 along with all icinga alerting. data migration to usb in progress via root screen session
  • 15:05 robh: starting data backup of labmon1001, halting statsite/graphite/carbon-relay on system
  • 14:47 akosiaris: change the default message in etherpad to indicate problems
  • 14:47 mobrovac: change-prop deploying 05c72ed24ca
  • 14:45 akosiaris: debugging etherpad. Started the service with a blank db, looks like it's working
  • 14:38 akosiaris: stopping etherpad-lite on etherpad1001, disabling puppet
  • 14:32 jynus: restarting etherpad-lite.service
  • 13:53 hashar: Zuul/CI are slowly catching up. I had to drop a few changes that got force merged on the SmashPig repo.
  • 13:37 awight: update SmashPig from a435adeb130217bda8b95d3c5c6331ace8ad1228 to 917138e159f0341e3dfbb35818c3ce479927875b
  • 13:36 hashar: CI is slowed down due to surge of jobs and lack of instances to build them on ( T133911 ). Queue is 50 for Jessie and 25 for Trusty.
  • 13:30 jynus: db1059 backup and reimage
  • 13:28 awight: update SmashPig from c0cc2a1a6062ad8d114473ea1a444786a0d50833 to a435adeb130217bda8b95d3c5c6331ace8ad1228
  • 13:16 jynus: running scap pool on mw1301
  • 13:13 mobrovac: restarting zotero on sca, 6g mem
  • 13:13 jynus: running scap pool on mw1300
  • 13:11 mobrovac: citoid deploying 0129ab0b
  • 13:11 elukey: purged some puppet output logs on compiler02.puppet3-diffs.eqiad.wmflabs to free space (disk full)
  • 13:09 moritzm: depooled jessie image scaler (mw1291) again, works fine, to be permanently pooled on Monday
  • 12:49 moritzm: pooling new jessie image scaler mw1291 for short production smoke testing
  • 12:35 awight: update SmashPig from f7d65c54bed3ff9c478b0dbcaa1b2d27cc665ace to c0cc2a1a6062ad8d114473ea1a444786a0d50833
  • 12:18 awight: update SmashPig from 90757321a3bfa1045202e06e3dd1960a0043493a to f7d65c54bed3ff9c478b0dbcaa1b2d27cc665ace
  • 12:07 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1059; Repool db1061 & db1062; increase weight of db1068 (duration: 00m 39s)
  • 11:33 gehel: rolling restart of elasticsearch10(01|30|08|36|13|40) to activate new masters
  • 10:13 andrewbogott: restarting rabbitmq-server on labcontrol1001 (random debugging attempt for T138106)
  • 09:49 godog: reimage ms-be202[567] with incorrect raid settings
  • 09:11 jynus: syncing etherpadlite.store (m1) on db2010, which had 2 bad chunks
  • 08:39 mobrovac: change-prop restarting on scb to pick up ores rules https://gerrit.wikimedia.org/r/295576
  • 08:06 mobrovac: change-prop deploying 45db4f84827
  • 06:59 moritzm: installing spice security updates
  • 02:48 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jun 23 02:47:59 UTC 2016 (duration 6m 44s)
  • 02:41 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 07m 05s)
  • 02:26 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 11m 19s)

2016-06-22

  • 23:24 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/295560/ (duration: 00m 25s)
  • 23:23 logmsgbot: maxsem@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/295560/ (duration: 00m 24s)
  • 23:23 logmsgbot: maxsem@tin Synchronized dblists/mobilemainpagelegacy.dblist: https://gerrit.wikimedia.org/r/#/c/295560/ (duration: 00m 24s)
  • 23:14 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/294247/ (duration: 00m 24s)
  • 23:09 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/295558/ (duration: 00m 40s)
  • 22:25 ori: Ran hacked maintain-replicas.pl on labsdb100[13] for T135029
  • 21:06 bblack: cache perf: start deploy of -autocorking (probably last experiment I can squeeze in today)
  • 21:00 Dereckson: Run namespaceDupes.php on ptwikinews (T138230) and frwikinews (T138442)
  • 20:33 mdholloway: mobileapps: finished deploying 8046ee2
  • 20:26 yurik: deployed & restarted tilerator https://gerrit.wikimedia.org/r/#/c/295447/
  • 20:25 mdholloway: starting mobileapps deployment
  • 20:20 Reedy: created tmplog_begin_devices on tmplog_end_devices on testwiki.cn_template_log
  • 20:18 yurik: deployed & restarted kartotherian https://gerrit.wikimedia.org/r/#/c/295449/
  • 19:32 bblack: start rollout of first batch of cache sysctl stuff (un-mysterious + disable prequeue timestamps)
  • 19:29 jynus: archiving and dropping reviewdb on m1 shard
  • 19:06 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.7
  • 18:46 jynus: shutting down and reimaging db1001
  • 18:20 papaul: ms-be202[3-7] - signing puppet certs, salt-key, initial run
  • 17:23 akosiaris: restart apache on ununpentium for m1 migration. Hosts RT, just did it for good measure
  • 17:21 akosiaris: restarted bacula-director on helium
  • 17:15 jynus: killing puppet, rt, librenms user connections on db1001
  • 17:10 jynus: failed over m1-master from db1001 to db1016
  • 16:20 gehel: new elasticsearch servers elastic1032-1047 are configured and have joined the eqiad cluster
  • 15:26 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.6/extensions/OATHAuth: SWAT: Fixup qrcode-generating js, to stop race condition. (duration: 00m 33s)
  • 15:23 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Improve style (duration: 00m 33s)
  • 15:18 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.7/extensions/OATHAuth: SWAT: Fixup qrcode-generating js, to stop race condition. (duration: 00m 27s)
  • 15:13 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add www.wpc.ncep.noaa.gov to wgCopyUploadsDomains (duration: 00m 54s)
  • 15:01 elukey: rebooting bohrium.eqiad.wmnet (running piwik) for kernel upgrades
  • 14:32 jynus: checksumming m1 databases in preparation for failover
  • 14:29 tgr: running https://phabricator.wikimedia.org/diffusion/ECAU/browse/master/maintenance/checkLocalUser.php for some users T119736
  • 14:04 moritzm: rolling restart of hhvm/apache on app servers in eqiad for expat security update
  • 13:42 godog: add 500G to fluorine /a (almost full)
  • 13:31 gehel: configuring new elasticsearch servers elastic1038-1042 in eqiad
  • 13:03 hashar: Manually moved some missing build records. Restarting Jenkins
  • 12:49 hashar: T80385 Restarting Jenkins with builds dir set to "${JENKINS_HOME}/builds/${ITEM_FULL_NAME}" which is /var/lib/jenkins/builds/XXX
  • 12:35 gehel: starting reimage of mw1292
  • 12:34 _joe_: disabling puppet on mw1017, live-hacking it
  • 12:34 hashar: T80385 stopping Jenkins and migrating all build records to /var/lib/jenkins/builds
  • 12:06 gehel: configuring new elasticsearch servers elastic1033-1037 in eqiad
  • 10:46 godog: upload libphutil/arcanist 0~git20160620-0wmf1 to carbon
  • 10:32 elukey: mw1140 powercycle after freeze issues due to memory pressure (was not able to ssh to it)
  • 10:18 moritzm: rolling restart of restbase in eqiad to pick up firejail change in service::node
  • 09:46 moritzm: rolling restart of restbase in codfw to pick up firejail change in service::node
  • 09:43 legoktm: live-hacking on mw1017 to debug T115119
  • 09:19 jynus: stopping and reconfiguring mysql on dbstore1001
  • 07:59 moritzm: rolling restart of hhvm/apache on canary app servers in eqiad for expat security update
  • 07:30 jynus: stopping, backing up and reimaging db1061 and db1062
  • 07:06 moritzm: restarted hhvm on mw1131
  • 04:29 chasemp: fix salt key on labtestmetal2001
  • 03:12 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jun 22 03:12:33 UTC 2016 (duration 6m 44s)
  • 03:05 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 17m 49s)
  • 02:31 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 24s)

2016-06-21

  • 23:14 yurik: updated/restarted kartotherian & tilerator - https://gerrit.wikimedia.org/r/#/c/295440/ https://gerrit.wikimedia.org/r/#/c/295441/
  • 23:05 tgr: deleted localuser rows for Mahir256@orwikisource and A879071@enwiki for T119736
  • 22:19 bd808: Backfilled missing 2016-06-20 data to https://tools.wmflabs.org/sal/production?d=2016-06-20
  • 22:08 logmsgbot: ori@tin Synchronized static/images/mobile: I8f09e825: Optimize mobile static images (duration: 00m 34s)
  • 19:27 bd808: Restarted dead logstash process on logstash1001. Looks to have stopped itself due to the Elasticsearch OOM earlier
  • 19:18 logmsgbot: thcipriani@tin Purged l10n cache for 1.28.0-wmf.5
  • 19:17 bd808: Restarted ElasticSearch on logstash1001; dead from OOM
  • 19:14 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.28.0-wmf.7
  • 18:50 bblack: enabled tcp_notsent_lowat optimization on all caches (marking this time for investigation of perf graphs later) - https://gerrit.wikimedia.org/r/#/c/295376/
  • 17:16 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.7/extensions/Graph/lib/graph2.compiled.js: pre-train backport: Updated to latest graph2 lib (duration: 00m 31s)
  • 17:10 yurik_: deployed graphoid https://gerrit.wikimedia.org/r/#/c/295367/
  • 17:06 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: Temporary IP Cap Lift on es.wiki and commons (duration: 00m 24s)
  • 16:33 yurik_: deployed and restarted graphoid with scap3
  • 16:32 gehel: starting installation of new elasticsearch server elastic1032.eqiad.wmnet
  • 15:58 gehel: puppet run on tin to enable scap3 deployment for graphoid
  • 15:53 logmsgbot: catrope@tin Synchronized php-1.28.0-wmf.7/extensions/Echo/: (no message) (duration: 00m 33s)
  • 15:44 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy Compact Language Links as default (Stage 1) (duration: 00m 25s)
  • 15:42 logmsgbot: thcipriani@tin Synchronized wmf-config/db-eqiad.php: Repool db1068 with low weight; depool db1061 and db1062 (duration: 00m 30s)
  • 15:20 logmsgbot: hashar@tin Finished scap: testwiki to group0 (previously was labtestwiki which does not work) (duration: 51m 45s)
  • 14:47 moritzm: rolling restart of aqs service on aqs1001-aqs1006 to pick up new firejail settings
  • 14:28 logmsgbot: hashar@tin Started scap: testwiki to group0 (previously was labtestwiki which does not work)
  • 14:14 moritzm: correction: restbase1007 was already depooled for cassandra maintenance, thus only rebooting to 4.4
  • 14:12 moritzm: depooling restbase1007 for upgrade to Linux 4.4
  • 14:09 logmsgbot: hashar@tin scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="labtestwiki" --outdir="/tmp/scap_l10n_87423667" --threads=4 --lang en --quiet' returned non-zero exit status 255 (duration: 02m 58s)
  • 14:06 logmsgbot: hashar@tin Started scap: (no message)
  • 14:03 gehel: disabling alerting for maps100?\.eqiad\.wmnet during initial installation
  • 14:02 logmsgbot: hashar@tin scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="labtestwiki" --outdir="/tmp/scap_l10n_2087727834" --threads=4 --lang en --quiet' returned non-zero exit status 255 (duration: 06m 37s)
  • 13:55 logmsgbot: hashar@tin Started scap: testwiki to 1.28.0-wmf.7 (take three) T136973
  • 13:55 logmsgbot: hashar@tin scap aborted: testwiki to 1.28.0-wmf.7 (take two) T136973 (duration: 01m 35s)
  • 13:53 logmsgbot: hashar@tin Started scap: testwiki to 1.28.0-wmf.7 (take two) T136973
  • 13:53 logmsgbot: hashar@tin scap aborted: testwiki to 1.28.0-wmf.7 T136973 (duration: 04m 17s)
  • 13:48 logmsgbot: hashar@tin Started scap: testwiki to 1.28.0-wmf.7 T136973
  • 13:15 hashar: T136973 applied all security patches to 1.28.0-wmf.7
  • 13:11 RoanKattouw: Running extensions/Echo/maintenance/removeOrphanedEvents.php on all Echo-enabled wikis for T136425
  • 12:57 moritzm: rolling restart of hhvm/apache in codfw for expat security update
  • 12:49 RoanKattouw: Running extensions/Echo/maintenance/backfillReadBundles.php on all Echo-enabled wikis for T136368
  • 12:49 RoanKattouw: Running extensions/Echo/maintenance/backfillReadBundles.php on all Echo-enabled wikis
  • 12:36 hoo: Started a new JSON dump creation on snapshot1003 (after the last one was inconsistent, per T138291)
  • 12:35 gehel: lowering throttling limit for index recovery on codfw elasticsearch cluster
  • 12:33 hoo: Removed Wikidata json dumps from 20160620 (inconsistent, per T138291).
  • 12:30 hashar: T136973 started cut of branch wmf/1.28.0-wmf.7
  • 12:25 gehel: lowering throttling limit for index recovery on eqiad elasticsearch cluster
  • 11:06 jynus: reimaging db1068
  • 10:32 godog: reboot ms-be2003 for disk ordering - T137785
  • 10:22 moritzm: installing expat security updates on Ubuntu systems
  • 10:03 moritzm: installing wget security updates on Ubuntu systems
  • 09:43 gehel: lowering disk high watermark to rebalance elasticsearch eqiad cluster disk space
  • 09:25 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1068; repool db1070 and db1071 as api (duration: 00m 27s)
  • 09:22 moritzm: rolling reboot of logstash cluster to Linux 4.4
  • 07:41 elukey: restarted hhvm on mw1141 - hhvm was getting SEGV (dump in /tmp/hhvm.8735.bt.)
  • 07:39 elukey: restarted hhvm on mw1139 (hhvm-dump in /tmp/hhvm.20736.bt.)
  • 06:41 moritzm: restarted hhvm on mw1252
  • 02:10 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jun 21 02:10:55 UTC 2016 (duration 6m 36s)
  • 02:04 logmsgbot: l10nupdate@tin LocalisationUpdate failed (1.28.0-wmf.6) at 2016-06-21 02:04:19+00:00

2016-06-20

  • 23:22 Dereckson: `mwscript namespaceDupes.php ptwikinews --fix` (T138230). Some links and revisions are still to fix.
  • 23:16 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Fix pt.wikinews namespace issue (T138230) (duration: 00m 24s)
  • 23:13 logmsgbot: dereckson@tin Synchronized wmf-config/mobile.php: Remove old mobile workaround for Wikidata descriptions (T127250, T138085) (duration: 00m 33s)
  • 21:05 logmsgbot: aude@tin Synchronized php-1.28.0-wmf.6/extensions/Wikidata: Fix property suggester (duration: 01m 59s)
  • 19:50 chasemp: cleaning up /scratch NFS share as it ran out of inodes
  • 19:17 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes/api/ApiStashEdit.php: 82e14dc66f478fbdb9ca6eab1eeb4f9c68c99bd1 (duration: 00m 36s)
  • 18:09 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Pool db1071 with low weight after maintenance (duration: 00m 26s)
  • 17:32 bd808: https://tools.wmflabs.org/sal missing events between 2016-06-19T12:29 and 2016-06-20T17:26.
  • 17:26 gehel: deploying latest WDQS
  • 17:19 godog: upload libphutil / arcanist 0~git20160616-0wmf1 to jessie-wikimedia T137770
  • 17:18 mark: Rebooting pfw-codfw
  • 17:00 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: revert cll patch (duration: 00m 25s)
  • 15:44 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow sysops to add to/remove from confirmed on ca.wikinews (duration: 00m 25s)
  • 15:37 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable NewUserMessage on pl.wikipedia (duration: 00m 25s)
  • 15:31 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.6/extensions/CentralAuth: SWAT: queryAttached into cheap and expensive part (duration: 00m 31s)
  • 15:20 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Flow beta feature on frwikiquote (duration: 00m 28s)
  • 15:13 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy Compact Language Links as default (Stage 1) PART III (duration: 00m 30s)
  • 15:12 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Deploy Compact Language Links as default (Stage 1) PART II (duration: 00m 29s)
  • 15:12 logmsgbot: thcipriani@tin Synchronized dblists/cll-nondefault.dblist: SWAT: Deploy Compact Language Links as default (Stage 1) PART I (duration: 00m 29s)
  • 15:11 logmsgbot: jmm@palladium conftool action : select; selector: name=mw1099.eqiad.wmnet
  • 15:04 logmsgbot: thcipriani@tin Synchronized dblists/visualeditor-default.dblist: SWAT: Enable VisualEditor by default for all users of French Wikinews (duration: 00m 29s)
  • 13:27 elukey: restarted hhvm on mw1145 after temp. freeze due to memory pressure (hhvm debug in /tmp/hhvm.17794.bt.)
  • 13:27 paravoid: reactivating peerings with Telia Carrier/AS1299 (eqiad/codfw/ulsfo)
  • 13:06 Amir1: full deployment for 8e65182 in ores nodes
  • 13:04 Amir1: deploying 8e65182 to scb2001
  • 12:56 gehel: installing maps1001.eqiad.wmnet (secondary cluster, no traffic there yet) - T138092
  • 12:56 paravoid: deactivating peerings with Telia Carrier/AS1299 (eqiad/codfw/ulsfo)
  • 12:41 moritzm: rebooting ms1001 for update to Linux 4.4
  • 12:13 Amir1: started deploying ores in scb2001 bdc1e2bd
  • 11:36 godog: roll-restart swift on ms-be1* to apply https://gerrit.wikimedia.org/r/294691
  • 11:27 Amir1: rolling back ae71d842dfc0958e06922062dd09d49243332a6a for ores in scb nodes
  • 11:13 _joe_: restarting uwsgi ores service
  • 10:58 Amir1: deploying bdc1e2b in ores nodes
  • 10:53 godog: roll-restart swift on ms-be2* to apply https://gerrit.wikimedia.org/r/294691
  • 10:44 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1071 completely (duration: 00m 25s)
  • 10:35 jynus: db1071 stop, backup and reimage
  • 10:31 mobrovac: restbase started mobile-sections dump for eswiki on restbase1009 for T136964
  • 10:05 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1073 at 100% weight; depool db1071 for reimaging (duration: 00m 27s)
  • 09:50 moritzm: rolling reboot of restbase2001/restbase2002 for upgrade to Linux 4.4
  • 08:57 Amir1: deploying 5dfe738 in ores nodes
  • 08:15 moritzm: installing libxslt security updates
  • 07:43 gehel: rebalancing shards on elasticsearch eqiad cluster
  • 06:47 _joe_: activating the jessie jobrunner, mw1299
  • 05:57 logmsgbot: ori@tin Synchronized wmf-config/CommonSettings.php: Id5804a80: Better cache headers for 'Powered by MediaWiki' badge (2/2) (duration: 00m 35s)
  • 05:56 logmsgbot: ori@tin Synchronized static/images: Id5804a80: Better cache headers for 'Powered by MediaWiki' badge (1/2) (duration: 00m 33s)
  • 02:29 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jun 20 02:29:01 UTC 2016 (duration 5m 44s)
  • 02:23 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 09m 54s)

2016-06-19

  • 12:29 elukey: restarted hhvm on mw1138 - trace in /tmp/hhvm.25048.bt, hhvm killed by OOM
  • 12:27 elukey: restarted hhvm on mw1114 - trace in /tmp/hhvm.11092.bt, hhvm killed by OOM
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jun 19 02:31:25 UTC 2016 (duration 5m 47s)
  • 02:25 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 50s)

2016-06-18

  • 02:32 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jun 18 02:32:26 UTC 2016 (duration 6m 18s)
  • 02:26 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 04s)

2016-06-17

  • 21:21 urandom: Reenabling puppet and resetting configuration on xenon.eqiad.wmnet : T137419
  • 20:39 urandom: Restarting Cassandra on xenon.eqiad.wmnet to apply -XX:+PreserveFramePointer : T137419
  • 20:35 urandom: Disabling puppet on xenon.eqiad.wmnet : T137419
  • 20:23 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.6/extensions/WikimediaEvents/: https://gerrit.wikimedia.org/r/#/c/294958/ (duration: 00m 33s)
  • 18:56 urandom: Restarting Cassandra on xenon.eqiad.wmnet with -XX:+PreserveFramePointer : T137419
  • 18:32 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1073 with low weight after reimage (duration: 00m 35s)
  • 16:29 moritzm: installing squid security updates on carbon
  • 15:59 urandom: Starting html dumps from xenon.eqiad.wmnet and cerium.eqiad.wmnet : T137419
  • 15:54 urandom: Restarting Cassandra on xenon.eqiad.wmnet to enable large pages : T137419
  • 14:55 mobrovac: scb disabling puppet for stopping change-prop to clear transclusion queues
  • 14:16 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Increase db1072 weight after repooling (duration: 00m 36s)
  • 12:57 jynus: stopping, backing up and reimaging db1073
  • 12:49 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1072 with low weight, depool db1073 (duration: 00m 27s)
  • 12:49 moritzm: rolling reboot of mw1157-mw1160 into new kernels
  • 12:27 moritzm: restarted hhvm on mw1133 and mw1135
  • 11:14 moritzm: stopping puppet on hosts using service::node (restbase, sca, scb, aqs) for step-by-step rollout of two puppet patches for firejail/service::node
  • 09:31 _joe_: powercycling mw1140, OOMd
  • 09:30 moritzm: rolling reboot of mw1153,mw1155,mw1156 into new kernels
  • 08:29 hashar: Restarting Jenkins on gallium. Web interface at least is deadlocked somehow
  • 07:23 jynus: backing up and reimaging db1072
  • 07:18 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1072 for maintenance (duration: 00m 31s)
  • 07:11 mobrovac: restbase started mobile-sections dump on restbase1009 for T136964
  • 07:02 mobrovac: change-prop restarting it to apply https://gerrit.wikimedia.org/r/294880
  • 06:40 moritzm: installing apache update on palladium
  • 06:16 akosiaris: _joe_ restarted zotero on sca1001
  • 06:16 akosiaris: restarted zotero on sca1002
  • 06:04 logmsgbot: root@palladium conftool action : set/weight=25; selector: cluster=api_appserver,name=mw127.*
  • 05:58 logmsgbot: root@palladium conftool action : set/pooled=yes:weight=20; selector: cluster=api_appserver,name=mw127.*
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jun 17 02:31:00 UTC 2016 (duration 6m 26s)
  • 02:24 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 09m 46s)

2016-06-16

  • 23:44 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.6/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: T137167: TextCat A/B test for Language Identification (duration: 00m 25s)
  • 23:24 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.6/extensions/WikimediaEvents/extension.json: T137167: TextCat A/B test for Language Identification (duration: 00m 24s)
  • 23:19 logmsgbot: ebernhardson@tin Synchronized php-1.28.0-wmf.6/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: T137167: TextCat A/B test for Language Identification (duration: 00m 24s)
  • 23:16 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: T137167: search: Dependent config for textcat AB test. (duration: 00m 26s)
  • 23:11 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: T137888: Two permission changes at urwiki (duration: 00m 27s)
  • 23:07 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings-labs.php: T127250: Prepare Wikidata descriptions on mobile for production rollout (duration: 00m 27s)
  • 22:33 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.6/extensions/Kartographer: https://gerrit.wikimedia.org/r/294856 https://gerrit.wikimedia.org/r/294855 (duration: 00m 30s)
  • 22:24 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/294854/ (duration: 00m 26s)
  • 21:15 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.6/extensions/VisualEditor/ApiVisualEditor.php: Pass empty summary to parseAndStash() to avoid warnings T137995 (duration: 00m 39s)
  • 19:05 logmsgbot: hashar@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.6
  • 18:37 tgr: running invalidateUserSessions.php for T137799
  • 18:22 mobrovac: change-prop deploying bc87a1fecfa
  • 16:36 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Set all new slaves to medium weight (300) after warm up (duration: 00m 25s)
  • 15:37 jynus: deleted sqldata.s6 from labsdb1008 - space issues caused by queries creating temporary tables
  • 15:27 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.6/extensions/ORES/includes/Hooks.php: SWAT: Performance boost on hidenondamaging (duration: 00m 35s)
  • 15:23 moritzm: rolling reboot of restbase1008 - restbase1011 for upgrade to Linux 4.4
  • 15:21 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.6/extensions/ORES: SWAT: Skip when an edit is errored in PopulateDatabase.php (duration: 00m 30s)
  • 15:04 logmsgbot: root@palladium conftool action : set/pooled=yes; selector: name=mw1262.eqiad.wmnet
  • 14:31 twentyafterfour: re-enabled and ran puppet agent --test on iridium. Everything appears to be normal.
  • 13:04 mobrovac: scb1001 enabled puppet back
  • 12:57 gehel: rebalancing shards on elasticsearch eqiad cluster
  • 12:33 Amir1: manually restarted celery-ores-worker in scb1001
  • 12:32 moritzm: installing apache2 trusty update on graphite1001
  • 12:32 Amir1: manually restarted celery-ores-worker in scb1002
  • 12:10 moritzm: restarted hhvm on mw1137, got stuck
  • 10:44 moritzm: depooling mw1154 for kernel update/reboot
  • 10:14 mobrovac: scb1001 disabling puppet for a while to manually test changeprop with transclusion rules
  • 09:59 mobrovac: restbase deploy end of ebeaa46
  • 09:56 _joe_: powercycling mw1143, unresponsive on ssh, console
  • 09:48 mobrovac: restbase deploy start of ebeaa46
  • 09:18 logmsgbot: hashar@tin Synchronized php-1.28.0-wmf.6/extensions/MobileFrontend: MobileFrontend RL registration issue preventing Special:Nearby from working properly T137919 (duration: 00m 36s)
  • 08:41 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Pool db1085, increase weight of all new db servers (duration: 00m 29s)
  • 08:15 jynus: rebooting db1085 before putting it back into production
  • 02:34 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 15m 49s)
  • 00:57 twentyafterfour: puppet disabled on iridium because https://gerrit.wikimedia.org/r/#/c/294653/ needs to merge (hotfix in preamble.php which puppet will undo if it's allowed to run)
  • 00:43 twentyafterfour: phabricator upgrade/maintenance complete. Everything appears to be back up and running normally.
  • 00:41 twentyafterfour: taking phabricator offline momentarily for scheduled maintenance.
  • 00:24 robh: mw1147 rebooted and manually running scap pull
  • 00:21 robh: mw1147 seems to have died during scap, unresponsive from serial console, powercycled
  • 00:16 logmsgbot: mattflaschen@tin Synchronized php-1.28.0-wmf.6/extensions/Kartographer: Search for maplinks inside and outside of content. (duration: 01m 08s)

2016-06-15

  • 23:38 logmsgbot: mattflaschen@tin Synchronized php-1.28.0-wmf.6/extensions/Echo: Sync Echo fix for cross-wiki notifications: 62324e3 (duration: 00m 33s)
  • 21:32 logmsgbot: aaron@tin Synchronized wmf-config/filebackend-production.php: Set "sync" filebackend replication to measure latency effect (duration: 00m 25s)
  • 21:27 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes/libs/objectcache/WANObjectCache.php: faff8f1ef1bfefd1804a3f46e58566711faa3224 (duration: 00m 27s)
  • 21:16 dapatrick: Deployed patch for T137264 to wmf.5 and wmf.6
  • 20:17 logmsgbot: hashar@tin Synchronized wmf-config/throttle.php: Temporary IP Cap Lift on es.wiki T137917 (duration: 00m 30s)
  • 20:09 subbu: finished deploying parsoid sha 3445eceb
  • 20:05 bblack: cache frontend restarts complete
  • 20:04 subbu: synced new code; restarted parsoid on wtp1001 as a canary
  • 20:02 subbu: starting parsoid deploy
  • 19:25 bblack: rolling restart of global varnish frontends (salt -b 1: depool -> sleep 15 -> restart -> repool) - estimated ~35 mins to completion - T107236 (...._
  • 19:15 bblack: varnish frontend restart halted - v4 compat issue to address :P
  • 19:11 bblack: rolling restart of global varnish frontends (salt -b 1: depool -> sleep 15 -> restart -> repool) - estimated ~30 mins to completion - T107236
  • 19:05 logmsgbot: hashar@tin rebuilt wikiversions.php and synchronized wikiversions files: (no message)
  • 18:54 ori: Started MySQL on es2019 (T130702)
  • 16:32 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1023; pool db1085 (disabled), db1088, db1092 w/low weight (duration: 00m 25s)
  • 16:07 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix autopatrolled group for ko.wikipedia (duration: 00m 31s)
  • 16:00 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.6/resources/src/mediawiki.special/mediawiki.special.search.styles.css: SWAT: Explicitly specify the width of the search input on Special:Search (duration: 00m 25s)
  • 15:53 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add autopatrolled group in kowiki (duration: 00m 24s)
  • 15:33 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy ORES beta feature in wikidatawiki (duration: 00m 24s)
  • 15:23 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: kafka1002.eqiad.wmnet
  • 15:23 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/ORES: SWAT: Skip when an edit is errored in PopulateDatabase.php (duration: 00m 27s)
  • 15:17 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Send authentication events to logstash (duration: 00m 28s)
  • 15:15 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: kafka1002.eqiad.wmnet
  • 15:11 logmsgbot: thcipriani@tin Synchronized wmf-config/logging.php: SWAT: Fix logging config for authmanager metrics channel rename (duration: 00m 24s)
  • 15:10 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: kafka1001.eqiad.wmnet
  • 15:06 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: SWAT: Remove old throttle rules (duration: 00m 30s)
  • 15:00 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: kafka1001.eqiad.wmnet
  • 15:00 mobrovac: scb disabled puppet for stopped change-prop during kafka nodes upgrade
  • 15:00 elukey: rebooting Eqiad Event Bus for kernel upgrades (one node at a time)
  • 14:24 moritzm: installing php security updates on jessie systems
  • 13:55 moritzm: remove unused PHP packages from the recently provisioned jessie app servers (new installations are fixed in puppet to only install php5-cli, but the initial set needs to be fixed up manually)
  • 13:40 gehel: rolling back update of firejail on maps2001
  • 13:16 _joe_: stopped jobchron, jobrunner on mw1299, masked in systemd
  • 13:15 mobrovac: change-prop deployed 6ad337
  • 13:06 moritzm: installing libav security updates
  • 12:37 _joe_: rebooting mw1299
  • 12:06 gehel: upgrade of firejail on maps server stopped, pending a patch to service::node
  • 11:46 mobrovac: scb enabled puppet back
  • 11:44 gehel: upgrading firejail to 0.9.38 on maps servers
  • 11:32 mobrovac: scb disabled puppet for 5 min to keep change-prop down
  • 11:30 mobrovac: change-prop deploying 353b926
  • 11:29 jynus: stopping db1023 for cloning to new s6 hosts
  • 11:22 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Increase new enwiki dbs weight, depool db1023 for cloning (duration: 00m 27s)
  • 11:13 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1033, first pool of db1079, db1086, db1094 with low weight (duration: 00m 25s)
  • 11:11 moritzm: enabled firejail wrapper for imagemagick's convert (for image scalers and the Score extension)
  • 10:59 paravoid: rebooting install2001 again
  • 10:48 logmsgbot: jmm@tin Synchronized wmf-config/CommonSettings.php: firejail security hardening for image scalers (duration: 00m 26s)
  • 09:48 godog: bounce ms-be2003, xfs high load
  • 09:13 moritzm: repooled mw1154 (kernel still the same ATM)
  • 08:53 moritzm: depooling mw1154 (image scaler) for kernel update
  • 08:29 jynus: turning down db1033 for cloning to new s7 slaves
  • 08:15 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1033 for cloning (duration: 00m 38s)
  • 06:59 moritzm: installing apache trusty updates on eqiad app servers
  • 03:51 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes/parser/Parser.php: 4e6e1bc1f2de000f0fdd84dcf04f63a21127d24a (duration: 00m 30s)
  • 03:49 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes/parser/Parser.php: 23bac8905a9d60cdc0a068ca025644e091b9027f (duration: 00m 32s)
  • 03:10 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jun 15 03:10:57 UTC 2016 (duration 6m 55s)
  • 03:04 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 16m 29s)
  • 02:30 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 12m 24s)
  • 02:29 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.5/extensions/Scribunto/engines/LuaCommon/TitleLibrary.php: revert: ad-hoc debug of vary-revision in scribunto (duration: 00m 29s)
  • 02:22 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.5/extensions/Scribunto/engines/LuaCommon/TitleLibrary.php: ad-hoc debug of vary-revision in scribunto (duration: 00m 26s)
  • 01:51 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.5/resources/src/mediawiki.action/mediawiki.action.edit.stash.js: Idfad8407: Improve client-side edit stash change detection (duration: 00m 24s)
  • 01:31 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes/parser: 78de24a20c4662ea709e1f8af84bb5fae4aea2fa (duration: 00m 33s)
  • 01:30 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes/parser: 48652dfc27d1bbaab41b3a4d8f7d6be23e2da6b6 (duration: 00m 34s)

2016-06-14

  • 23:40 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.6/resources/src/mediawiki.action/mediawiki.action.edit.stash.js: Idfad8407c8e: Improve client-side edit stash change detection (duration: 00m 25s)
  • 23:30 logmsgbot: ori@tin Synchronized wmf-config/InitialiseSettings.php: Id800a9d35b: Set import sources for he.wikipedia (T137074) and If66f307a2e: Set import sources for pt.wikinews (T137633) (duration: 00m 27s)
  • 23:28 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.6/extensions/AntiSpoof: I2e407a3ac8: Revert "Make sure AntiSpoof mappings are mapping in the correct direction." (duration: 00m 27s)
  • 23:15 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.6/extensions/Echo: If07369cb1: Allow the primary link to set all bundled notifications as read (T136368) (duration: 00m 34s)
  • 23:09 logmsgbot: ori@tin Synchronized wmf-config/abusefilter.php: I4e5e4d227: Set $wgAbuseFilterConditionLimit = 2000 for commonswiki (T132048) (duration: 00m 28s)
  • 22:43 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes/deferred: 0d038de1414c0b4faed1cc9882151e68d86d3b2d (duration: 00m 25s)
  • 22:15 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes/deferred: 29863094805baed7a5fa493c99c87745ce041f49 (duration: 00m 27s)
  • 21:50 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/resources: 7898fd2fa969342a5cc30df6a5757f4642cd6118 (duration: 00m 28s)
  • 21:44 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes: 7898fd2fa969342a5cc30df6a5757f4642cd6118 (duration: 01m 12s)
  • 21:33 logmsgbot: gehel@palladium conftool action : set/pooled=no; selector: name=maps-test2.*
  • 21:28 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: name=maps-test2*
  • 21:28 gehel: sending traffic back to old maps servers (T137620)
  • 21:10 logmsgbot: gehel@palladium conftool action : set/pooled=no; selector: name=maps-test2*
  • 21:09 logmsgbot: gehel@palladium conftool action : set/pooled=no; selector: maps-test2001.codfw.wmnet (tags: ['dc=codfw', 'cluster=maps', 'service=kartotherian'])
  • 21:09 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: maps2004.codfw.wmnet (tags: ['dc=codfw', 'cluster=maps', 'service=kartotherian'])
  • 21:08 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: maps2003.codfw.wmnet (tags: ['dc=codfw', 'cluster=maps', 'service=kartotherian'])
  • 21:08 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: maps2002.codfw.wmnet (tags: ['dc=codfw', 'cluster=maps', 'service=kartotherian'])
  • 20:58 logmsgbot: gehel@palladium conftool action : set/pooled=yes; selector: maps2001.codfw.wmnet (tags: ['dc=codfw', 'cluster=maps', 'service=kartotherian'])
  • 20:55 gehel: pooling maps2001 (new map server) - T137620
  • 20:50 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.6/includes: ca9068daffb49cc0cdfb84385a29aea34df155cd (duration: 01m 51s)
  • 20:46 gehel: adding new maps servers to LVS
  • 20:09 logmsgbot: demon@tin Finished scap: wikidata submodule update for wmf.6 (duration: 25m 51s)
  • 19:43 logmsgbot: demon@tin Started scap: wikidata submodule update for wmf.6
  • 19:30 logmsgbot: demon@tin Finished scap: group0 to 1.28.0-wmf.6 (duration: 26m 43s)
  • 19:03 logmsgbot: demon@tin Started scap: group0 to 1.28.0-wmf.6
  • 18:56 logmsgbot: demon@tin Purged l10n cache for 1.27.0-wmf.23
  • 18:54 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.4
  • 18:54 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.3
  • 18:54 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.2
  • 18:53 logmsgbot: demon@tin Purged l10n cache for 1.28.0-wmf.1
  • 17:22 Dereckson: Run initSiteStats.php for arcwiki and htwiki (T137827)
  • 16:48 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1040, pool for the first time db1081, db1084, db1091 (duration: 00m 34s)
  • 16:41 godog: reimage ms-fe3002 with jessie T117972
  • 15:57 yurik: deployed & restarted kartotherian (fixing spec.config tests)
  • 15:54 urandom: Restarting cassandra-metrics-collector on restbase1007 : T137304
  • 15:53 logmsgbot: thcipriani@tin Synchronized wmf-config: SWAT: Beta: Enable Compact Language Links for new users (duration: 00m 31s)
  • 15:41 logmsgbot: hashar@tin scap aborted: testwiki to php-1.28.0-wmf.6 and rebuild l10n cache (duration: 01m 31s)
  • 15:40 logmsgbot: hashar@tin Started scap: testwiki to php-1.28.0-wmf.6 and rebuild l10n cache
  • 15:35 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/CentralAuth/includes/CentralAuthHooks.php: SWAT: Account for changed login process (duration: 00m 26s)
  • 15:27 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add nonecho.dblist and echo.dblist PART III (duration: 00m 27s)
  • 15:26 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Add nonecho.dblist and echo.dblist PART II (duration: 00m 26s)
  • 15:26 godog: reimage ms-fe3001 with jessie T117972
  • 15:25 logmsgbot: thcipriani@tin Synchronized dblists: SWAT: Add nonecho.dblist and echo.dblist PART I (duration: 00m 28s)
  • 15:19 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add nonecho.dblist and echo.dblist PART III (duration: 00m 28s)
  • 15:18 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Add nonecho.dblist and echo.dblist PART II (duration: 00m 30s)
  • 15:18 logmsgbot: thcipriani@tin Synchronized dblists: SWAT: Add nonecho.dblist and echo.dblist PART I (duration: 00m 30s)
  • 15:09 yurik: deployed & restarted kartotherian
  • 15:07 logmsgbot: thcipriani@tin Synchronized dblists/visualeditor-default.dblist: SWAT: Enable VisualEditor by default on eleven Wikivoyages (duration: 01m 49s)
  • 13:47 hashar: T136971 Cutting MediaWiki branches 1.28.0-wmf.6
  • 13:40 moritzm: installing apache trusty updates on codfw app servers
  • 13:28 paravoid: rebooting install2001, T137647
  • 12:58 moritzm: installing apache trusty updates on canary app servers
  • 12:55 mobrovac: change-prop deployed f34fb06c99
  • 12:27 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1024, pool for the first time db1090 with low weight (duration: 00m 38s)
  • 11:30 mobrovac: scb disabling puppet for 10 mins or so to keep change-prop down
  • 11:15 akosiaris: T134242 rebooting alsafi.wikimedia.org hassaleh.codfw.wmnet kraz.wikimedia.org mx2001.wikimedia.org planet2001.codfw.wmnet pollux.wikimedia.org pybal-test2001.codfw.wmnet pybal-test2002.codfw.wmnet pybal-test2003.codfw.wmnet for qemu-kvm upgrade
  • 11:13 akosiaris: T134242 install qemu-system-common, qemu-system-x86 1:2.5+dfsg-4~bpo8+1 from jessie-backports on ganeti200{1,2,3,4,5,6}
  • 11:04 _joe_: pooling all the new codfw appservers that have been installed - mw2215-mw2240 (T135466)
  • 10:56 _joe_: pooling the new jessie appservers, mw1263-71
  • 10:52 logmsgbot: oblivian@palladium conftool action : set/weight=30; selector: cluster=appserver,dc=eqiad,name=mw12[67].*
  • 09:27 godog: roll-restart swift proxy in codfw and eqiad
  • 09:04 hashar: gallium: manually removing cron entry zuul_repack from user zuul. Causes cron spam due to zuul merger no longer being on gallium T137418
  • 08:59 jynus: stopping db1040 for cloning to new s4 hosts
  • 08:28 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1040 for cloning (duration: 00m 32s)
  • 08:23 _joe_: powercycling mw1154, unresponsive
  • 07:19 jynus: powercycling mw1156, could not regain control after OOM
  • 07:18 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1024, increase weight of db1082, db1087 and db1092 (duration: 10m 50s)
  • 07:05 _joe_: rolling reboot of mw2233-40
  • 06:47 _joe_: rebooting mw2228
  • 06:43 _joe_: rebooting mw2228
  • 06:29 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Pool db1052, db1080, db1083, db1089 (duration: 01m 31s)
  • 02:39 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jun 14 02:39:50 UTC 2016 (duration 5m 59s)
  • 02:33 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 12m 14s)

2016-06-13

  • 23:50 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Add ORES to whitelisted beta features (T130211) (duration: 00m 23s)
  • 23:42 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.5/extensions/ORES/includes/Hooks.php: Update links to beta features (duration: 00m 25s)
  • 23:33 ejegg: updated payments from 44102c59ac897c9acab470bf83369d233f9b736f to 2fc573cbb94e833c4144aa9dad79de8ec374bb09
  • 23:29 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Update cross-wiki upload configuration (Gerrit:293355) (duration: 00m 23s)
  • 23:10 logmsgbot: dereckson@tin Synchronized portals: (no message) (duration: 00m 24s)
  • 23:10 logmsgbot: dereckson@tin Synchronized portals/prod/wikipedia.org/assets: (no message) (duration: 00m 24s)
  • 22:51 logmsgbot: demon@tin Synchronized wmf-config/CommonSettings.php: Update extension distributor settings (duration: 00m 24s)
  • 22:42 yurik: switched to scap3 and deployed tilerator. Deployed kartotherian. Restarted.
  • 22:41 dapatrick: Deployed patches for T129738 to wmf5
  • 22:36 awight: update fundraising CRM revert from e684b7823e751558772a4de4ac23819bc601eb74 to bb9bf136dc0fa82d5d07ebeb33d696e54672b2d6
  • 22:11 awight: Updating fundraising CRM from b7b46740d701942507dca0a98a75f3f87b6b31b1 to e684b7823e751558772a4de4ac23819bc601eb74
  • 19:15 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/resources: ee2da9c2ae6fac93bf65d17b5ea48e5c47c87d47 (duration: 00m 35s)
  • 18:20 bblack: upgrading nginx (etc) on deployment-prep caches
  • 18:11 gehel: deploying latest GUI on WDQS
  • 17:58 urandom: Upgrade of restbase1007.eqiad.wmnet (https://people.wikimedia.org/~eevans/debian/cassandra_2.2.6-wmf1_all.deb) complete : T137474
  • 17:55 urandom: Restarting restbase1007-c.eqiad.wmnet : T137474
  • 17:52 urandom: Restarting restbase1007-b.eqiad.wmnet : T137474
  • 17:47 awight: Whitelist Special:PaypalExpressGatewayResult
  • 17:43 godog: enable proxy_http apache module on graphite1003 / graphite2002 and restart apache
  • 17:38 urandom: Restarting restbase1007-a.eqiad.wmnet : T137474
  • 17:37 urandom: Upgrading restbase1007.eqiad.wmnet w/ https://people.wikimedia.org/~eevans/debian/cassandra_2.2.6-wmf1_all.deb : T137474
  • 17:35 awight: update paymentswiki from 63fbe39fbc4d671fd2705ce9e42762b7c49564c2 to 44102c59ac897c9acab470bf83369d233f9b736f
  • 16:51 _joe_: powercycling mw1115
  • 16:49 logmsgbot: thcipriani@tin Finished scap: Update l10n cache for ores (duration: 32m 04s)
  • 16:17 logmsgbot: thcipriani@tin Started scap: Update l10n cache for ores
  • 15:59 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Enable ORES on fawiki PART II (duration: 00m 24s)
  • 15:58 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable ORES on fawiki PART I (duration: 00m 25s)
  • 15:58 logmsgbot: thcipriani@tin Synchronized wmf-config/extension-list: SWAT: Add ORES to extension-list (duration: 00m 25s)
  • 15:42 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add images.nypl.org to $wgCopyUploadsDomains for commons (duration: 00m 24s)
  • 15:37 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable VE in NS_PROJECT in cswiki (duration: 00m 25s)
  • 15:32 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/Echo: SWAT: Use localized weekdays on Special:Notifications (duration: 00m 32s)
  • 15:26 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable transwiki import for la.wiktionary (duration: 00m 26s)
  • 15:22 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Permission changes in zhwiki (duration: 00m 26s)
  • 15:18 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable VisualEditor by default for logged-out users on four Wikipedias too (duration: 00m 24s)
  • 15:10 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/MobileFrontend: Do Not strip srcset on API mobileview action PART II (duration: 00m 38s)
  • 15:09 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/MobileFrontend/includes/MobileContext.php: Do Not strip srcset on API mobileview action PART I (duration: 00m 49s)
  • 15:05 godog: reboot ms-be2012 to fix disk ordering T136395
  • 14:51 godog: truncate syslog.1 on ms-be2012
  • 14:26 bblack: upgrading cp* nginx (and other outstanding minor package updates)
  • 14:23 bblack: uploaded nginx-1.11.1-1+wmf2 to carbon
  • 13:55 dcausse: restarting logstash on logstash1001
  • 11:59 mobrovac: change-prop deployed 54f98b7
  • 11:31 _joe_: rolling reboot of the new appservers in codfw + scap pull
  • 09:55 _joe_: powercycling mw1138, oom, console non-responsive
  • 09:53 jynus: stopping db1052 and cloning it to db1080, db1083 and db1089
  • 09:43 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1052 for cloning (duration: 00m 26s)
  • 08:51 moritzm: removed /var/log/logstash/logstash.log.1 on logstash1001, depleted disk space on the root partition, fallout of T137400
  • 08:43 jynus: powercycling mw1155.eqiad.wmnet , unresponsive on ssh, serial console
  • 08:31 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Increase weight of db1082, db1087, db1092 (duration: 02m 36s)
  • 08:25 logmsgbot: oblivian@palladium conftool action : set/weight=30; selector: name=mw1261.eqiad.wmnet
  • 08:17 logmsgbot: oblivian@palladium conftool action : set/pooled=yes; selector: name=mw1261.eqiad.wmnet
  • 08:01 logmsgbot: oblivian@palladium conftool action : set/pooled=no:weight=20; selector: name=mw1262.eqiad.wmnet
  • 08:00 logmsgbot: oblivian@palladium conftool action : set/pooled=no:weight=20; selector: name=mw1261.eqiad.wmnet
  • 08:00 logmsgbot: oblivian@palladium conftool action : set/pooled=no; selector: name=mw1261.eqiad.wmnet
  • 06:32 logmsgbot: oblivian@palladium conftool action : set/pooled=yes; selector: name=mw126.*
  • 02:28 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 13m 02s)

2016-06-12

  • 02:28 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 12m 24s)

2016-06-11

  • 03:14 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.5/includes/parser/CacheTime.php: remove ad-hoc logging of updateCacheExpiry(0) traces (duration: 00m 23s)
  • 03:11 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.5/includes/parser/CacheTime.php: ad-hoc logging of updateCacheExpiry(0) traces (duration: 00m 25s)
  • 02:36 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jun 11 02:36:22 UTC 2016 (duration 6m 30s)
  • 02:29 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 11m 42s)
  • 01:01 mutante: rutherfordium ganeti lockup, gnt-instance console .. and it recovered

2016-06-10

  • 23:37 awight: Update PayPal Express Checkout configuration: add API certificate path
  • 21:58 logmsgbot: ori@tin Synchronized wmf-config/mobile.php: I3d8155d7e14: Remove old config hack that disabled $wgResponsiveImages on mobile (duration: 00m 24s)
  • 19:38 mutante: cp1043/cp1044 - revoke puppet cert, salt key
  • 19:30 logmsgbot: thcipriani@tin Synchronized wmf-config/throttle.php: Fix for ip lift cap for eswiki and Temporary IP Cap Lift for eswiki (duration: 00m 23s)
  • 19:19 logmsgbot: ori@tin Synchronized multiversion: Id432e25c: MWMultiVersion: allow wiki to be specified via the environment (duration: 00m 56s)
  • 17:12 elukey: Updated the puppet compiler with new hosts/facts
  • 16:32 mutante: cp1043,cp1044 shutdown -h, confirmed not in pybal/confctl
  • 16:27 mutante: cp1043/cp1044 - decom'ing, were already "Unused spare system" but running, scheduling downtime in icinga, shutting them down and removing from torrus config and puppet (T133614)
  • 14:17 urandom: Testing patched Cassandra (dpkg -i ...; service cassandra-{a,b} restart) on restbase-test200[1-2] : T137474
  • 14:06 urandom: Testing patched Cassandra (dpkg -i ...; service cassandra-a restart) on restbase-test2001 : T137474
  • 13:59 urandom: Testing patched Cassandra (dpkg -i ...; service cassandra-a restart) on praseodymium : T137474
  • 13:58 urandom: Testing patched Cassandra (dpkg -i ...; service cassandra-a restart) on cerium : T137474
  • 13:15 urandom: Starting html dump(s) in RESTBase staging : T137474
  • 13:13 urandom: Testing patched Cassandra (dpkg -i ...; service cassandra-a restart) on xenon : T137474
  • 11:07 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/includes/specials/SpecialUserLogin.php: deploy gerrit:293704 to fix AuthManager metrics (duration: 00m 32s)
  • 11:06 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/includes/specials/SpecialCreateAccount.php: deploy gerrit:293704 to fix AuthManager metrics (duration: 00m 52s)
  • 10:56 mobrovac: scb100x enabled puppet back
  • 10:05 mobrovac: scb100x disabling puppet and stopping change-prop to look at zookeeper znodes
  • 09:22 elukey: restarted uwsgi-ores on scb200[12] as deployment follow up
  • 08:27 elukey: restarted uwsgi-ores (after a deployment + puppet run) - service was down
  • 08:01 Amir1: deploying 38df031 into scb100[12] for ores service. Expecting some down time
  • 07:59 dcausse: refilling ttmserver index on all ttm enabled wikis
  • 06:42 moritzm: bounced hhvm on mw1264 (backtrace in /tmp/hhvm.2197.bt)
  • 06:28 papaul: mw2215-mw2238 - signing puppet certs, salt-key initial run
  • 05:54 mutante: re-enabling puppet on carbon
  • 04:48 moritzm: installing squid3 security updates on Ubuntu systems
  • 03:17 logmsgbot: aaron@tin Synchronized wmf-config/CommonSettings.php: Lower $wgAPIMaxLagThreshold to 5 (duration: 00m 36s)
  • 02:35 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jun 10 02:35:15 UTC 2016 (duration 6m 2s)
  • 02:29 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 11m 32s)
  • 01:44 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/includes/specialpage/LoginSignupSpecialPage.php: deploying gerrit:293668: fix AuthManager warning spam (duration: 00m 25s)
  • 01:43 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/includes/specials/SpecialUserLogin.php: deploying gerrit:293667: fix AuthManager dashboard (duration: 00m 33s)
  • 01:42 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/includes/specials/SpecialCreateAccount.php: deploying gerrit:293667: fix AuthManager dashboard (duration: 00m 25s)
  • 00:56 kaldari: ran mwscript maintenance/updateCollation.php --wiki=tawiki --force
  • 00:40 kaldari: ran mwscript maintenance/updateCollation.php --wiki=tawikibooks --force
  • 00:39 kaldari: ran mwscript maintenance/updateCollation.php --wiki=tawikinews --force
  • 00:36 mutante: git pull on strontium because i merged a non-change
  • 00:31 kaldari: ran mwscript maintenance/updateCollation.php --wiki=tawikiquote --force
  • 00:27 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings-labs.php: ores.wikimedia.org instead of ores.wmflabs.org (duration: 00m 25s)
  • 00:21 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.5/extensions/Wikidata/extensions/ArticlePlaceholder/includes/SearchHookHandler.php: Update Wikidata - Fix uncaught exception in ArticlePlaceholder (3/3) (duration: 00m 25s)
  • 00:20 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.5/extensions/Wikidata/vendor/composer/installed.json: Update Wikidata - Fix uncaught exception in ArticlePlaceholder (2/3, no-op) (duration: 00m 25s)
  • 00:19 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.5/extensions/Wikidata/composer.lock: Update Wikidata - Fix uncaught exception in ArticlePlaceholder (1/3, no-op) (duration: 00m 27s)
  • 00:12 kaldari: ran mwscript maintenance/updateCollation.php --wiki=tawikisource --force
  • 00:07 kaldari: ran mwscript maintenance/updateCollation.php --wiki=tawiktionary --force

2016-06-09

  • 23:57 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Set Tamil projects to use uca-ta collation II (T75453) (duration: 00m 25s)
  • 23:53 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Enable Flow beta feature on frwiki (T136684) (duration: 00m 27s)
  • 23:47 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Remove HiddenPrefs hack for turning off cross-wiki notifications (T135266) (duration: 00m 27s)
  • 23:31 logmsgbot: tgr@tin Synchronized wmf-config/InitialiseSettings.php: enable AuthManager on group2 wikis T135504 (duration: 00m 24s)
  • 23:29 logmsgbot: tgr@tin Synchronized wmf-config/CommonSettings.php: enable use of group1, group2 dblists in config (duration: 00m 23s)
  • 23:28 logmsgbot: tgr@tin Synchronized dblists/group2.dblist: add dblist for group2 (duration: 00m 22s)
  • 23:20 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/includes/specialpage/LoginSignupSpecialPage.php: deploying gerrit:293636 for AuthManager T135504 (duration: 00m 25s)
  • 23:19 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/extensions/MobileFrontend/resources/skins.minerva.special.userlogin.styles/userlogin.less: deploying gerrit:293638 for AuthManager T135504 (duration: 00m 25s)
  • 23:18 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/extensions/ConfirmEdit/FancyCaptcha/resources/ext.confirmEdit.fancyCaptcha.js: deploying gerrit:293637 for AuthManager T135504 (duration: 00m 24s)
  • 22:48 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.5
  • 22:35 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/includes/site/DBSiteStore.php: Revert "Map dummy language codes in sites" Part II (duration: 00m 31s)
  • 22:35 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/includes/ServiceWiring.php: Revert "Map dummy language codes in sites" Part I (duration: 00m 23s)
  • 21:41 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes: 904dd4ae088a8f67942c09b2b28178377955d6a6 (duration: 01m 18s)
  • 20:57 logmsgbot: hoo@tin Synchronized wmf-config/InitialiseSettings.php: Enable the ArticlePlaceholder on nnwiki (T130997) (duration: 00m 24s)
  • 20:53 logmsgbot: hoo@tin Synchronized wmf-config/InitialiseSettings.php: Enable the ArticlePlaceholder on lvwiki (T136100) (duration: 00m 26s)
  • 20:48 logmsgbot: hoo@tin Synchronized wmf-config/InitialiseSettings.php: Enable the ArticlePlaceholder on guwiki (T136517) (duration: 00m 24s)
  • 20:40 logmsgbot: hoo@tin Synchronized php-1.28.0-wmf.4/extensions/Wikidata: Update ArticlePlaceholder (duration: 01m 54s)
  • 20:36 logmsgbot: hoo@tin Synchronized php-1.28.0-wmf.5/extensions/Wikidata: Update ArticlePlaceholder (without unrelated T136598 fixes this time) (duration: 01m 51s)
  • 20:33 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes/user/User.php: c3b1f80a701d61dc57ccac0c8b1dc7daf03fa925 (duration: 00m 29s)
  • 19:59 urandom: Restarting Cassandra on xenon.eqiad.wmnet (removing patched test build; restoring state) : T137474
  • 19:53 logmsgbot: hoo@tin Synchronized php-1.28.0-wmf.5/extensions/Wikidata: revert, possible s5 master overload (duration: 01m 57s)
  • 19:47 logmsgbot: hoo@tin Synchronized php-1.28.0-wmf.5/extensions/Wikidata: Update ArticlePlaceholder (duration: 02m 04s)
  • 19:44 bearND: mobileapps deployed 71ff97c
  • 19:42 bearND: starting mobileapps deploy
  • 19:11 ejegg: updated cancel page settings on payments-wiki
  • 17:43 urandom: Restarting Cassandra on xenon.eqiad.wmnet (use exponentially decaying reservoirs for metrics histograms) : T126629
  • 17:19 mobrovac: change-prop deploying ecfda93f09d
  • 17:10 ejegg: updated payments-wiki from 3dcf58e3b4e1d02ad4f1874a3e87e55b7e169bfe to 053aaa259382c94aa59e4d0da7317fcafab635cd
  • 15:31 elukey: added topic override retention.bytes=536870912000 to Kafka webrequest_text (T136690)
  • 15:22 hashar: Cleaning git-daemon on gallium (was used by zuul-merger) T137418
  • 15:19 logmsgbot: aude@tin Synchronized wmf-config/InitialiseSettings.php: Add *.nara.gov to wgCopyUploadDomains (duration: 00m 40s)
  • 14:47 mobrovac: change-prop stopped on scb1002
  • 14:38 elukey: Tested temp setting retention.bytes=2G for Analytics kafka topic webrequest_misc
  • 14:37 hashar: Removing zuul-merger from gallium
  • 14:33 hashar: stopped / disabled zuul-merger on gallium T137418
  • 14:12 mobrovac: change-prop restarting on scb1001 for update
  • 14:07 urandom: Re-enabling puppet on xenon.eqiad.wmnet, forcing a run, and restarting Cassandra : T137419
  • 13:52 mobrovac: change-prop restarting on scb1002 for update
  • 13:45 mobrovac: change-prop deploying 2161403c
  • 13:26 urandom: Restarting Cassandra on xenon.eqiad.wmnet to apply 2G file cache : T137419
  • 12:51 urandom: Restarting Cassandra on xenon.eqiad.wmnet : T126629
  • 12:33 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/extensions/LdapAuthentication/LdapPrimaryAuthenticationProvider.php: deploy gerrit:293459 to fix wikitech API login / morebots (T137377) (duration: 00m 47s)
  • 12:19 tgr: deploying gerrit:293459 to fix morebots (T137377)
  • 12:11 urandom: Restarting Cassandra on xenon.eqiad.wmnet : T126629
  • 12:06 urandom: Temporarily disabling puppet on xenon.eqiad.wmnet to test settings : T126629
  • 11:33 Amir1: manually restarting ores-uwsgi and celery-ores-worker in scb100[12]
  • 10:51 urandom: Restarting Cassandra on {cerium,praseodymium}.eqiad.wmnet (RESTBase staging) : T126629
  • 09:16 gehel: lowering disk high watermark to rebalance disk usage on elasticsearch eqiad cluster
  • 09:05 Amir1: restarting uwsgi-ores celery-ores-worker in scb1001 and scb1002
  • 08:55 moritzm: installing libtasn security updates
  • 08:38 moritzm: rolling restart of app server canaries for libtasn security update
  • 07:22 moritzm: removed /var/log/logstash/logstash.1 on logstash1001, logspam (similar to what is described in https://github.com/logstash-plugins/logstash-output-elasticsearch/issues/144) depleted the space on the root partition
  • 02:55 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jun 9 02:55:20 UTC 2016 (duration 6m 19s)
  • 02:53 mutante: ms-be2012 ran out of disk due to huge syslog, deleted log, restarted rsyslogd
  • 02:49 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 11m 02s)
  • 02:26 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 11m 00s)
  • 00:03 twentyafterfour: Preparing to deploy phabricator update. Tagged release/2016-06-08/1

2016-06-08

  • 23:17 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.5/extensions/WikimediaEvents/: https://gerrit.wikimedia.org/r/#/c/293439/ (duration: 00m 23s)
  • 23:15 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/293438 (duration: 00m 25s)
  • 23:07 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.4/extensions/LiquidThreads/: https://gerrit.wikimedia.org/r/#/c/293247/ (duration: 00m 26s)
  • 23:05 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.5/extensions/LiquidThreads/: https://gerrit.wikimedia.org/r/#/c/293247/ (duration: 00m 26s)
  • 22:51 hoo: Re-started dumpwikidatattl on snapshot1003
  • 22:44 logmsgbot: tgr@tin Synchronized wmf-config/InitialiseSettings.php: enable AuthManager on group1 for reals T135504 (duration: 00m 25s)
  • 22:27 logmsgbot: tgr@tin Synchronized wmf-config/InitialiseSettings.php: enable AuthManager on group1 T135504 (duration: 00m 23s)
  • 22:21 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.5/extensions/OpenStackManager/: backport gerrit:293130 for AuthManager deploy T135504 (duration: 00m 28s)
  • 22:05 ottomata: starting kafka broker on kafka1012 after swapping disk and copying data directory
  • 22:01 logmsgbot: krinkle@tin Synchronized wmf-config/CommonSettings.php: Bump wgResourceLoaderStorageVersion (T134368) (duration: 00m 28s)
  • 21:12 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.5
  • 21:04 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/includes/specials/SpecialSearch.php: Add a visual clear to Special:Search input box and profile-tabs (duration: 00m 23s)
  • 20:57 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/Renameuser/RenameuserSQL.php: Use master DB when touching the user to signal rename end (duration: 00m 22s)
  • 20:50 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/includes/libs/objectcache/WANObjectCache.php: Avoid getWithSetCallback() warnings on unversioned key migration (duration: 00m 24s)
  • 20:21 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.5/extensions/Kartographer/styles/kartographer.less: Fixed <maplink> autostyling (duration: 00m 26s)
  • 20:18 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.4/extensions/Kartographer: late SWAT: Fix color extraction (duration: 00m 36s)
  • 19:30 mobrovac: change-prop deploying 08a1b1d
  • 19:27 hashar: gallium enabling puppet again now that zuul/jenkins are back
  • 19:18 hashar: Bringing back Jenkins and Zuul on gallium T137265
  • 18:59 logmsgbot: ori@palladium conftool action : set/pooled=yes; selector: name=scb1002.eqiad.wmnet
  • 18:57 yurik: switched kartotherian to scap3, deployed, restarted
  • 18:20 gehel: switching maps to scap3 deployment
  • 16:50 jynus: cloning /var/lib/jenkins from db1085 to contint1001
  • 16:46 ottomata: stopping kafka broker and puppet on kafka1012 to replace sdf
  • 16:37 ottomata: powercycling scb1002
  • 16:36 hashar: Disabled puppet on contint1001 to prevent it from bringing back Jenkins
  • 16:32 logmsgbot: otto@palladium conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 16:32 logmsgbot: otto@palladium conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 16:32 logmsgbot: otto@palladium conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mobileapps'])
  • 16:32 logmsgbot: otto@palladium conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=cxserver'])
  • 16:32 logmsgbot: otto@palladium conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=citoid'])
  • 16:32 logmsgbot: otto@palladium conftool action : set/pooled=no; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=graphoid'])
  • 16:24 ottomata: restarting hadoop-yarn-resourcemanager on analytics1002 to make analytics1001 active
  • 16:07 mobrovac: scb1002 enabling back puppet
  • 16:02 elukey: temporarily set a 10TB upper bound on the Kafka webrequest_text topic to free space (T136690)
  • 15:43 ottomata: restarting zk in codfw and eqiad 1 by 1 to apply maxClientCnxns=1024
  • 15:12 ottomata: restarting zookeeper 1 by 1 in eqiad
  • 15:03 _joe_: contint1001: systemctl mask zuul,zuul-merger
  • 14:57 elukey: rolling out the new Varnishkafka version in cache misc (didn't do it before since there was an outage ongoing)
  • 14:53 jynus: rebooting gallium with netboot for hardware maintenance
  • 14:44 mobrovac: scb1001 enabling and running puppet on scb1001
  • 13:44 jynus: running fsck.ext3 /dev/sda2 in read-write mode for gallium
  • 13:42 ottomata: powercycling scb2001 and scb2002
  • 13:30 akosiaris: disabling puppet on scb1001 & scb1002
  • 13:30 mobrovac: change-prop stopped on scb1002
  • 13:29 akosiaris: stopping changeprop on scb1001
  • 13:26 ottomata: powercycling scb1002
  • 13:18 ottomata: powercycling scb1001
  • 13:08 elukey: rolling out new varnishkafka package in cache misc
  • 12:09 jynus: mounted temporarily / partition from gallium sda on db1085:/mnt
  • 10:40 moritzm: uploaded jenkins 1.651.2 for jessie-wikimedia to carbon
  • 10:13 elukey: rolling out the new varnishkafka package to cache maps
  • 10:04 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes/deferred/LinksDeletionUpdate.php: fd44d649787ede78687b4cd2ef21e44a4c8b843b (duration: 00m 33s)
  • 08:28 hashar: stopping Jenkins / zuul / zuul-merger / puppet on gallium
  • 08:15 elukey: lowering webrequest_text kafka topic retention time from 7 days to 4 days to free disk space (T136690)
  • 08:14 hashar: Jenkins has a bunch of executors dead for whatever reason, preventing jobs from running :(
  • 07:53 mobrovac: change-prop deploying 84d56e53a
  • 06:59 moritzm: enabling ferm on palladium (will lead to temporary puppet failures)
  • 02:58 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jun 8 02:58:28 UTC 2016 (duration 6m 31s)
  • 02:51 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.5) (duration: 06m 49s)
  • 02:51 legoktm: / on gallium is currently read-only for some reason
  • 02:29 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 11m 11s)
  • 00:11 awight_: update fundraising-tools from b2425aef2154d6b689900f4848cca02880321230 to 28bc2da677caa795c58f906db76a1f8d612ac899

2016-06-07

  • 23:46 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.5/includes/deferred/LinksUpdate.php: 6d85caaa9bb5918cb2888fc82f2c7c346cf746a2 (duration: 00m 25s)
  • 23:36 SMalyshev: redeploying WDQS to update the Updater for T128947 fix
  • 23:35 logmsgbot: tgr@tin Synchronized wmf-config/InitialiseSettings.php: SWAT gerrit:292518 User rights configuration for meta. wmf-supportsafety group (duration: 00m 26s)
  • 23:20 logmsgbot: tgr@tin Finished scap: (no message) (duration: 24m 51s)
  • 23:02 awight: update paymentswiki from 28e10141454ef53085aed4c6619a34d3a4b43c58 to de11bfe2273d0bcaa0e713389b2d91e8b3567a1d; add PP cert
  • 22:56 tgr: scapping AuthManager backports + feature switch enabled on group0 T135504
  • 22:56 logmsgbot: tgr@tin Started scap: (no message)
  • 22:10 mutante: icinga config broken: Error: Could not find any host matching 'relforge1001'
  • 21:35 twentyafterfour: restarted apache on iridium to deploy D250
  • 20:02 andrewbogott: dist-upgrade on labvirt1010, in hopes of resolving a nova-compute lockup (possibly related to a kvm upgrade earlier today)
  • 20:00 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.28.0-wmf.5
  • 19:44 jynus: restarting es2017 due to a bunch of ACPI errors (probably memory-caused)
  • 19:35 logmsgbot: thcipriani@tin Finished scap: testwiki to php-1.28.0-wmf.5 and rebuild l10n cache (duration: 26m 40s)
  • 19:08 logmsgbot: thcipriani@tin Started scap: testwiki to php-1.28.0-wmf.5 and rebuild l10n cache
  • 18:30 andrewbogott: rebooting labvirt1011
  • 17:51 ottomata: restarting broker on kafka1020
  • 17:44 Dereckson: `mwscript initSiteStats.php --wiki kshwiki --update` on Terbium (T137234)
  • 17:33 mutante: furud - shutdown, decom, deleted VM
  • 17:30 ejegg: updated payments-wiki from 3df3329f75fdbc679baf37bfd3955880091b3ae1 to 28e10141454ef53085aed4c6619a34d3a4b43c58
  • 17:06 logmsgbot: krinkle@tin Synchronized wmf-config/CommonSettings.php: clean-up
  • 17:05 ejegg: rolled back payments-wiki to 3df3329f75fdbc679baf37bfd3955880091b3ae1
  • 17:04 thcipriani: starting branch-cut for mediawiki and extensions for version 1.28.0-wmf.5
  • 17:04 ejegg: updated payments-wiki from 3df3329f75fdbc679baf37bfd3955880091b3ae1 to 413bd3ea92ac570c081532c71891c31391194984
  • 16:01 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Update audit hooks for AuthManager (duration: 00m 24s)
  • 15:53 logmsgbot: thcipriani@tin Synchronized wmf-config/wikitech.php: SWAT: Do not set wgAuth to LdapAuth when AuthManager is enabled (duration: 00m 23s)
  • 15:48 logmsgbot: thcipriani@tin Synchronized portals: SWAT: T135902 adding readme and license to wikipedia.org portal (duration: 00m 25s)
  • 15:48 logmsgbot: thcipriani@tin Synchronized portals/prod/wikipedia.org/assets: SWAT: T135902 adding readme and license to wikipedia.org portal (duration: 00m 25s)
  • 15:41 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: huwiki: Enable Popups A/B test for 50% of users (duration: 00m 24s)
  • 15:32 logmsgbot: thcipriani@tin Synchronized wmf-config: SWAT: Revert "Send wmf.4 search and ttmserver traffic to codfw" (duration: 00m 26s)
  • 15:24 logmsgbot: thcipriani@tin Synchronized wmf-config/PrivateSettings.php: SWAT: Use bot password for TNBot after touch wmf-config/PrivateSettings.php (duration: 00m 25s)
  • 15:16 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Use bot password for TNBot (duration: 00m 34s)
  • 15:15 logmsgbot: thcipriani@tin Synchronized private/PrivateSettings.php: SWAT: password update for Translation Notification Bot (duration: 00m 41s)
  • 14:47 elukey: installing varnishkafka 1.0.10-1 on cp1046 manually to test the new version.
  • 14:23 jynus: stopping mysql and the OS @ es2017 for hardware maintenance
  • 13:53 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Pool new s5 db hosts: db1082, db1087, db1092 with low weight (duration: 00m 23s)
  • 13:52 logmsgbot: jynus@tin Synchronized wmf-config/db-codfw.php: Add new coredb servers to alias configuration (duration: 00m 38s)
  • 13:49 jynus: about to pool new dewiki/wikidata servers T133398
  • 12:27 moritzm: rolling out gdk-pixbuf security updates
  • 12:23 moritzm: rolling restart of sca cluster for libxml2 security update
  • 11:27 moritzm: restarting apache2 on californium (hosting horizon dashboard) for libxml2 update
  • 11:23 moritzm: restarting apache2 on silver (hosting wikitech) for libxml2 update
  • 11:08 hashar: restarted apache2 on gallium for libxml2 update
  • 10:53 moritzm: restarting apache2 on iridium (hosting Phabricator) for libxml2 update
  • 10:18 moritzm: rolling restart of hhvm on eqiad appservers to pick up libxml2 update
  • 09:09 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1070 after maintenance (duration: 00m 27s)
  • 09:04 hashar: Upgrading Jenkins IRC plugin 2.25..2.27 and instant messaging plugin 1.34..1.35. The former should fix a deadlock when shutting down Jenkins
  • 09:00 moritzm: rolling restart of hhvm on codfw appservers to pick up libxml2 update
  • 08:53 moritzm: rolling restart of hhvm on appserver canaries to pick up libxml2 update
  • 08:28 moritzm: deploying libxml2 security updates on Ubuntu systems (Debian systems already upgraded last week)
  • 07:19 jynus: stopping and cloning db1070 to new s5 servers
  • 07:08 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1070 for cloning (duration: 00m 29s)
  • 02:30 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jun 7 02:30:57 UTC 2016 (duration 5m 32s)
  • 02:25 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 09m 36s)
  • 01:10 logmsgbot: aude@tin Synchronized php-1.28.0-wmf.4/extensions/Wikidata: Fix bug (T136093) in display of labels after edit (duration: 02m 03s)
  • 00:39 Krenair: (TXT record for SPF, actually)
  • 00:39 Krenair: Created MX and SPF records directly for wmflabs.org. for https://phabricator.wikimedia.org/T137160#2359786
  • 00:35 ejegg: updated settings on payments-wiki
  • 00:26 logmsgbot: ori@tin Synchronized php-1.28.0-wmf.4/extensions/CentralAuth/includes/CentralAuthHooks.php: I79cbb1dc: Prefetch $wgCentralAuthLoginWiki DNS (T92864) (duration: 00m 29s)

2016-06-06

  • 23:41 logmsgbot: maxsem@tin Synchronized wmf-config/abusefilter.php: https://gerrit.wikimedia.org/r/#/c/292758/ (duration: 00m 24s)
  • 23:32 logmsgbot: maxsem@tin Synchronized php-1.28.0-wmf.4/extensions/GeoData/: (no message) (duration: 00m 25s)
  • 23:29 logmsgbot: maxsem@tin Synchronized private/PrivateSettings.php: Updated Zero password (duration: 00m 25s)
  • 23:21 Amir1: deploying ae71d84 into ores in prod
  • 23:17 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/293037/ (duration: 00m 24s)
  • 23:14 logmsgbot: maxsem@tin Synchronized portals: https://gerrit.wikimedia.org/r/#/c/292992/ (duration: 00m 31s)
  • 23:13 logmsgbot: maxsem@tin Synchronized portals/prod/wikipedia.org/assets: https://gerrit.wikimedia.org/r/#/c/292992/ (duration: 00m 30s)
  • 23:04 logmsgbot: maxsem@tin Synchronized docroot/wikipedia.org/.well-known/apple-app-site-association: https://gerrit.wikimedia.org/r/#q,287190,n,z (duration: 00m 25s)
  • 22:05 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.4/includes/api/ApiStashEdit.php: 50ce579046e07 (duration: 00m 23s)
  • 20:25 arlolra: updated Parsoid to version e8d6092e
  • 20:09 arlolra: starting Parsoid deploy
  • 19:15 ottomata: restarting kafka broker on kafka1020 to test python consumption client
  • 19:12 bblack: restarted nginx on rcs1002 (was stuck half-shut-down for reload?), started nginx on rcs1001 (wasn't running at all)
  • 19:08 mutante: ran puppet on carbon because icinga said fail, saw it change STS headers, but no fail
  • 19:06 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.4/includes/page/WikiPage.php: 661c22db3a352 (duration: 00m 30s)
  • 18:08 ori: Running rebuildrecentchanges.php for test2wiki for T133225
  • 17:14 gehel: deploying latest GUI for wikidata query service
  • 16:58 logmsgbot: tgr@tin Synchronized wmf-config/PrivateSettings.php: (no message) (duration: 00m 23s)
  • 16:57 logmsgbot: tgr@tin Synchronized private/PrivateSettings.php: (no message) (duration: 00m 23s)
  • 16:44 logmsgbot: tgr@tin Synchronized wmf-config/PrivateSettings.php: (no message) (duration: 00m 23s)
  • 16:39 tgr: PrivateSettings changes were for T135074
  • 16:39 logmsgbot: tgr@tin Synchronized wmf-config/PrivateSettings.php: (no message) (duration: 00m 27s)
  • 16:37 logmsgbot: tgr@tin Synchronized private/PrivateSettings.php: (no message) (duration: 00m 26s)
  • 16:23 _joe_: rebooting mw1262
  • 16:22 logmsgbot: tgr@tin Synchronized wmf-config/CommonSettings.php: creating zeroscript grant group on zerowiki, gerrit: 292951 (duration: 00m 28s)
  • 16:00 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Math: Set wgMathFullRestbaseURL to point to wikimedia.org in production (duration: 00m 24s)
  • 15:45 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: ULS: Stop using /static/current (duration: 00m 24s)
  • 15:37 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.4/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.MobileArticleTarget.js: SWAT: Fix config of mobile surfaces (duration: 00m 24s)
  • 15:32 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Use wfLoadExtension for LocalisationUpdate (duration: 00m 27s)
  • 15:21 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT Switch Wikivoyages to Single Edit Tab mode for VE Beta Feature (duration: 00m 24s)
  • 15:14 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable VisualEditor by default for logged-in users on four Wikipedias PART II (duration: 00m 30s)
  • 15:14 logmsgbot: thcipriani@tin Synchronized dblists/visualeditor-default.dblist: SWAT: Enable VisualEditor by default for logged-in users on four Wikipedias PART I (duration: 00m 29s)
  • 15:04 jynus: dropping old outreach databases on m1
  • 14:10 jynus: dropping old bugzilla databases from m1
  • 14:00 jynus: dropping database blog from m1
  • 12:34 hashar: restarted Jenkins, deadlock in IRC plugin
  • 10:46 elukey: re-added kafka1001 to eventbus.svc.eqiad.wmflabs without rebooting since some concerns were raised from the Services team. Will have a discussion with them before proceeding.
  • 10:45 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: kafka1001.eqiad.wmnet
  • 10:33 moritzm: installing perl updates (bugfixes and CVE-2015-8853)
  • 10:27 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: kafka1001.eqiad.wmnet
  • 10:25 elukey: rebooting kafka100[12] for kernel upgrades (one at a time with de-pool/re-pool actions)
  • 09:12 moritzm: installing dpkg bugfix updates on jessie systems
  • 08:45 mobrovac: change-prop deployed 9b04e475
  • 08:27 gehel: lowering elasticsearch high watermark on eqiad cluster to rebalance disk space
  • 08:17 _joe_: rebooting mw1262
  • 07:57 jynus: enabling GTID on pending coredb servers on eqiad
  • 06:18 logmsgbot: aaron@tin Synchronized php-1.28.0-wmf.4/includes/cache/LinkBatch.php: c2ba764f38e44e7 (duration: 00m 30s)
  • 05:34 robh: db2034 locked up via serial console. Details on T137084; rebooting since it's unresponsive to ssh or serial.
  • 02:28 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jun 6 02:28:50 UTC 2016 (duration 5m 56s)
  • 02:22 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 09m 34s)

2016-06-05

  • 14:55 Dereckson: `mwscript initSiteStats.php --wiki csbwiki --update` (T137060)
  • 02:27 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jun 5 02:27:38 UTC 2016 (duration 5m 35s)
  • 02:22 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 08m 55s)

2016-06-04

  • 20:18 apergos: rebooting mw1135, unresponsive to ssh or console login
  • 09:51 elukey: restarted hhvm on mw1144 after the host was hanging (OOM killer restored basic host functionality but not hhvm)
  • 09:47 elukey: removed temporary Analytics Kafka upload retention override
  • 09:38 elukey: Temporarily lowering the Analytics Kafka upload retention time to 24h to free space (T136690)
  • 02:30 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jun 4 02:30:50 UTC 2016 (duration 5m 39s)
  • 02:25 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 09m 08s)

2016-06-03

  • 22:57 Krinkle: Purged https://en.wikipedia.org/static/images/project-logos/bawiki.png
  • 22:53 logmsgbot: krinkle@tin Synchronized static/images/project-logos/bawiki.png: (no message) (duration: 00m 24s)
  • 21:57 YuviPanda: started copying graphite data from usb back
  • 21:27 awight: update paymentswiki from 28b98ec254b2a15c8df61c568b62f221b328222f to 3df3329f75fdbc679baf37bfd3955880091b3ae1
  • 20:47 ejegg: updated payments-wiki from de86eadcd98922ee4207a0c46112585f3ba5c48d to 28b98ec254b2a15c8df61c568b62f221b328222f
  • 20:25 ejegg: updated GatewayReady hook on paymentswiki
  • 19:37 logmsgbot: krinkle@tin Synchronized php-1.28.0-wmf.4/extensions/WikimediaEvents/extension.json: T136920 (duration: 00m 28s)
  • 19:04 mutante: releases apt repo on bromine: export fresh jessie-mediawiki indexes
  • 17:41 mutante: uploaded parsoid 0.5.1 to releases
  • 17:14 robh: bast4001 coming down for second hdd installation. (there are currently no active users on system)
  • 16:58 mutante: magnesium - shutdown -h now, bye
  • 15:30 logmsgbot: tgr@tin Finished scap: revert AbuseFilter + config to pre-extension-registration state T136929 (duration: 06m 13s)
  • 15:24 logmsgbot: tgr@tin Started scap: revert AbuseFilter + config to pre-extension-registration state T136929
  • 14:38 gehel: un-freezing writes from CirrusSearch to eqiad cluster during upgrade (T133126)
  • 13:27 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: kafka2001.codfw.wmnet
  • 13:22 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: kafka2001.codfw.wmnet
  • 13:22 logmsgbot: elukey@palladium conftool action : set/pooled=yes; selector: kafka2002.codfw.wmnet
  • 13:16 hasharAway: Reenabling puppet on gallium. Forgot to put it back yesterday
  • 13:14 logmsgbot: elukey@palladium conftool action : set/pooled=no; selector: kafka2002.codfw.wmnet
  • 13:11 elukey: rebooting kafka200[12] (codfw EventBus) for kernel upgrades
  • 11:18 gehel: freezing writes from CirrusSearch to eqiad cluster during upgrade (T133126)
  • 10:48 gehel: taking elasticsearch eqiad cluster down for upgrade to 2.3 (T133126)
  • 10:39 gehel: Starting upgrade of elasticsearch eqiad cluster to 2.3 (T133126)
  • 10:35 moritzm: restarting apache on bohrium (serving piwik.wikimedia.org) for libxml2 security update
  • 10:23 moritzm: restarting apache on planet1001 (serving planet.wikimedia.org) for libxml2 security update
  • 08:42 moritzm: rolling restart of scb cluster (mathoid, ores-uwsgi) in eqiad to pick up libxml2 security updates
  • 08:38 jynus: archiving again syslog.1 from ms-be2012 on /srv/swift-storage/sdl1/tmp
  • 08:35 jynus: created new LDAP group grafana-admin, gid=1007
  • 08:34 elukey: rebooting kafka1012 for kernel upgrades.
  • 08:08 moritzm: installing libxml2 security updates on jessie systems
  • 07:19 kart_: Update cxserver to 19a71f1
  • 06:29 moritzm: installing nginx security updates on Ubuntu systems (Debian installs updated some days ago)
  • 02:36 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jun 3 02:36:39 UTC 2016 (duration 5m 58s)
  • 02:30 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 08m 35s)
  • 01:09 mutante: bromine - puppet currently stopped needs some permission fixes for release upload
  • 01:08 mutante: uploaded parsoid 0.5.0 deb to releases.wm.org
  • 00:24 logmsgbot: awight@tin Finished scap: Deploying labtestwiki AuthManager config; Enabling Popups experiment; CentralNotice fixes for T136408, T136387; Special:Notifications fixes (duration: 25m 08s)

2016-06-02

  • 23:59 logmsgbot: awight@tin Started scap: Deploying labtestwiki AuthManager config; Enabling Popups experiment; CentralNotice fixes for T136408, T136387; Special:Notifications fixes
  • 23:32 logmsgbot: awight@tin Synchronized wmf-config/InitialiseSettings.php: Add namespace translation 'Portal' for diq (duration: 00m 24s)
  • 23:28 logmsgbot: awight@tin Synchronized wmf-config/InitialiseSettings.php: Enable AuthManager on beta wikitech (duration: 00m 25s)
  • 23:24 logmsgbot: awight@tin Synchronized wmf-config/InitialiseSettings.php: Enable Hovercards experiment for 1% of users on huwiki (duration: 00m 24s)
  • 23:23 logmsgbot: awight@tin Synchronized php-1.28.0-wmf.4/extensions/Popups: Do not show Hovercards when NavPopups gadget is enabled on huwiki (duration: 00m 24s)
  • 23:21 logmsgbot: awight@tin Synchronized wmf-config/extension-list-labs: Test PageAssessments on Beta Labs (duration: 00m 25s)
  • 23:20 logmsgbot: awight@tin Synchronized wmf-config/InitialiseSettings-labs.php: Test PageAssessments on Beta Labs (duration: 00m 26s)
  • 23:20 logmsgbot: awight@tin Synchronized wmf-config/CommonSettings-labs.php: Test PageAssessments on Beta Labs (duration: 00m 24s)
  • 22:37 logmsgbot: ori@tin Synchronized wmf-config/InitialiseSettings.php: I9dc532b3: Enable "purge" log group (duration: 00m 42s)
  • 22:20 mutante: removed my gerrit admin flag
  • 20:20 mutante: magnesium (formerly RT) removed from puppet and icinga, revoked cert and salt key, just waiting another day or so before shutdown
  • 20:18 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: logging: disable Wikibase\Client\Changes\WikiPageUpdater channel (duration: 00m 26s)
  • 20:12 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.4
  • 19:53 ottomata: stopping kafka broker and restarting kafka1014
  • 19:52 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.4/extensions/CheckUser/specials/SpecialCheckUser.php: Fix Special:Checkuser for log entries when cuc_title = "" (duration: 00m 31s)
  • 19:37 ejegg: re-enabled adyen job runner
  • 19:35 logmsgbot: akosiaris@palladium conftool action : set/pooled=yes; selector: scb2002.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=ores'])
  • 19:35 logmsgbot: akosiaris@palladium conftool action : set/pooled=yes; selector: scb2001.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=ores'])
  • 19:35 logmsgbot: akosiaris@palladium conftool action : set/pooled=yes; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 19:35 logmsgbot: akosiaris@palladium conftool action : set/pooled=yes; selector: scb1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 18:57 ejegg: disabled adyen job runner
  • 18:47 jynus: restarting replication on db1016
  • 18:43 YuviPanda: powercycle labmon1001 again, get into bios
  • 18:29 YuviPanda: going to try to intentionally trip the NFS check on tools-checker. This will not page
  • 18:24 YuviPanda: powercycle labmon1001 again
  • 18:19 mutante: db2007, revoke puppet cert, delete salt key, nuke from stored configs / icinga
  • 18:19 bearND: mobileapps deployed b2fee30
  • 18:18 mutante: db2007 shutdown, schedule eternal downtime
  • 18:04 bearND: starting mobileapps deploy
  • 17:40 subbu: finished deploying parsoid version 7188080b
  • 17:34 subbu: synced new code; restarted parsoid on wtp1001 as a canary
  • 17:29 subbu: starting deploy of new parsoid code
  • 17:21 mutante: ran ALTER TABLE character set utf8 .. (https://phabricator.wikimedia.org/T119112#2311402) on RT db
  • 17:16 mutante: running RT database upgrade from 4.0.4 to 4.2.8
  • 17:13 awight: update paymentswiki from d26426c4225080c95f0bd5a6a31c54e4826287b1 to de86eadcd98922ee4207a0c46112585f3ba5c48d
  • 17:05 mutante: stopped exim on magnesium
  • 17:05 jynus: stopping replication from db1001 to db1016 (passive m1 node) before schema change
  • 16:52 mutante: magnesium (RT), tmp. stopped RT and puppet
  • 16:50 YuviPanda: begin reinstall of labmon1001
  • 15:19 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.4/extensions/Math: SWAT: Use img instead of meta tags for SVGs and Fix iterator in batchGetMathML (duration: 00m 28s)
  • 15:12 logmsgbot: thcipriani@tin Synchronized portals: deploying new localized top-links on wikipedia.org (duration: 00m 31s)
  • 15:11 logmsgbot: thcipriani@tin Synchronized portals/prod/wikipedia.org/assets: deploying new localized top-links on wikipedia.org (duration: 00m 32s)
  • 14:33 jynus: acked ores icinga checks on some scb hosts and pointing to T124201 (it seems the checks arrived before the actual setup)
  • 13:52 moritzm: installing imagemagick security updates on Ubuntu systems (but affected decoders already neutralised by policy changes) (also Debian systems already addressed)
  • 13:34 hashar: Downgrading Zuul back to zuul_2.1.0-95-g66c8e52-wmf1precise1_amd64.deb. Paramiko can't acquire an ssh connection with Gerrit for some reason... https://phabricator.wikimedia.org/P3204
  • 12:10 hashar: Upgraded Zuul: upstream code 66c8e52..30a433b, package is 2.1.0-151-g30a433b-wmf1precise1
  • 11:39 logmsgbot: jmm@tin Synchronized wmf-config/CommonSettings.php: disable firejail security hardening for image scalers, needs more work for the Score extension (duration: 00m 36s)
  • 10:55 hashar: Restarted Zuul and reenabled puppet on gallium
  • 10:50 hashar: gallium: stopped puppet agent
  • 10:49 hashar: gracefully stopping Zuul, will upgrade / take traces etc over the next half hour or so
  • 10:14 jynus: archiving again syslog.1 from ms-be2012 on /srv/swift-storage/sdl1/tmp
  • 10:08 mobrovac: restbase enabling puppet back in production
  • 08:40 mobrovac: restbase deploy end of 19f25925
  • 08:29 mobrovac: restbase deploy start of 19f25925
  • 08:09 mobrovac: restbase disabling puppet in production for testing https://gerrit.wikimedia.org/r/#/c/292109/ in staging
  • 07:23 moritzm: rebooting etherpad1001 (hosting etherpad.wikimedia.org) for upgrade to Linux 4.4
  • 07:02 jynus: performing schema change for db1057
  • 03:04 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jun 2 03:04:44 UTC 2016 (duration 6m 40s)
  • 02:58 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 15m 37s)
  • 02:24 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.3) (duration: 10m 06s)
  • 01:52 mutante: scb1001/2001 ores - connection refused
  • 01:52 mutante: mw1136 service hhvm restart
  • 01:37 mutante: labsdb1001 /etc/init.d/mysql start
  • 01:32 YuviPanda: service mysql start on labsdb1001
  • 01:25 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Set $wgSpamBlacklistEventLogging to true on testwiki (duration: 00m 22s)
  • 01:25 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Set $wgSpamBlacklistEventLogging to true on testwiki (duration: 00m 23s)
  • 01:23 YuviPanda: reboot labsdb1001
  • 01:21 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.4/extensions/Flow/handlebars/: HACK: Hide reply form for locked topics (T135848) (duration: 00m 24s)
  • 01:19 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.4/extensions/Echo/includes/special/NotificationPager.php: Fix notification pager (T136759) (duration: 00m 25s)
  • 01:18 YuviPanda: restart mysql on labsdb1001
  • 01:00 bearND: mobileapps reverted to 8d6d648c943074b7d3999baf31d60ad99249cd51
  • 00:55 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings-labs.php: Revert "Test PageAssessments extension on Labs" (no-op) (duration: 00m 22s)
  • 00:55 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings-labs.php: Revert "Test PageAssessments extension on Labs" (no-op) (duration: 00m 23s)
  • 00:26 logmsgbot: awight@tin Synchronized php-1.28.0-wmf.4/extensions/CentralNotice: Fix for T136387 (duration: 00m 38s)
  • 00:05 urandom: Deploy of cdff5e3 to RESTBase production complete
  • 00:03 YuviPanda: started nfs-exports on labstore1001

2016-06-01

  • 23:57 urandom: Deploying cdff5e3 to RESTBase production
  • 23:51 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Revert Use extension registration for SpamBlacklist (T119117) (duration: 00m 24s)
  • 23:49 urandom: Deploying cdff5e3 to restbase1008.eqiad.wmnet (canary node)
  • 23:44 urandom: Deploy of RESTBase to staging environment complete
  • 23:40 urandom: Deploying RESTBase to staging environment
  • 23:39 urandom: RESTBase deploy to xenon.eqiad.wmnet (canary node) complete
  • 23:38 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings-labs.php: Test PageAssessments extension on Labs (no-op) (duration: 00m 26s)
  • 23:37 logmsgbot: dereckson@tin Synchronized wmf-config/InitialiseSettings-labs.php: Test PageAssessments extension on Labs (no-op) (duration: 00m 30s)
  • 23:36 urandom: Deploying RESTBase to xenon.eqiad.wmnet (canary node)
  • 23:26 logmsgbot: dereckson@tin Synchronized php-1.28.0-wmf.4/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.DesktopArticleTarget.js: Simplify teardown of toolbar save button (T136421) (duration: 00m 23s)
  • 23:21 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Use full URL in $wgNoticeHideUrls (T130442) (duration: 00m 23s)
  • 23:17 urandom: Deploying d8fa5c0 to RESTBase production
  • 23:10 logmsgbot: dereckson@tin Synchronized wmf-config/CommonSettings.php: Use HTTPS URL to citoid instead of protocol-relative (T136423) (duration: 00m 32s)
  • 23:06 urandom: Update restbase staging to f05b66f
  • 22:36 cwd: updated paymentswiki from 44bd699d6700ac4faf3c2d772ba713b093ae8cb8 to d26426c4225080c95f0bd5a6a31c54e4826287b1
  • 22:30 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.4/extensions/CentralNotice/: deploy https://gerrit.wikimedia.org/r/#/c/292279/ (duration: 00m 26s)
  • 21:38 twentyafterfour: train has left the station
  • 21:37 logmsgbot: twentyafterfour@tin Synchronized wmf-config/InitialiseSettings.php: deploy /wmf-config/InitialiseSettings.php for eranroz ( T132972 ) (duration: 00m 25s)
  • 21:31 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.4/includes/specials/SpecialPrefixindex.php: sync https://gerrit.wikimedia.org/r/#/c/292228/ ( T136738 ) (duration: 00m 26s)
  • 21:26 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.3/includes/specials/SpecialPrefixindex.php: sync https://gerrit.wikimedia.org/r/#/c/292234/ ( T136738 ) (duration: 00m 30s)
  • 20:58 bearND: mobileapps deployed ed0e2e4
  • 20:56 gehel: restarting postgresql on maps2001
  • 20:55 bearND: starting mobileapps deploy
  • 20:49 ejegg: updated paymentswiki from 7d222320b35ad8a44d8c77a4c3019364a49e53f2 to 44bd699d6700ac4faf3c2d772ba713b093ae8cb8
  • 20:44 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.28.0-wmf.4
  • 20:39 logmsgbot: twentyafterfour@tin Synchronized php-1.28.0-wmf.4/includes/cache/LinkBatch.php: deploy https://gerrit.wikimedia.org/r/#/c/292217/ (duration: 00m 27s)
  • 20:17 subbu: finished deploying parsoid sha afb0d522
  • 20:17 urandom: Rolling restart of RESTBase (redistribute Cassandra client connections?) : T126629
  • 20:10 subbu: synced new code; restarted parsoid on wtp1001 as a canary
  • 20:07 subbu: starting parsoid deploy
  • 19:43 ema: cp* hosts rebooted (T131928)
  • 19:40 bblack: restarting pybals for healthcheck config changes
  • 18:25 urandom: restarting Cassandra on restbase1007.eqiad.wmnet
  • 18:19 ejegg: updated payments-wiki from 5bb160e9898224e1d7d0a5c57fe408edb998a262 to 7d222320b35ad8a44d8c77a4c3019364a49e53f2
  • 18:16 ottomata: stopping kafka broker on kafka1018 and rebooting node
  • 17:51 urandom: Restarting Cassandra on restbase1007.eqiad.wmnet : T126629
  • 17:48 ema: depooled reboot of cp4* hosts (T131928)
  • 17:47 urandom: Temporarily disabling puppet to test setting on restbase1007.eqiad.wmnet : T126629
  • 17:15 ejegg: rolled back payments-wiki from a335a3a6f8909d1e7e1a79877512a12a0561aa2a to 5bb160e9898224e1d7d0a5c57fe408edb998a262
  • 17:06 ejegg: updated payments-wiki from 5bb160e9898224e1d7d0a5c57fe408edb998a262 to a335a3a6f8909d1e7e1a79877512a12a0561aa2a
  • 17:05 akosiaris: powered on lvs2006. disk change did not happen
  • 17:05 akosiaris: powered off lvs2006 for disk swap
  • 16:54 logmsgbot: tgr@tin Synchronized wmf-config/InitialiseSettings-labs.php: T135504: enable AuthManager in beta (duration: 00m 32s)
  • 16:39 logmsgbot: tgr@tin Synchronized php-1.28.0-wmf.4/extensions/NewUserMessage/: backport gerrit:292168 to update NewUserMessage for AuthManager (duration: 00m 29s)
  • 16:22 urandom: Disabling traces on restbase1008-a.eqiad.wmnet : T126629
  • 16:01 logmsgbot: thcipriani@tin Finished scap: SWAT: Update for AuthManager (duration: 26m 05s)
  • 15:58 urandom: Setting trace probability on restbase1008-a.eqiad.wmnet to 5% : T126629
  • 15:58 jynus: updating dns entry for db1080.eqiad.wmnet
  • 15:58 urandom: Disabling trace probability on restbase1007-a.eqiad.wmnet : T126629
  • 15:48 urandom: Setting trace probability to 5% on restbase1007-a.eqiad.wmnet : T126629
  • 15:35 logmsgbot: thcipriani@tin Started scap: SWAT: Update for AuthManager
  • 15:33 logmsgbot: thcipriani@tin Synchronized php-1.28.0-wmf.4/resources/src/moment-locale-overrides.js: SWAT: Avoid passing integers to mw.RegExp.escape (duration: 00m 24s)
  • 15:29 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Remove centralauth-autoaccount right (duration: 00m 25s)
  • 15:26 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable bot passwords on zerowiki (duration: 00m 24s)
  • 15:19 paravoid: Re-enabling OSPF on all cr1-codfw row subnets
  • 15:18 paravoid: Re-enabling cr1-codfw et-0/* interfaces
  • 15:18 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Enable RC patrol on ta.wikiquote" (duration: 00m 25s)
  • 15:15 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove no longer used Echo configuration PART II (duration: 00m 26s)
  • 15:14 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Remove no longer used Echo configuration PART I (duration: 00m 33s)
  • 15:13 paravoid: Rebooting cr1-codfw FPC 0
  • 15:09 paravoid: Upgrading cr1-codfw FPC 0 all PICs firmware
  • 15:08 logmsgbot: thcipriani@tin Synchronized static/images/sul: SWAT: Make SUL icons square and use global defaults (duration: 00m 41s)
  • 15:07 paravoid: Disabling cr1-codfw et-0/* (all row uplinks)
  • 15:03 akosiaris: restarted grrrit-wm after gerrit restart
  • 15:03 paravoid: Disabling OSPF on all cr1-codfw row subnets to drain FPC0
  • 15:02 akosiaris: restarted gerrit to enforce 100m maxObjectSizeLimit
  • 14:59 paravoid: Restoring VRRP priority on cr2-codfw
  • 14:57 bblack: depooled reboot of cp3048 (T131928)
  • 14:57 paravoid: Re-enabling OSPF on all cr2-codfw row subnets
  • 14:54 paravoid: Re-enabling cr2-codfw et-0/* interfaces
  • 14:49 paravoid: Rebooting cr2-codfw FPC 0
  • 14:48 paravoid: Upgrading cr2-codfw FPC 0 all PICs firmware
  • 14:42 paravoid: Disabling cr2-codfw et-0/2/0, et-0/2/1 (row C/D uplinks)
  • 14:34 paravoid: Disabling cr2-codfw et-0/0/0 (row A uplink)
  • 14:29 paravoid: Disabling cr2-codfw et-0/0/1 (row B uplink)
  • 14:15 paravoid: Disabling OSPF on all cr2-codfw row subnets to drain FPC0
  • 14:08 ema: depooled reboot of cp1* hosts (T131928)
  • 12:49 paravoid: draining cr2-codfw for firmware upgrade
  • 12:26 bblack: upgrade nginx to 1.11.1-1+wmf1 on all clusters
  • 11:50 elukey: rebooting kafka1022 for kernel upgrade (4.4)
  • 11:05 ema: rebooting cp3* spares (T131928)
  • 10:47 Dereckson: Script done for uca-it collation on itwiki: 10 599 758 rows processed
  • 10:47 ema: depooled reboot of cp3046 (T131928)
  • 10:47 ema: depooled reboot of cp3003 (T131928)
  • 10:45 ema: depooled reboot of cp3034 (T131928)
  • 10:39 ema: depooled reboot of cp3005 (T131928)
  • 10:38 ema: depooled reboot of cp3044 (T131928)
  • 10:35 ema: depooled reboot of cp3047 (T131928)
  • 10:31 ema: depooled reboot of cp3004 (T131928)
  • 10:28 ema: depooled reboot of cp3009 (T131928)
  • 10:14 ema: depooled reboot of cp3037 (T131928)
  • 10:11 jynus: moved syslog1 to ms-be2012:/srv/swift-storage/sdl1/tmp to avoid / fillup
  • 10:10 ema: depooled reboot of cp3008 (T131928)
  • 10:09 ema: depooled reboot of cp3035 (T131928)
  • 09:37 moritzm: installing libgd security updates
  • 09:28 ema: depooled reboot of cp3039 (T131928)
  • 09:23 ema: depooled reboot of cp3045 (T131928)
  • 09:21 ema: depooled reboot of cp3010 (T131928)
  • 09:18 ema: depooled reboot of cp3006 (T131928)
  • 09:16 ema: depooled reboot of cp3007 (T131928)
  • 09:10 ema: depooled reboot of cp3036 (T131928)
  • 08:25 mobrovac: mobileapps deploying 8d6d648
  • 08:24 ema: depooled reboot of cp3049 (T131928)
  • 08:22 hashar: Nodepool came back up just fine after labnodepool1001 reboot and is fully operational.
  • 08:15 jynus: deleting mysql logrotate scripts to avoid root spam
  • 08:14 moritzm: reboot labnodepool1001 for update to Linux 4.4
  • 07:56 elukey: event logging restarted on eventlog1001.eqiad.wmnet
  • 07:46 elukey: stopping kafka on kafka1020.eqiad and rebooting the host for Linux 4.4 upgrades
  • 07:43 moritzm: rolling reboot of scb in eqiad for update to Linux 4.4
  • 07:32 moritzm: restarted hhvm on mw1180
  • 07:05 mobrovac: change-prop restarting to apply https://gerrit.wikimedia.org/r/291201
  • 05:41 mobrovac: restbase deploy end of 5c99693
  • 05:26 mobrovac: restbase deploy start of 5c99693
  • 04:31 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.4/includes/: reapplied new version of I03739e94 (duration: 01m 21s)
  • 04:27 logmsgbot: demon@tin Synchronized php-1.28.0-wmf.3/includes/: reapplied new version of I03739e94 (duration: 01m 34s)
  • 03:11 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jun 1 03:11:11 UTC 2016 (duration 6m 39s)
  • 03:04 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.4) (duration: 15m 42s)
  • 02:30 logmsgbot: mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.3) (duration: 09m 30s)
  • 00:04 Dereckson: Started `mwscript updateCollation.php itwiki --previous-collation=uppercase` on Terbium (T136647)


Archives